Engineering scalable systems.
I'm Vamsi Cheruku, a Backend Software Engineer with 3+ years of experience designing high-concurrency event-driven microservices. Currently pursuing an MS in Artificial Intelligence.

Engineering Philosophy
I build systems that solve problems at the infrastructure level, where design decisions directly determine reliability, scalability, and performance.
Design for Failure
In a distributed environment, failures are guaranteed: network latency, service crashes, thread starvation, and database bottlenecks will all occur eventually. My core philosophy is to design architectures that anticipate these failures and absorb them gracefully.
This means keeping microservices single-responsibility, enforcing strict timeouts, implementing circuit breakers, and making operations idempotent. I rely heavily on proper observability (logging, metrics, tracing) because you cannot fix what you cannot see.
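As an illustration of the circuit-breaker pattern mentioned above, here is a minimal Java sketch. The class name `SimpleCircuitBreaker`, the failure threshold, the cool-down, and the injectable clock are all assumptions made for this example and do not describe any production code. It also does not enforce timeouts itself; those would be configured on the underlying client.

```java
import java.time.Duration;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Illustrative breaker: opens after N consecutive failures, allows a trial call after a cool-down. */
final class SimpleCircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold;
    private final long openNanos;
    private final LongSupplier nanoClock; // injectable so the cool-down can be tested without sleeping
    private State state = State.CLOSED;
    private int consecutiveFailures = 0;
    private long openedAt = 0L;

    SimpleCircuitBreaker(int failureThreshold, Duration openDuration, LongSupplier nanoClock) {
        this.failureThreshold = failureThreshold;
        this.openNanos = openDuration.toNanos();
        this.nanoClock = nanoClock;
    }

    /** Returns the current state, moving OPEN to HALF_OPEN once the cool-down has elapsed. */
    synchronized State state() {
        if (state == State.OPEN && nanoClock.getAsLong() - openedAt >= openNanos) {
            state = State.HALF_OPEN;
        }
        return state;
    }

    /** Runs the action, or returns the fallback without calling it while the breaker is open. */
    <T> T call(Supplier<T> action, Supplier<T> fallback) {
        if (state() == State.OPEN) {
            return fallback.get();
        }
        try {
            T result = action.get();
            onSuccess();
            return result;
        } catch (RuntimeException e) {
            onFailure();
            return fallback.get();
        }
    }

    private synchronized void onSuccess() {
        consecutiveFailures = 0;
        state = State.CLOSED;
    }

    private synchronized void onFailure() {
        consecutiveFailures++;
        // A failed trial call in HALF_OPEN reopens the breaker immediately.
        if (state == State.HALF_OPEN || consecutiveFailures >= failureThreshold) {
            state = State.OPEN;
            openedAt = nanoClock.getAsLong();
        }
    }
}
```

A caller would wrap each downstream request, for example `breaker.call(() -> client.fetch(id), () -> cachedDefault)`, where `client` and `cachedDefault` are hypothetical. This sketch lets several trial calls through in the half-open state; a stricter version would admit only one.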
Boring Technology is Good Technology
When reliability matters, I prefer battle-tested, "boring" technology. Tools like Java, Spring Boot, Kafka, and PostgreSQL have massive ecosystems, predictable failure modes at scale, and proven track records. Innovation should happen in the product logic, not in gambling on unproven infrastructure.
Current Explorations
Outside of my core Java/Kafka stack, my MS in Artificial Intelligence has pushed me toward the intersection of distributed systems and ML. I am currently exploring Kubernetes internals, Redis caching strategies at scale, and the backend infrastructure required for MLOps, specifically data pipelines, vector databases, and scalable inference APIs. Long-term, I want to build the platforms that allow ML models to run reliably for millions of users.
Experience
Professional timeline and education.
Software Developer
- Core backend engineer on a greenfield distributed social communication platform built for ~20,000 employees using Java, Spring Boot, and Kafka.
- Designed and maintained Java/Spring Boot microservices for a high-concurrency feed platform, applying reliability-first principles.
- Built event-driven pipelines using Kafka to fan out post events to follower timelines asynchronously, supporting ~10,000 simulated concurrent users.
- Owned the Redis caching subsystem with a hybrid write-through/cache-aside strategy, reducing feed read latency and preventing cache stampedes.
- Designed RESTful APIs using OpenAPI/Swagger specifications for feed, user profile, and notification services.
- Modeled relational data in MySQL with range partitioning; applied ArangoDB graph modeling for follower relationship traversals.
- Refined Elasticsearch index mappings to improve search relevance and keep result delivery low-latency.
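One common way to combine cache-aside reads with stampede prevention is "single-flight" loading, where concurrent misses for the same key share one backend load. The sketch below is an assumption about how that could look, not the platform's actual code: `SingleFlightCache` is a hypothetical name, and an in-memory map stands in for Redis so the example stays self-contained (no TTLs or eviction).

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Illustrative cache-aside reader where concurrent misses on one key trigger a single load. */
final class SingleFlightCache<K, V> {
    private final ConcurrentHashMap<K, V> cache = new ConcurrentHashMap<>(); // stand-in for Redis
    private final ConcurrentHashMap<K, CompletableFuture<V>> inFlight = new ConcurrentHashMap<>();
    private final Function<K, V> loader; // e.g. a database query; must return non-null values

    SingleFlightCache(Function<K, V> loader) {
        this.loader = loader;
    }

    /** Cache-aside read: return the cached value, or load it once and share the result. */
    V get(K key) {
        V cached = cache.get(key);
        if (cached != null) {
            return cached;
        }
        CompletableFuture<V> mine = new CompletableFuture<>();
        CompletableFuture<V> existing = inFlight.putIfAbsent(key, mine);
        if (existing != null) {
            return existing.join(); // another caller is already loading this key
        }
        try {
            // Re-check: a previous loader may have filled the cache just before we registered.
            V value = cache.get(key);
            if (value == null) {
                value = loader.apply(key);
                cache.put(key, value);
            }
            mine.complete(value);
            return value;
        } catch (RuntimeException e) {
            mine.completeExceptionally(e); // waiters see the failure instead of hanging
            throw e;
        } finally {
            inFlight.remove(key, mine);
        }
    }

    /** Write-through path: update the cache at the same time the source of truth is written. */
    void put(K key, V value) {
        cache.put(key, value);
    }
}
```

With this shape, a burst of requests for a cold feed key results in one load; the other callers wait on the same future rather than each hitting the database.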
Deep Dive: Thread Pool Exhaustion
Diagnosed a critical production issue where high API traffic caused request timeouts. I analyzed thread dumps and discovered threads were starving while waiting on downstream I/O. By implementing request batching (aggregating calls within a short time window), I significantly reduced thread blocking, stabilized the pool, and restored latency to baseline during peak loads.
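The batching idea can be sketched as a small micro-batcher: callers ask for one key and receive a future, and all keys requested within a short window are sent downstream as a single bulk call. This is a hedged illustration only; `MicroBatcher`, the window length, and the `bulkLoader` function are hypothetical names and parameters, and the original fix may have been structured differently.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

/** Illustrative micro-batcher: keys requested within one window share a single downstream call. */
final class MicroBatcher<K, V> implements AutoCloseable {
    private final Function<List<K>, Map<K, V>> bulkLoader; // one downstream call for many keys
    private final long windowMillis;
    private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
    private final Object lock = new Object();
    private Map<K, CompletableFuture<V>> pending = new HashMap<>();

    MicroBatcher(Function<List<K>, Map<K, V>> bulkLoader, long windowMillis) {
        this.bulkLoader = bulkLoader;
        this.windowMillis = windowMillis;
    }

    /** Registers the key in the current batch; the first key of a batch schedules the flush. */
    CompletableFuture<V> load(K key) {
        synchronized (lock) {
            boolean firstInBatch = pending.isEmpty();
            CompletableFuture<V> future = pending.computeIfAbsent(key, k -> new CompletableFuture<>());
            if (firstInBatch) {
                scheduler.schedule(this::flush, windowMillis, TimeUnit.MILLISECONDS);
            }
            return future;
        }
    }

    private void flush() {
        Map<K, CompletableFuture<V>> batch;
        synchronized (lock) {
            batch = pending;
            pending = new HashMap<>(); // later requests start a new batch
        }
        try {
            Map<K, V> results = bulkLoader.apply(new ArrayList<>(batch.keySet()));
            // Keys missing from the bulk response complete with null.
            batch.forEach((key, future) -> future.complete(results.get(key)));
        } catch (RuntimeException e) {
            batch.values().forEach(future -> future.completeExceptionally(e));
        }
    }

    @Override
    public void close() {
        scheduler.shutdown();
    }
}
```

The trade-off is a small added delay (up to one window) per request in exchange for far fewer concurrent downstream calls, which is what relieves pressure on the pool. Callers that compose on the returned future instead of blocking on it free their threads as well.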
Project Intern
- Evaluated ArangoDB vs Neo4j for graph-heavy relationship workloads, informing the team's production architecture decision.
- Built REST API endpoints within a Java Spring Boot microservices codebase.
- Integrated Redis cache-aside layers for user metadata, reducing redundant database reads and improving endpoint response times.
MS in Artificial Intelligence
- GPA: 3.74/4.00
Technical Capabilities
Skills and frameworks I work with daily.
Languages
- Java (17+)
- TypeScript
- JavaScript
- Python
- SQL
Backend Frameworks
- Spring Boot
- Apache Struts
- Express.js
- REST APIs
- Microservices
API Design & Testing
- RESTful APIs
- OpenAPI/Swagger
- JUnit 5
- Mockito
- Integration Testing
Databases
- MySQL
- PostgreSQL
- Redis
- ArangoDB
- DynamoDB
Messaging & EDA
- Apache Kafka
- Event-Driven Architecture
- Asynchronous Processing
Cloud & DevOps
- AWS (EC2, S3)
- Docker
- Kubernetes
- GitHub Actions CI/CD
Selected Work
Case studies from production systems.
TaraHub – Multi-Vendor Marketplace
A scalable multi-vendor commerce platform with an event-driven lifecycle pipeline that keeps downstream systems consistent without polling.
Key Deliverables
- Built Java 17/Spring Boot backend services documented with OpenAPI/Swagger.
- Implemented an event-driven order lifecycle pipeline using Kafka to propagate state transitions.
- Containerized services with Docker and deployed on AWS EC2 with CI/CD.
Case Study: Decoupling with Kafka
In an e-commerce system, a single action like placing an order can trigger many downstream processes (payments, inventory, notifications). If these are implemented as synchronous REST calls, the order service becomes a central point of cascading failure. By introducing Kafka, I decoupled these services: the Order service simply publishes an "OrderCreated" event, and downstream services consume it asynchronously. This eliminated tight coupling, improved fault tolerance, and made it easy to add future pipeline extensions (like analytics) without modifying the core order flow.
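To make the decoupling concrete, the sketch below shows the publish/subscribe shape in plain Java. It deliberately does not use the Kafka client: `InMemoryTopic` is a hypothetical in-process stand-in for a topic, and the `OrderCreated` fields are assumed for illustration. Unlike Kafka, it offers no durability, partitions, consumer groups, or retries.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Consumer;

/** Event published by the order service; field names are assumed for this example. */
record OrderCreated(String orderId, long amountCents) {}

/** In-process stand-in for a topic: the publisher does not know or wait for its consumers. */
final class InMemoryTopic<E> implements AutoCloseable {
    private final List<Consumer<E>> subscribers = new CopyOnWriteArrayList<>();
    private final ExecutorService workers = Executors.newCachedThreadPool();

    void subscribe(Consumer<E> handler) {
        subscribers.add(handler);
    }

    /** Hands the event to every subscriber on a worker thread and returns immediately. */
    void publish(E event) {
        for (Consumer<E> subscriber : subscribers) {
            workers.submit(() -> {
                try {
                    subscriber.accept(event);
                } catch (RuntimeException e) {
                    // One failing consumer must not affect the publisher or other consumers.
                    System.err.println("consumer failed: " + e.getMessage());
                }
            });
        }
    }

    @Override
    public void close() {
        workers.shutdown();
    }
}
```

Adding a new consumer such as analytics is then one more `subscribe` call; the code that publishes `OrderCreated` does not change. With real Kafka, the same property comes from adding a new consumer group on the existing topic.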
SaamCars – Vehicle Dealership Platform
A comprehensive vehicle inventory management and booking workflow system with role-based access control and strict transactional consistency.
Key Deliverables
- Designed a RESTful backend with Node.js and PostgreSQL.
- Built concurrency-safe reservation logic with stock validation and timeout handling.
- Integrated Stripe webhooks with signature verification for atomic payment state updates.
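The reservation logic above (stock validation plus hold timeouts) can be sketched in a few lines. This is an in-memory Java illustration under stated assumptions, not the Node.js/PostgreSQL implementation: `VehicleReservations`, the hold ids, and the injectable clock are hypothetical, and a single lock stands in for what a database transaction or row lock would provide.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.Optional;
import java.util.function.LongSupplier;

/** Illustrative reservation book: a hold takes one unit of stock and expires if not confirmed. */
final class VehicleReservations {
    private record Hold(String vehicleId, long expiresAtMillis) {}

    private final Map<String, Integer> stock = new HashMap<>();
    private final Map<String, Hold> holds = new HashMap<>();
    private final long holdMillis;
    private final LongSupplier clockMillis; // injectable so expiry can be tested without sleeping
    private long nextHoldId = 1;

    VehicleReservations(long holdMillis, LongSupplier clockMillis) {
        this.holdMillis = holdMillis;
        this.clockMillis = clockMillis;
    }

    synchronized void addStock(String vehicleId, int units) {
        stock.merge(vehicleId, units, Integer::sum);
    }

    /** Validates stock and places a timed hold; returns empty when nothing is available. */
    synchronized Optional<String> reserve(String vehicleId) {
        releaseExpired();
        int available = stock.getOrDefault(vehicleId, 0);
        if (available <= 0) {
            return Optional.empty();
        }
        stock.put(vehicleId, available - 1);
        String holdId = "hold-" + nextHoldId++;
        holds.put(holdId, new Hold(vehicleId, clockMillis.getAsLong() + holdMillis));
        return Optional.of(holdId);
    }

    /** Confirms a hold (for example after payment); fails if the hold has expired or is unknown. */
    synchronized boolean confirm(String holdId) {
        releaseExpired();
        return holds.remove(holdId) != null; // the unit stays deducted from stock
    }

    /** Returns stock for holds whose deadline has passed. Callers must hold the lock. */
    private void releaseExpired() {
        long now = clockMillis.getAsLong();
        Iterator<Hold> it = holds.values().iterator();
        while (it.hasNext()) {
            Hold hold = it.next();
            if (hold.expiresAtMillis() <= now) {
                stock.merge(hold.vehicleId(), 1, Integer::sum);
                it.remove();
            }
        }
    }
}
```

Because every method runs under one lock, two concurrent requests for the last vehicle cannot both succeed. In a PostgreSQL-backed service the same guarantee would typically come from a transaction, for instance a conditional `UPDATE` on the stock row or `SELECT ... FOR UPDATE`.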
Let's talk systems.
I'm actively seeking new roles in Backend Engineering or Systems Architecture starting May 2026. My inbox is always open.
© 2026 Vamsi Cheruku.
Built with Next.js & Tailwind CSS.