Systems Architect & Performance Engineer

Engineering scalable systems.

I'm Vamsi Cheruku, a Backend Software Engineer with 3+ years of experience designing high-concurrency, event-driven microservices. I am currently pursuing an MS in Artificial Intelligence.


Engineering Philosophy

I build systems that solve problems at the infrastructure level—where design decisions directly dictate reliability, scalability, and performance.

Design for Failure

In a distributed environment, failures are guaranteed: network latency spikes, service crashes, thread starvation, and database bottlenecks will all happen eventually. My core philosophy is to design architectures that anticipate and absorb these failures gracefully.

This means keeping microservices single-responsibility, enforcing strict timeouts, implementing circuit breakers, and making operations idempotent. I rely heavily on observability (logging, metrics, tracing) because you cannot fix what you cannot see.
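To make the circuit-breaker idea concrete, here is a minimal sketch in Python. It is illustrative only and not taken from any production system: the class name, the thresholds, and the injectable `clock` parameter are assumptions, and the sketch is not thread-safe.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when the breaker rejects a call without attempting it."""


class CircuitBreaker:
    """Hypothetical breaker: opens after N consecutive failures, retries after a cool-down."""

    def __init__(self, failure_threshold=3, reset_after=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                # Fail fast instead of tying up a thread on a sick dependency.
                raise CircuitOpenError("circuit open, failing fast")
            # Cool-down elapsed: let one trial call through (half-open).
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # open (or re-open) the circuit
            raise
        # Success closes the circuit and resets the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

In a Java/Spring Boot service the same behavior would more likely come from an established resilience library than from hand-rolled code; the sketch only shows the state transitions.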

Boring Technology is Good Technology

When reliability matters, I prefer battle-tested, "boring" technology. Tools like Java, Spring Boot, Kafka, and PostgreSQL have massive ecosystems, predictable failure modes at scale, and proven track records. Innovation should happen in the product logic, not in gambling on unproven infrastructure.

Current Explorations

Outside of my core Java/Kafka stack, my MS in Artificial Intelligence has pushed me toward the intersection of distributed systems and ML. I am currently exploring Kubernetes internals, Redis caching strategies at scale, and the backend infrastructure required for MLOps—specifically data pipelines, vector databases, and scalable inference APIs. Long-term, I want to build the platforms that allow ML models to run reliably for millions of users.

Experience

Professional timeline and education.

07/2022 – 07/2024

Software Developer

Zoho Corporation

Chennai, India

—Core backend engineer on a greenfield distributed social communication platform built for ~20,000 employees using Java, Spring Boot, and Kafka.
—Designed and maintained Java/Spring Boot microservices for a high-concurrency feed platform, applying reliability-first principles.
—Built event-driven pipelines using Kafka to fan out post events to follower timelines asynchronously, supporting ~10,000 simulated concurrent users.
—Owned Redis caching subsystem with a hybrid write-through/cache-aside strategy, reducing feed read latency and preventing cache stampede.
—Designed RESTful APIs using OpenAPI/Swagger specifications for feed, user profile, and notification services.
—Modeled relational data in MySQL with range partitioning; applied ArangoDB graph modeling for follower relationship traversals.
—Refined Elasticsearch index mappings to improve search relevance and keep result delivery low-latency.
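The caching strategy in the list above (cache-aside reads, write-through updates, stampede prevention) can be sketched roughly as follows. This is a simplified, single-process illustration: a plain dict stands in for Redis, the `load_from_db` and `save_to_db` callables are hypothetical, and per-key locks are just one of several ways to keep concurrent misses from all hitting the database.

```python
import threading


class CacheAsideStore:
    """Hypothetical cache layer; the dict below stands in for Redis."""

    def __init__(self, load_from_db, save_to_db):
        self._load_from_db = load_from_db
        self._save_to_db = save_to_db
        self._cache = {}
        self._key_locks = {}
        self._guard = threading.Lock()

    def _lock_for(self, key):
        # One lock per key so a miss on one key never blocks other keys.
        with self._guard:
            return self._key_locks.setdefault(key, threading.Lock())

    def get(self, key):
        # Cache-aside read: serve from cache, fall back to the database on a miss.
        if key in self._cache:
            return self._cache[key]
        with self._lock_for(key):
            # Re-check: another thread may have filled the entry while we waited.
            if key in self._cache:
                return self._cache[key]
            value = self._load_from_db(key)
            self._cache[key] = value
            return value

    def put(self, key, value):
        # Write-through update: persist first, then refresh the cached copy.
        with self._lock_for(key):
            self._save_to_db(key, value)
            self._cache[key] = value
```

With this shape, many simultaneous readers of a cold key trigger a single database load; the rest wait on the key's lock and then read the freshly cached value.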

Deep Dive: Thread Pool Exhaustion

Diagnosed a critical production issue where high API traffic caused request timeouts. I analyzed thread dumps and discovered threads were starving while waiting on downstream I/O. By implementing request batching (aggregating calls within a short time window), I significantly reduced thread blocking, stabilized the pool, and restored latency to baseline during peak loads.
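The batching idea described above, aggregating calls that arrive within a short time window into one downstream request, might look something like this in miniature. It is a sketch under assumptions rather than the production fix: the class name, the `bulk_fetch` callable, and the window and batch-size defaults are all hypothetical.

```python
import threading
from concurrent.futures import Future


class RequestBatcher:
    """Hypothetical micro-batcher: collects keys briefly, then makes one bulk call."""

    def __init__(self, bulk_fetch, window_seconds=0.01, max_batch=50):
        self._bulk_fetch = bulk_fetch  # callable: list of keys -> dict of key -> value
        self._window = window_seconds
        self._max_batch = max_batch
        self._lock = threading.Lock()
        self._pending = []  # list of (key, Future)
        self._timer = None

    def submit(self, key):
        future = Future()
        batch = None
        with self._lock:
            self._pending.append((key, future))
            if len(self._pending) >= self._max_batch:
                batch = self._take_batch()  # batch is full: flush right away
            elif self._timer is None:
                # First request of a new window: schedule the flush.
                self._timer = threading.Timer(self._window, self._flush)
                self._timer.daemon = True
                self._timer.start()
        if batch:
            self._run(batch)
        return future

    def _take_batch(self):
        # Caller must hold the lock.
        batch, self._pending = self._pending, []
        if self._timer is not None:
            self._timer.cancel()
            self._timer = None
        return batch

    def _flush(self):
        with self._lock:
            batch = self._take_batch()
        if batch:
            self._run(batch)

    def _run(self, batch):
        try:
            # De-duplicate keys while preserving arrival order.
            keys = list(dict.fromkeys(key for key, _ in batch))
            results = self._bulk_fetch(keys)
            for key, future in batch:
                future.set_result(results.get(key))
        except Exception as exc:
            for _, future in batch:
                if not future.done():
                    future.set_exception(exc)
```

Callers get a future back immediately, so many logical requests share a single downstream round trip instead of each holding a worker thread for its own I/O call.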

02/2022 – 07/2022

Project Intern

Zoho Corporation

Chennai, India

—Evaluated ArangoDB vs Neo4j for graph-heavy relationship workloads, informing the team's production architecture decision.
—Built REST API endpoints within a Java Spring Boot microservices codebase.
—Integrated Redis cache-aside layers for user metadata, reducing redundant database reads and improving endpoint response times.

08/2024 – 05/2026

MS in Artificial Intelligence

Saint Louis University

Saint Louis, MO, USA

—GPA: 3.74/4.00

Technical Capabilities

Skills and frameworks I work with daily.

Languages

Java (17+)
TypeScript
JavaScript
Python
SQL

Backend Frameworks

Spring Boot
Apache Struts
Express.js
REST APIs
Microservices

API Design & Testing

RESTful APIs
OpenAPI/Swagger
JUnit 5
Mockito
Integration Testing

Databases

MySQL
PostgreSQL
Redis
ArangoDB
DynamoDB

Messaging & EDA

Apache Kafka
Event-Driven Architecture
Asynchronous Processing

Cloud & DevOps

AWS (EC2, S3)
Docker
Kubernetes
GitHub Actions CI/CD

Projects

Case studies from production-grade systems.

AI Video Generation Platform

Agent Motif

A local-first AI platform converting plain-English prompts into fully rendered MP4 videos without timeline editors or manual asset work. A multi-agent pipeline handles story planning, asset generation, motion synthesis, and final rendering end-to-end.

Remotion, Redis, BullMQ, SQLite/Prisma, Claude API, OpenAI API, ElevenLabs, FFmpeg, MCP

Key Deliverables

—Agent platform with guardrails: runs as a structured multi-agent chain (Planning -> Asset gathering -> Motion direction -> TSX codegen -> Render), with authorization boundaries, per-job audit trails, and cost-gated degradation.
—Durable orchestration: Redis + BullMQ functions as a lightweight durable workflow engine — each job carries full state, survives process restarts, and retries idempotently with per-step status tracking.
—Strict inter-agent contract: Storyboard JSON (Zod-validated) is the rigid schema binding all agents, enforcing system-wide consistency and making schema drift structurally impossible.
—MCP & multi-provider LLM strategy: Claude Opus drives story planning; Claude Sonnet handles critique and codegen; DALL-E 3 and ElevenLabs are called via MCP-style tool interfaces.
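The project validates its Storyboard JSON with Zod in TypeScript. As a rough analogue of that strict inter-agent contract, here is a standard-library Python sketch. The field names (`scene_id`, `narration`, `duration_seconds`) are invented for illustration and are not the project's actual schema.

```python
from dataclasses import dataclass


class ContractError(ValueError):
    """Raised when a payload does not match the agreed schema."""


@dataclass(frozen=True)
class Scene:
    scene_id: str
    narration: str
    duration_seconds: float


# Hypothetical schema: field name -> accepted type(s).
SCENE_FIELDS = {"scene_id": str, "narration": str, "duration_seconds": (int, float)}


def parse_scene(raw):
    if not isinstance(raw, dict):
        raise ContractError("scene must be an object")
    unknown = set(raw) - set(SCENE_FIELDS)
    if unknown:
        # Strict mode: unexpected keys are rejected rather than silently ignored.
        raise ContractError(f"unknown fields: {sorted(unknown)}")
    for name, kind in SCENE_FIELDS.items():
        if name not in raw:
            raise ContractError(f"missing field: {name}")
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(raw[name], bool) or not isinstance(raw[name], kind):
            raise ContractError(f"wrong type for field: {name}")
    if raw["duration_seconds"] <= 0:
        raise ContractError("duration_seconds must be positive")
    return Scene(raw["scene_id"], raw["narration"], float(raw["duration_seconds"]))


def parse_storyboard(raw):
    if not isinstance(raw, dict) or not isinstance(raw.get("scenes"), list) or not raw["scenes"]:
        raise ContractError("storyboard needs a non-empty 'scenes' list")
    return [parse_scene(scene) for scene in raw["scenes"]]
```

Every agent parses its input through the same function, so a malformed or drifted payload fails loudly at the boundary instead of surfacing later in the render step.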

Case Study: Durable Queue State Engines

Integrating dynamic image, audio, and code-generation models into an end-to-end rendering pipeline is prone to transient service failures and high LLM token costs. By leveraging Redis and BullMQ, we created a lightweight stateful workflow engine. If a rendering step fails, the queue maintains the job state and retries the specific step idempotently, saving compute and API costs.
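The resume-from-failure behavior described above can be illustrated without Redis or BullMQ. The sketch below is a simplified stand-in, not the project's code: the short step names mirror the pipeline stages, while `new_job`, `run_job`, and the injected `save` callback are hypothetical.

```python
# Short labels for the pipeline stages: planning, asset gathering,
# motion direction, TSX codegen, render.
STEP_ORDER = ["plan", "assets", "motion", "codegen", "render"]


def new_job(job_id, prompt):
    """Create a job record that carries its full state, one entry per step."""
    return {
        "job_id": job_id,
        "prompt": prompt,
        "steps": {name: {"status": "pending", "output": None} for name in STEP_ORDER},
    }


def run_job(state, handlers, save):
    """Run pending steps in order; return True when every step is done.

    `handlers` maps step name -> callable(state); `save` persists the state
    (a stand-in for the queue's storage) after every step.
    """
    for name in STEP_ORDER:
        step = state["steps"][name]
        if step["status"] == "done":
            continue  # idempotent resume: finished steps are never re-run
        try:
            step["output"] = handlers[name](state)
            step["status"] = "done"
            step.pop("error", None)
        except Exception as exc:
            step["status"] = "failed"
            step["error"] = str(exc)
            save(state)
            return False  # caller (or the queue) retries later from this step
        save(state)
    return True
```

Because the saved state records which steps finished, a retry after a failed render reuses the earlier outputs instead of paying for planning and asset generation again.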


Local-first Content Agent

Agent Echo

An autonomous, local-first agent that turns a developer's actual daily work into ready-to-review LinkedIn content.

Python, TypeScript, React, Remotion, Node.js, SQLite, Claude API, ElevenLabs, Manim, FFmpeg

Key Deliverables

—Multi-source activity aggregation: pulls GitHub commits, Notion notes, browser history, and local file-system changes into a unified daily activity log.
—LLM-driven synthesis: Claude condenses a day's raw, heterogeneous activity into structured highlights, filtering post-worthy work from routine noise.
—Agentic drafting pipeline: LangGraph orchestrates a stateful multi-step graph (synthesis -> drafting -> self-critique/revision) to produce polished post drafts.
—Human-in-the-loop publishing: A Telegram bot delivers drafts for review, edit, and approval, with a scheduler handling timed publishing once approved.

Case Study: LangGraph Stateful Orchestration

Transforming raw daily activity logs into engaging narratives requires multiple loops of refinement (synthesis, drafting, reflection, and critique). Instead of using a single-shot LLM prompt, we model the workflow as a stateful graph in LangGraph. This allows nodes to reflect on their own outputs and iterate until they meet quality criteria.
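To show the shape of such a loop without depending on LangGraph's API, here is a plain-Python sketch of a small state graph. Each node mutates a shared state dict and returns the name of the next node. The node functions, state keys, and the length-based "critique" are placeholders for what would be LLM calls in the real pipeline.

```python
def synthesize(state):
    # Keep only the activity items flagged as worth posting about.
    state["highlights"] = [item for item in state["activity"] if item["post_worthy"]]
    return "draft"


def draft(state):
    titles = ", ".join(item["title"] for item in state["highlights"])
    state["draft"] = f"Today I worked on: {titles}."
    return "critique"


def critique(state):
    # Placeholder quality check; a real critic would be an LLM call.
    state["issues"] = [] if len(state["draft"]) <= state["max_chars"] else ["too long"]
    if not state["issues"] or state["revisions"] >= state["max_revisions"]:
        return None  # stop: good enough, or revision budget exhausted
    return "revise"


def revise(state):
    state["revisions"] += 1
    state["draft"] = state["draft"][: state["max_chars"]]  # crude stand-in for a rewrite
    return "critique"


NODES = {"synthesize": synthesize, "draft": draft, "critique": critique, "revise": revise}


def run_graph(state, start="synthesize"):
    """Walk the graph until a node returns None; the revision cap bounds the loop."""
    state.setdefault("revisions", 0)
    node = start
    while node is not None:
        node = NODES[node](state)
    return state
```

The conditional edge out of `critique` is what separates this from a single-shot prompt: the draft cycles through revision until the check passes or the budget runs out.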


Full-Stack Development

SaamCars – Vehicle Dealership Platform

A vehicle dealership backend where dealers manage inventory and buyers book vehicles — requiring strict concurrency control and real-money payment handling via Stripe.

Node.js, TypeScript, PostgreSQL, Stripe, Docker

Key Deliverables

—Authorization primitives: middleware-level RBAC separates dealer and buyer capabilities at the API layer — making unauthorized access structurally hard.
—Concurrency-safe reservations: reservation logic enforces stock validation and timeout handling to prevent double-bookings when two buyers attempt to reserve the same vehicle.
—Payment integrity: Stripe webhooks use signature verification and idempotent event handling to atomically synchronize payment events with booking state.
—Containerized environment: Docker Compose packages the API and PostgreSQL together for consistent local development and deployment parity.
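The reservation and payment bullets above boil down to two rules: only one buyer can hold a vehicle at a time, and a payment event must take effect at most once. A minimal in-memory sketch of both follows. The class, method names, and hold duration are hypothetical; in the real service the lock would presumably be a database transaction or constraint in PostgreSQL, and Stripe signature verification would happen before this logic runs.

```python
import threading


class BookingService:
    """Hypothetical in-memory model of holds, sales, and processed payment events."""

    def __init__(self, hold_seconds=600):
        self._hold_seconds = hold_seconds
        self._lock = threading.Lock()
        self._holds = {}  # vehicle_id -> (buyer_id, expires_at)
        self._sold = {}   # vehicle_id -> buyer_id
        self._seen_events = set()

    def reserve(self, vehicle_id, buyer_id, now):
        """Place or refresh a hold; returns False if someone else holds or owns it."""
        with self._lock:
            if vehicle_id in self._sold:
                return False
            hold = self._holds.get(vehicle_id)
            if hold and hold[1] > now and hold[0] != buyer_id:
                return False  # another buyer's hold is still active
            self._holds[vehicle_id] = (buyer_id, now + self._hold_seconds)
            return True

    def handle_payment_event(self, event_id, vehicle_id, buyer_id, now):
        """Apply a payment event at most once, keyed by the provider's event id."""
        with self._lock:
            if event_id in self._seen_events:
                return "duplicate"  # webhook redelivery: nothing to do
            self._seen_events.add(event_id)
            hold = self._holds.get(vehicle_id)
            if vehicle_id in self._sold or not hold or hold[0] != buyer_id or hold[1] <= now:
                return "rejected"  # a real system would refund or flag for review
            self._sold[vehicle_id] = buyer_id
            del self._holds[vehicle_id]
            return "confirmed"
```

Checking and updating state under one lock is what prevents the double-booking race; recording the event id in the same critical section is what makes redelivered webhooks harmless.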
