systems online · Gurgaon, India

Sarthak Chhabra

Technical Lead at Stashfin — I architect the platforms that put AI agents into production: routing engines, RAG pipelines and no-code tooling that turn weeks of engineering into minutes of configuration.

44M tokens/day · 300K+ LLM req/day · 15–20ms routing latency · 7+ years shipping

01 · proof of work

Numbers from production, not slides.

Every metric below is live or was measured in a shipped system. This is what my work does when real users hit it.

stashfin · live
44M

LLM tokens processed every day by the customer-support agent I architected — ~30,000 messages across 6,000+ daily users.

zupee · llm router
300K+

Requests/day through a centralized LLM routing engine — 15–20ms latency, 99% uptime, multi-provider fallbacks & circuit breakers.

retention
46%

D15 retention on an AI companion chatbot — LangGraph multi-agent planning + RAG memory.

velocity
1 week → 3 hours

Faster content production — promo generation cut from one week to 3 hours.

delivery
days → minutes

Agent delivery time after the no-code platform — prompt, tools, model & channel, all self-serve.

02 · systems

Things I've architected.

Platforms and infrastructure designed so that other people — engineers and non-engineers alike — can ship intelligence.

SYS-002 / ZUPEE

Centralized LLM Routing Engine

One gateway for every LLM call in the company — budget tracking, rate limiting, multi-provider fallbacks and circuit breakers so product teams never think about provider outages.
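A minimal sketch of the fallback-and-breaker idea described above, not the production code. The class and function names, the failure threshold, the cooldown and the `Provider` shape are all illustrative assumptions.

```typescript
// Hypothetical sketch: names, thresholds and the provider shape are assumptions.
type BreakerState = "closed" | "open" | "half-open";

class CircuitBreaker {
  private failures = 0;
  private openedAt = 0;
  private state: BreakerState = "closed";
  private readonly failureThreshold: number;
  private readonly cooldownMs: number;
  private readonly now: () => number;

  constructor(failureThreshold = 3, cooldownMs = 10000, now: () => number = Date.now) {
    this.failureThreshold = failureThreshold; // consecutive failures before opening
    this.cooldownMs = cooldownMs;             // time to stay open before a trial call
    this.now = now;                           // injectable clock, handy for tests
  }

  /** True if a call may be attempted right now. */
  canAttempt(): boolean {
    if (this.state === "open" && this.now() - this.openedAt >= this.cooldownMs) {
      this.state = "half-open"; // cooldown elapsed: allow a trial call
    }
    return this.state !== "open";
  }

  recordSuccess(): void {
    this.failures = 0;
    this.state = "closed";
  }

  recordFailure(): void {
    this.failures += 1;
    // A failed trial call, or too many consecutive failures, opens the circuit.
    if (this.state === "half-open" || this.failures >= this.failureThreshold) {
      this.state = "open";
      this.openedAt = this.now();
    }
  }

  getState(): BreakerState {
    return this.state;
  }
}

interface Provider {
  name: string;
  breaker: CircuitBreaker;
  complete: (prompt: string) => Promise<string>;
}

/** Try providers in priority order, skipping any whose circuit is open. */
async function routeCompletion(providers: Provider[], prompt: string): Promise<string> {
  for (const provider of providers) {
    if (!provider.breaker.canAttempt()) continue;
    try {
      const output = await provider.complete(prompt);
      provider.breaker.recordSuccess();
      return output;
    } catch {
      provider.breaker.recordFailure();
    }
  }
  throw new Error("all providers unavailable");
}
```

The caller passes providers in priority order, so a product team only ever calls `routeCompletion`. An unhealthy provider is skipped until its cooldown elapses and a trial call succeeds.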

300K+ req/day · 15–20ms · 99% uptime

NodeJS · Postgres · Circuit Breakers · Multi-provider

SYS-003 / ZUPEE

AI Companion Chatbot

LangGraph multi-agent architecture for response planning and conflict resolution, RAG memory on Qdrant, and human-like dynamic response timing.

46% D15 retention

LangGraph · Qdrant · RAG · MongoDB

SYS-004 / STASHFIN

Dynamic Tool Framework

Turns any REST API into an agent-callable tool: configure the request, fire a live test call to capture the real response shape, annotate keys — the LLM tool schema writes itself. Zero hand-written integrations.
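To illustrate the "capture the response shape, annotate keys" step, here is a rough sketch of inferring a schema from one sample response. It is not the framework's actual code. The `inferSchema` function, the dotted annotation paths and the output shape (loosely modelled on JSON Schema keywords) are assumptions, and arrays are assumed to be homogeneous.

```typescript
// Hypothetical sketch: function name, path syntax and output shape are assumptions.
type JsonValue =
  | string
  | number
  | boolean
  | null
  | JsonValue[]
  | { [key: string]: JsonValue };

interface InferredSchema {
  type: "string" | "number" | "boolean" | "null" | "array" | "object";
  description?: string;
  properties?: Record<string, InferredSchema>;
  items?: InferredSchema;
}

/**
 * Walk a sample response and describe its shape.
 * `annotations` maps dotted paths (e.g. "owner.name") to human-written descriptions.
 */
function inferSchema(
  value: JsonValue,
  annotations: Record<string, string> = {},
  path = "",
): InferredSchema {
  const description = annotations[path];
  const base = description ? { description } : {};

  if (value === null) return { type: "null", ...base };

  if (Array.isArray(value)) {
    // Assumption: the first element is representative of the whole array.
    const first = value[0];
    if (first === undefined) return { type: "array", ...base };
    return { type: "array", ...base, items: inferSchema(first, annotations, path + "[]") };
  }

  if (typeof value === "object") {
    const properties: Record<string, InferredSchema> = {};
    for (const key of Object.keys(value)) {
      const childPath = path ? path + "." + key : key;
      properties[key] = inferSchema(value[key], annotations, childPath);
    }
    return { type: "object", ...base, properties };
  }

  // Remaining cases are the primitives: string, number or boolean.
  return { type: typeof value as "string" | "number" | "boolean", ...base };
}
```

Calling `inferSchema(sampleResponse, { "owner.name": "Account holder" })` returns a nested description of the response in which only the annotated key carries a description. A single sample cannot reveal optional or nullable fields, so a real system would need more than one capture or manual overrides.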

any API → agent tool, no code

Tool Calling · Schema Inference · REST

SYS-005 / STASHFIN

RAG Ingestion Pipeline + Cross-Channel Memory

End-to-end knowledge pipeline — upload, chunk, embed, store — plus a contact system that unifies a user's email, phone and Slack identities into one persistent memory, so agents never start cold across channels.
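The identity-unification idea can be sketched with an in-memory stand-in for the contact store. This is an illustration under assumptions, not the production design: `ContactDirectory`, its methods and the `channel:handle` key format are hypothetical, and real handles would need channel-specific normalization.

```typescript
// Hypothetical sketch: an in-memory stand-in for a contact store.
type Channel = "email" | "phone" | "slack";

interface Contact {
  id: number;
  identities: string[]; // normalized keys such as "email:user@example.com"
  memory: string[];     // notes that agents on any channel can read
}

class ContactDirectory {
  private contacts: Record<number, Contact> = {};
  private byIdentity: Record<string, number> = {};
  private nextId = 1;

  private key(channel: Channel, handle: string): string {
    return channel + ":" + handle.trim().toLowerCase();
  }

  /** Find the contact behind an identity, creating one on first sight. */
  resolve(channel: Channel, handle: string): Contact {
    const k = this.key(channel, handle);
    const existingId = this.byIdentity[k];
    if (existingId !== undefined) return this.contacts[existingId];
    const contact: Contact = { id: this.nextId++, identities: [k], memory: [] };
    this.contacts[contact.id] = contact;
    this.byIdentity[k] = contact.id;
    return contact;
  }

  /** Declare that two identities belong to one person and merge their memory. */
  link(a: [Channel, string], b: [Channel, string]): Contact {
    const keep = this.resolve(a[0], a[1]);
    const drop = this.resolve(b[0], b[1]);
    if (keep.id === drop.id) return keep;
    for (const k of drop.identities) {
      this.byIdentity[k] = keep.id; // repoint every identity of the merged contact
      keep.identities.push(k);
    }
    keep.memory.push(...drop.memory);
    delete this.contacts[drop.id];
    return keep;
  }
}
```

Once an email address and a Slack handle are linked, `resolve` returns the same contact, and the same memory, whichever channel the next message arrives on.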

custom knowledge bases, zero engineering support

Embeddings · Vector Store · Slack · Telegram

SYS-006 / ZUPEE

Automated Promo Generation

Parallel processing of 50–100 microseries for ad creative — accelerating A/B testing cycles and campaign deployment, with a no-code bot management console for product managers on top.

1 week → 3 hours production time

Parallel Processing · No-code Console · GenAI

03 · timeline

Seven years, six chapters.

FEB 2026 — PRESENT

Technical Lead · Stashfin

NodeJS · TypeScript · AWS · Postgres · GenAI · Docker · Microservices

  • Architected a no-code AI agent platform — agent delivery cut from days to minutes; runs production agents handling ~30K messages & 44M tokens/day.
  • Built a dynamic tool framework turning any REST API into an agent tool via live schema capture.
  • Unified cross-channel user identity with persistent agent memory across email, phone & Slack.

JUL 2025 — FEB 2026

Technical Lead · Zupee

NodeJS · Postgres · Qdrant · MongoDB · LangGraph · GenAI

  • Built the centralized LLM routing engine: 300K+ req/day, 15–20ms, 99% uptime.
  • Shipped an AI companion with 46% D15 retention; led a team of 4 across multiple AI products.
  • Cut promo production from one week to 3 hours with automated generation.

MAR 2022 — APR 2025

Senior Software Engineer · Dresma AI

NodeJS · TypeScript · AWS · MongoDB · Kafka · Microservices

  • Cut response times 40% and lifted system efficiency 30% across heavy-computation services.
  • Designed near-fail-proof architectures: +50% stability, −30% processing time.
  • Mentored junior engineers — +20% delivery efficiency, −30% error rates.

MAY 2021 — MAR 2022

Full Stack Developer · Gigforce

NodeJS · AWS SQS · MongoDB · VueJS

  • Owned modules end-to-end: +25% stability, −40% peak-load latency, −30% query response time.

OCT 2020 — MAY 2021

Software Developer · Signcatch

ReactJS · Angular · SQL · PHP

  • Shipped across the full SDLC: −20% time-to-market, −15% churn.

JAN 2019 — OCT 2020

Associate Software Engineer · Bosch

Angular · C# · SQL

  • Delivered cross-platform applications with rigorous version control: −25% ticket resolution time.

AUG 2015 — JUL 2019

B.E. Computer Science · Lovely Professional University

8.7 CGPA

04 · stack

Tools I think in.

AI / LLM Systems

LangGraph · Multi-agent Architecture · RAG Pipelines · Qdrant · Embeddings · Tool Calling · Prompt Engineering · LLM Routing & Fallbacks

Backend Engineering

Node.js · TypeScript · Postgres · MongoDB · Kafka · Microservices · REST APIs · System Design

Infrastructure

AWS · Docker · SQS · CI/CD · Rate Limiting · Circuit Breakers · Observability

Leadership & Product

Team of 4 · Architecture Oversight · Mentorship · No-code Tooling · Cross-team Delivery · VueJS / React / Angular

05 · about

The human behind the systems.

I'm Sarthak — a Technical Lead based in Gurgaon, India. My career has one through-line: removing the engineering bottleneck between an idea and a running system.

At Bosch I learned discipline. At startups I learned speed. At Dresma I learned to build things that don't fall over. And in the AI era, I've found the work I love most: platforms that let non-engineers deploy production-grade AI agents — routing engines that survive provider outages, RAG pipelines anyone can feed, tool frameworks that turn any API into an agent capability.

I lead small teams, I stay close to the code, and I measure my work in production numbers — not promises.

06 · contact

Let's build something intelligent.

Whether it's AI infrastructure, agent platforms, or a hard scaling problem — my inbox is open.