I'm Kasyap, an AI Engineer based in San Jose, CA, with 4+ years of experience building production AI systems for healthcare and enterprise.

Most of my work has been at the intersection of what AI can do and what non-technical teams actually need. At Sikka.ai, I built a no-code AI platform that lets healthcare providers spin up production applications without writing a line of code, which cut prototype delivery from weeks to minutes.

On top of that platform, I built RAG-powered chatbots and AI agents that handle the queries that used to pile up in support queues. The chatbot I shipped now serves over 10,000 users and gets them answers in seconds instead of days. The agents go a step further: they reason across tools and data sources to surface insights through a simple conversation.

I also work on the model side: fine-tuning LLaMA with PEFT and QLoRA for text-to-SQL, integrating feature stores to cut API latency in half, and building CLTV forecasting models that raised retention-model precision for dental practices by 70%.

When I'm not building, I'm usually thinking about evaluation: how to actually know whether an AI system is working, not just in demos but in production.

Shipped & in progress

StockWise (multi-agent stock research with LangGraph, LlamaIndex, and FastAPI) is public on GitHub. Research Pod (Graph RAG over arXiv) is in active development.

10K+: Users served by the RAG chatbot; resolution time went from about 2 days to near-instant
50%: Reduction in API latency after integrating feature stores into the ML serving layer
70%: Increase in retention model precision for dental practices using CLTV forecasting
Weeks → minutes: Prototype delivery time cut after shipping the no-code AI platform
Python, LangChain, LangGraph, LlamaIndex, HuggingFace, PyTorch, PEFT / QLoRA, FastAPI, Neo4j, Qdrant, AWS, Docker, Kubernetes, XGBoost, SQL