Likhith Raj Yesala

Data Scientist · Data Analyst · Gen AI / LLM Engineer
I build data systems I'd be comfortable putting my name on — private by default, honest about their limits.

🧭 How I build

The thread through everything here isn't a single tech stack — it's a way of working:

🔒 Privacy by default. My healthcare work ships only de-identified aggregate data; my LLM tooling runs entirely locally, with no API keys and no data leaving your machine. Sensitive data stays where it belongs.
🧪 Honest about limits. I document where a method breaks down rather than hiding it — e.g. my DEA study openly flags the small-sample limitation instead of overclaiming. Surfacing the caveat is part of the analysis.
✅ Tested & reproducible. Real test suites (13 passing tests in CareFlow, 51 in llm-balance-paraphraser), Dockerfiles, and one-command setup, so the work runs for someone other than me.
📖 Documented for humans. Every repo explains the why, not just the how.

📈 The progression

I didn't start here. The repos below are a visible growth arc — I keep the old versions around on purpose so the journey shows.

Area	Then	Now
Apps	B.Tech Tkinter desktop tool (2018)	FastAPI services, JWT auth, Docker, deployed (CareFlow)
Analytics	Spreadsheets & one-off scripts	Reproducible pipelines + tested solvers (DEA)
AI	Classic ML (Random Forest, XGBoost)	Local LLM tooling, RAG, load-balancing (llm-balance-paraphraser)
Data viz	Static charts	Interactive D3 drill-downs over millions of rows (ICU explorer)

📌 Featured work

Project	What it proves	Track
CareFlow: clinic appointment & live-queue API	Progression: a 2018 Tkinter desktop app rebuilt into a FastAPI service with JWT auth, Postgres, Docker, and 13 passing tests. The legacy code is kept in-repo so you can see exactly what was upgraded.	`Eng` · `DA`
llm-balance-paraphraser: local LLM analyzer & router	Gen AI + ethics: token / KV-cache / VRAM analysis, an Ollama paraphrase pipeline, and a weighted load balancer with health checks. Runs entirely on your machine, with no API keys and no data leaving it. 51 passing tests.	`GenAI`
Sustainable-Manufacturing-DEA: efficiency benchmarking	Research integrity: a from-scratch DEA solver benchmarking Nestlé/Henkel/P&G/Unilever across 50+ ESG KPIs, with an explicit, written note on the method's small-sample limitation.	`DS` · `DA`
Pediatric ICU Cohort Explorer (making public soon)	Privacy-first healthcare: a Flask + D3 drill-down over 4.4M+ clinical observations that ships only de-identified aggregate counts.	`DS` · `DA`
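
For a feel of what "a weighted load balancer with health checks" means in the llm-balance-paraphraser row, here is a minimal sketch. It is not the repo's code: the `Backend` class, the `pick_backend` function, the backend names, URLs, and weights are all hypothetical, and the health check itself (whatever sets the `healthy` flag) is left out.

```python
import random
from dataclasses import dataclass


@dataclass
class Backend:
    """A hypothetical local model endpoint; names and URLs are illustrative."""
    name: str
    url: str
    weight: float
    healthy: bool = True  # assumed to be updated by a separate health check


def pick_backend(backends, rng=random):
    """Pick one healthy backend with probability proportional to its weight."""
    candidates = [b for b in backends if b.healthy and b.weight > 0]
    if not candidates:
        raise RuntimeError("no healthy backends available")
    weights = [b.weight for b in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]


# Illustrative pool: the unhealthy backend is never chosen, whatever its weight.
backends = [
    Backend("small", "http://localhost:11434", weight=3),
    Backend("large", "http://localhost:11435", weight=1),
    Backend("down", "http://localhost:11436", weight=5, healthy=False),
]
```

Calling `pick_backend(backends)` repeatedly should route roughly three requests to `small` for every one sent to `large`, and raise `RuntimeError` once every backend is marked unhealthy.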