harshita23sharma/README.md

Data Scientist · 9 Years · Generative AI · LLMs · Ranking & Recommendation · MLOps
Translating business problems into data-driven solutions with measurable impact


🗂️ What I Build

🤖 Generative AI & Agentic Systems

  • Agentic content generation with human-in-the-loop controls

  • RAG pipelines for context-aware few-shot generation

  • LangGraph-based multi-step agentic workflows

  • Content moderation & compliance-aware generation

  • Prompt engineering, leakage prevention, eval frameworks

📊 Ranking & Recommendation Systems

  • LightGBM Ranker for hyper-personalized campaign selection

  • Multi-objective ranking: attribution, conversion, engagement

  • SHAP-based feature importance & model interpretability

  • Temporal stratification & custom evaluation methodologies

  • Multi-channel model design (email / push / SMS / WhatsApp)

🎯 Ad-Tech & pCTR Modelling

  • End-to-end pCTR model development across multiple markets

  • Cross-banner data unification pipelines

  • Market-agnostic automated hyperparameter tuning with HyperOpt

  • MLflow experiment tracking · proxy data strategies for data gaps

  • Enabled 3rd-party → owned platform migrations

💬 Conversational AI & NLP

  • LangChain + GPT-3.5/4 chatbots with low-latency streaming

  • Two-step LLM Chain framework outperforming ReAct

  • BERT-based intent recognition with automated retraining

  • Complex dialogue flow design with seamless context switching

  • Multi-client deployment with customizable bot personas

🏥 Healthcare & Specialised ML

  • LSTM + attention model for medical code prediction from clinical notes

  • Multi-class multi-label classification on large imbalanced datasets

  • Document categorization for touchless pipeline automation

  • NLP on semi-structured clinical records

👁️ Computer Vision & Deep Learning

  • Real-time object detection, pose estimation, person classification

  • Alarm-based monitoring replacing manual camera feed review

  • Cybersecurity malicious activity detection via active learning

  • TensorFlow Serving model deployment for production inference

⚡ Big Data & MLOps

  • Spark + Kafka real-time ETL pipelines at scale

  • Daily incremental feature pipelines with GCS integration

  • Spark UI profiling for cost & bottleneck optimisation

  • CI/CD for ML: unit tests, code coverage, pre-prod reviews

  • Automated scheduling & pipeline orchestration

🔍 Knowledge Systems & Retrieval

  • Production KnowledgeBase with daily web crawling & indexing

  • Multi-source retrieval integrated with conversational AI

  • Vector-based semantic search for few-shot example retrieval

  • Context-aware query response generation

  • RAG evaluation frameworks with YAML-based rubrics


🛠️ Tech Stack

Core Languages

Python SQL Scala

ML & Deep Learning

LightGBM PyTorch TensorFlow scikit-learn

Generative AI & Agents

LangChain LangGraph OpenAI

Data & MLOps

Apache Spark Kafka MLflow GCP AWS


🧭 Currently Exploring

🤖  Multi-agent orchestration & agentic evaluation
📐  Knowledge Graphs
🔬  AIOps
📦  Feature store design for real-time ML serving

"Good models explain the past. Great models change the future."

💬 Open to conversations on ML systems, GenAI, and anything in between.

Pinned

  1. opensource_llms Public

    Jupyter Notebook · 1 star · 1 fork

  2. CustomerChurn Public

    End-to-end ML production pipeline for customer churn prediction

    Jupyter Notebook

  3. CustomerGenomics Public

    Saves online shoppers time on e-commerce sites. Platform-based sentiment analysis of products lets users compare price, quality, and other features of th…

    Python

  4. big_data_ml Public

    Jupyter Notebook

  5. MachineLearningAlgos Public

    Jupyter Notebook

  6. 30-days-of-effective-python Public

    Includes ways to write better Python

    Python