Open to Opportunities

Karthik Mulugu

AI Engineer · Machine Learning · Data Scientist


Work Experience

AI Engineer Intern
Pronix Inc
Oct 2025 – Present · Remote
  • Boosted retrieval accuracy by 35% by building RAG pipelines with Azure/OpenAI LLMs using embedding-based and hybrid retrieval techniques, enabling more context-aware chatbot responses.
  • Reduced HR support workload by 40% by designing and deploying chatbot workflows on Kore.ai Agentic Platform, integrating Workday APIs for real-time employee profile and leave management.
  • Increased chatbot task success rate by 25% and reduced fallback responses by 30% by optimizing intent classification, dialog state management, and workflow orchestration.
  • Eliminated critical edge-case failures by designing and executing scenario-based validation strategies within Salesforce for JBL, improving chatbot response consistency across high-priority workflows.
RAG · NLP · OpenAI · Azure · Workday API · Kore.ai · Generative AI
AI Automation Extern
Extern
Top Performer · Remote · Jun–Sep 2025
  • Increased mortgage document-processing efficiency by 70% by building scalable Python automation pipelines to parse, chunk, and structure 1,000+ financial documents, reducing manual review time from hours to minutes.
  • Improved OCR accuracy by 35% and retrieval precision by 40% by optimizing Tesseract/EasyOCR and architecting a RAG pipeline using LangChain, LlamaIndex, and FAISS for high-precision document search and QA.
  • Boosted answer accuracy by 28% and reduced fallback responses by 35% by benchmarking and optimizing Gemini vs. Mistral models — earning recognition as Top Performer for technical impact.
Python · OCR · RAG · LlamaIndex · FAISS · Gradio · LLMs

Beyond work

ISS Volunteer
University at Buffalo · Aug 2024
Welcomed international grad students during Fall 2024 orientation, guiding them on campus resources and community integration.
Hackathon Coordinator
HackFiesta — TechnoMist 2K23 · Hyderabad
Organised a large-scale hackathon end-to-end, coordinating multiple teams, mentors, and event logistics.

Education

M.S. Computer Science, AI/ML · 3.7 GPA
University at Buffalo, SUNY
Jan 2024 – Jun 2025 · Buffalo, NY

Specialised in ML, Deep Learning, Computer Vision, and Data Visualization — maintaining a 3.7 GPA while concurrently building production AI systems across NLP, RAG, and forecasting domains.

B.Tech. Information Technology
Jawaharlal Nehru Technological University, Hyderabad
Aug 2019 – Jul 2023 · Hyderabad, India

Built solid foundations in DSA, DBMS, OOP, and systems programming. Ranked in the top 1% of the IT department for academic performance across core engineering coursework.

Research

IEEE Access · Under Review · Apr 2026
DocuQuery: Hybrid Lexical-Dense Retrieval with LangGraph Orchestration for Robust PDF Question Answering
Karthik Mulugu · University at Buffalo
↑75% Faster Analysis: document processing vs. single-retriever baseline
↑30% Retrieval Precision: over BM25-only across multi-intent query sets
3-way Hybrid Fusion: BM25 + FAISS + TF-IDF with query-aware gating
DocuQuery is an open-source system for natural-language question answering over PDF documents. It combines hybrid retrieval — BM25, FAISS, and TF-IDF — with LangGraph multi-intent orchestration that routes queries across summarisation, comparison, refinement, and direct QA modes, backed by Gemini generation. A key finding is that fixed-weight hybrid fusion can degrade below pure lexical retrieval when dense similarity is misleading, motivating a query-aware fusion gating mechanism.
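
The fusion finding can be illustrated with a minimal sketch. It assumes min-max normalised scores and a hypothetical gate that shifts weight away from the dense retriever when the query carries exact-match cues (digits or quoted phrases). The weights, the `gate_weights` heuristic, and the toy scores are illustrative assumptions, not the mechanism described in the paper.

```python
import re

def min_max(scores):
    """Rescale a {doc_id: score} map to [0, 1]; constant maps become all zeros."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 0.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}

def gate_weights(query):
    """Hypothetical gate: favour lexical retrievers for exact-match style queries."""
    exact_cue = bool(re.search(r'\d|"[^"]+"', query))
    if exact_cue:
        return {"bm25": 0.6, "tfidf": 0.3, "dense": 0.1}
    return {"bm25": 0.3, "tfidf": 0.2, "dense": 0.5}

def fuse(per_retriever_scores, weights):
    """Weighted sum of normalised scores, returned as doc ids ranked best first."""
    fused = {}
    for name, scores in per_retriever_scores.items():
        for doc, s in min_max(scores).items():
            fused[doc] = fused.get(doc, 0.0) + weights[name] * s
    return sorted(fused, key=fused.get, reverse=True)

# Toy scores in which the dense retriever wrongly prefers doc "b".
scores = {
    "bm25": {"a": 9.0, "b": 2.0, "c": 1.0},
    "tfidf": {"a": 0.8, "b": 0.3, "c": 0.1},
    "dense": {"a": 0.2, "b": 0.9, "c": 0.1},
}
fixed = {"bm25": 0.2, "tfidf": 0.1, "dense": 0.7}

ranking_fixed = fuse(scores, fixed)  # dense-heavy fixed weights
ranking_gated = fuse(scores, gate_weights("clause 4.2 termination fee"))
```

With these toy numbers the fixed dense-heavy weights rank "b" first even though both lexical retrievers prefer "a", while the gated weights restore "a" to the top.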

Projects


PerceptAI — Real-Time Vision Intelligence
Real-time multi-modal computer vision system that simultaneously analyzes face, hands, body posture, and scene objects — streamed live to a browser dashboard with AI-generated behavioral reports.
  • Runs 5 concurrent ML models per frame — face mesh (468 landmarks), hand tracking (21 keypoints), body pose (33 keypoints), Facenet512 face recognition, and YOLO-World object detection — in a single CPU-only Docker container with no GPU dependency
  • 7-class emotion recognition with per-class confidence scores, PERCLOS drowsiness monitoring, gaze classification (Left/Center/Right), blink counting, and 3-axis head pose estimation — all computed locally with zero external API calls
  • Built cross-session face memory using Facenet512 embeddings stored in a ChromaDB vector database with DBSCAN clustering — re-identifies faces across sessions without storing raw images
  • Deployed on HuggingFace Spaces via Docker with a browser-camera WebSocket architecture — browser captures frames via WebRTC getUserMedia, no server camera needed, works on any cloud
Python · FastAPI · MediaPipe · YOLO-World · DeepFace · TensorFlow · ChromaDB · Docker · WebSocket
Live app may take ~30s to wake
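
PERCLOS drowsiness monitoring can be sketched as a rolling share of frames in which the eyes count as closed. The per-frame eye-openness score, the window size, and both thresholds below are assumptions for illustration; PerceptAI's actual values are not stated here.

```python
from collections import deque

class PerclosMonitor:
    """Rolling PERCLOS: share of recent frames in which the eyes count as closed."""

    def __init__(self, window_frames=900, closed_threshold=0.2, alert_level=0.15):
        # All three defaults are illustrative assumptions, not tuned values.
        self.frames = deque(maxlen=window_frames)
        self.closed_threshold = closed_threshold
        self.alert_level = alert_level

    def update(self, eye_openness):
        """Record one frame's eye-openness score and return the current PERCLOS."""
        self.frames.append(eye_openness < self.closed_threshold)
        return sum(self.frames) / len(self.frames)

    def is_drowsy(self):
        """True once the closed-eye share of the window reaches the alert level."""
        if not self.frames:
            return False
        return sum(self.frames) / len(self.frames) >= self.alert_level

monitor = PerclosMonitor(window_frames=10)
for openness in [0.3, 0.3, 0.1, 0.1, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3]:
    perclos = monitor.update(openness)
```

The bounded `deque` drops the oldest frame automatically, so the metric always reflects only the most recent window.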
Hospital Inpatient Cost Prediction
End-to-end ML pipeline to predict patient costs, explain cost drivers with SHAP, and surface uncertainty estimates — built on 2.5M real NY state records for clinical finance decision-support.
  • Benchmarked 5 model families with Optuna; XGBoost achieved R²=0.969, RMSE=$2,035 — 58% RMSE reduction over linear baseline
  • SHAP TreeExplainer for feature attribution + quantile regression for calibrated 80% prediction intervals, enabling clinicians to act on uncertainty rather than point estimates
  • FastAPI service + 6-tab Streamlit dashboard + MLflow experiment tracking — all orchestrated via Docker Compose for one-command deployment
Python · XGBoost · SHAP · FastAPI · Streamlit · MLflow
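
Two quantities sit behind quantile-based prediction intervals: the pinball loss that quantile regression minimises, and the empirical coverage used to check calibration. The sketch below is plain Python with made-up cost figures and hypothetical 10th and 90th percentile predictions; it is not the project's pipeline.

```python
def pinball_loss(y_true, y_pred, quantile):
    """Mean pinball (quantile) loss, the objective minimised by quantile regression."""
    total = 0.0
    for y, q in zip(y_true, y_pred):
        diff = y - q
        total += quantile * diff if diff >= 0 else (quantile - 1) * diff
    return total / len(y_true)

def interval_coverage(y_true, lower, upper):
    """Fraction of observations falling inside [lower, upper]."""
    hits = sum(lo <= y <= hi for y, lo, hi in zip(y_true, lower, upper))
    return hits / len(y_true)

# Toy costs (USD) with hypothetical 10th and 90th percentile predictions.
actual = [4200, 9800, 15100, 7300, 22000]
p10 = [3000, 8000, 12000, 7500, 15000]
p90 = [6000, 12000, 18000, 11000, 21000]

coverage = interval_coverage(actual, p10, p90)
loss_p10 = pinball_loss(actual, p10, 0.10)
```

A well-calibrated 80% interval should cover roughly 80% of held-out observations; in this toy data only three of five fall inside, which is the kind of under-coverage a calibration check is meant to expose.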
DocuQuery AI Assistant
Production-ready RAG system for natural-language question answering over PDF documents — combining hybrid retrieval with LangGraph multi-intent orchestration. Submitted to IEEE Access.
  • 75% faster document analysis via LangGraph orchestration routing queries across summarisation, comparison, refinement, and QA modes backed by Gemini generation
  • 30% retrieval precision gain using hybrid BM25 + FAISS + TF-IDF with cross-encoder reranking — outperforms pure dense retrieval on factual QA
  • Key finding: fixed-weight hybrid fusion degrades below pure lexical retrieval when dense similarity is misleading — motivating query-aware fusion gating (detailed in the paper)

→ see Research section for full paper details

LangGraph · Gemini · FAISS · Pinecone · RAG
Sentiment Analysis — DistilBERT + BiLSTM
Comparative study of transformer vs. recurrent architectures on 204K HuffPost headlines, with a live RSS pipeline that surfaces hour-by-hour sentiment trends from 6 major outlets.
  • Fine-tuned DistilBERT achieved 80.4% accuracy and 0.919 ROC-AUC, outperforming BiLSTM by 4.4 pts — gap traced to noisy pseudo-labels in multi-class setup, not model capacity
  • Live RSS ingestion from 6 outlets (BBC, Reuters, CNN, NYT, Guardian, AP) with hour-by-hour sentiment trend visualisation
  • Full MLOps stack: FastAPI inference service + Streamlit dashboard + MLflow tracking + Docker + GitHub Actions CI — deployed to Streamlit Cloud
PyTorch · NLP · FastAPI · Streamlit · Docker
Amazon Stock Forecasting Dashboard
Multi-model forecasting dashboard on AWS EC2 — Seq2Seq LSTM reduced MAPE from 17.45% to 8.86% with technical indicator fusion and backtested trading signals.
  • Seq2Seq LSTM with encoder-decoder architecture trained on 6,300+ daily records; added RSI, MACD, and Bollinger Bands as input features — MAPE dropped from 17.45% to 8.86%
  • Sentiment analysis pipeline on financial news fused with price signals for buy/sell evaluation — backtested against S&P 500 benchmark
  • Production pipeline on AWS EC2 with systemd automation for continuous data refresh and model retraining; Streamlit dashboard with multi-timeframe selector
LSTM · Seq2Seq · Streamlit · AWS EC2
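
For reference, MAPE and one of the technical indicators named above (Bollinger Bands) can be computed as in this minimal sketch. The prices are toy values, and the 20-period window with 2 standard deviations is the conventional default, assumed here rather than taken from the project.

```python
from statistics import mean, pstdev

def mape(actual, forecast):
    """Mean absolute percentage error, in percent; actual values must be non-zero."""
    return 100 * mean(abs((a - f) / a) for a, f in zip(actual, forecast))

def bollinger_bands(prices, window=20, num_std=2.0):
    """Return (middle, upper, lower) bands for the most recent `window` prices."""
    recent = prices[-window:]
    middle = mean(recent)
    spread = num_std * pstdev(recent)  # population std dev of the window
    return middle, middle + spread, middle - spread

actual = [100.0, 110.0, 120.0]
forecast = [90.0, 121.0, 120.0]
error = mape(actual, forecast)

middle, upper, lower = bollinger_bands([10.0, 12.0, 14.0, 12.0, 12.0], window=5)
```

Because MAPE divides by the actual value, it suits price series that stay well away from zero.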
Customer Churn Prediction System
End-to-end ML pipeline that predicts at-risk telecom customers, explains the drivers behind each prediction, and surfaces actionable retention strategies — designed for direct business use.
  • Trained XGBoost on 7,000+ telecom records with 12 SQL-engineered features; achieved 88% recall on the minority (churn) class — the metric that matters most for retention campaigns
  • Cohort SQL analysis pinpointed that high-charge month-to-month customers drive 63% of churn, directly informing which segment to target first with discount offers
  • Deployed Streamlit app with a live decision-threshold slider and per-customer SHAP waterfall charts — lets non-technical stakeholders explore predictions without touching code
XGBoost · SHAP · SQL · Streamlit · Tableau
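
The effect of a decision-threshold slider on churn-class recall can be sketched as follows. The labels and probabilities are made up for illustration, and `recall_precision_at` is a hypothetical helper, not code from the deployed app.

```python
def recall_precision_at(threshold, y_true, churn_probs):
    """Recall and precision for the churn class when flagging probs >= threshold."""
    flagged = [p >= threshold for p in churn_probs]
    tp = sum(f and y == 1 for f, y in zip(flagged, y_true))
    fn = sum((not f) and y == 1 for f, y in zip(flagged, y_true))
    fp = sum(f and y == 0 for f, y in zip(flagged, y_true))
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    return recall, precision

# 1 = churned, 0 = retained; probabilities are toy model outputs.
labels = [1, 0, 1, 0, 0, 1, 0, 0]
probs = [0.9, 0.6, 0.4, 0.2, 0.35, 0.7, 0.1, 0.5]

recall_default, precision_default = recall_precision_at(0.5, labels, probs)
recall_lowered, precision_lowered = recall_precision_at(0.3, labels, probs)
```

Lowering the threshold flags more customers, so recall on the churn class can only stay the same or rise, usually at the cost of more false positives. That trade-off is what a threshold slider lets stakeholders explore.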
Human Action Recognition — CNN Models
Systematic benchmark of four CNN architectures for classifying 40 human actions across 9,500+ Stanford 40 images — comparing training-from-scratch versus transfer learning strategies.
  • Trained VGG16 and ResNet50 from scratch vs. GoogLeNet and DenseNet with ImageNet transfer learning — DenseNet transfer achieved best accuracy with 3x fewer training epochs
  • 80% accuracy on the held-out test set, served through a live Streamlit app for real-time prediction from webcam or image upload
  • Systematic ablation study comparing batch sizes, learning rates, and augmentation strategies across all four architectures — results documented with loss/accuracy curves
TensorFlow · ResNet50 · Transfer Learning · Streamlit
Text Summarization — AI Agent + Groq
Autonomous AI agent with Groq Llama 3.3 70B that detects document domain, selects the optimal summarisation strategy, evaluates quality with ROUGE-L, and auto-retries on low scores.
  • 4-step agentic pipeline: detect domain → select prompt strategy → evaluate with ROUGE-L → auto-retry if score below threshold — no human intervention needed
  • SSE streaming for real-time token output; multi-modal input (text, PDF, URL); export as TXT, JSON, or Markdown — all via a single Vanilla JS frontend
  • FastAPI backend + Docker container + GitHub Actions CI/CD pipeline deploying to HuggingFace Spaces on every push
Python · Groq · FastAPI · Docker · HuggingFace
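
The evaluate-and-retry step can be sketched with a small ROUGE-L implementation based on the longest common subsequence. What the agent scores against is not stated above, so this sketch takes a reference text as a parameter; the `summarise` callable, the threshold, and the attempt limit are all hypothetical stand-ins for the Groq-backed steps.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for tok_a in a:
        curr = [0]
        for j, tok_b in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if tok_a == tok_b else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]

def rouge_l_f1(reference, candidate):
    """ROUGE-L F1 on lowercased whitespace tokens (no stemming; a simplification)."""
    ref, cand = reference.lower().split(), candidate.lower().split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(cand), lcs / len(ref)
    return 2 * precision * recall / (precision + recall)

def summarise_with_retry(text, summarise, reference, threshold=0.3, max_attempts=3):
    """Call summarise(text, attempt) until ROUGE-L clears the threshold; keep the best."""
    best_summary, best_score = "", -1.0
    for attempt in range(max_attempts):
        summary = summarise(text, attempt)
        score = rouge_l_f1(reference, summary)
        if score > best_score:
            best_summary, best_score = summary, score
        if score >= threshold:
            break
    return best_summary, best_score

# Stub summariser: a poor first draft, then a better second one.
drafts = ["stocks fell sharply", "the cat on the mat"]
summary, score = summarise_with_retry(
    "source text", lambda text, attempt: drafts[attempt],
    reference="the cat sat on the mat", threshold=0.5, max_attempts=2,
)
```

Passing the attempt index to `summarise` lets a caller switch prompt strategy on each retry, and the loop returns the best-scoring draft even if no attempt clears the threshold.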

Skills

Programming
Python · SQL · Java · C
Generative AI & LLMs
RAG Pipelines · LangChain · LlamaIndex · Agentic AI · Prompt Engineering · Vector Databases · Semantic Search · LLM Fine-tuning · FAISS · Pinecone
Deep Learning
CNNs · LSTMs · Transformers · RNNs · Transfer Learning · Fine-Tuning · Hugging Face
Machine Learning
Supervised Learning · Unsupervised Learning · Feature Engineering · Model Optimization · Cross-Validation · Hyperparameter Tuning · EDA · Model Evaluation
Libraries
PyTorch · TensorFlow · Scikit-learn · Pandas · NumPy · Keras · Streamlit · Gradio
Data & Platforms
ETL/ELT Pipelines · Snowflake · PostgreSQL · MySQL · Vertex AI · Azure ML Studio · SageMaker · Databricks · BigQuery
MLOps & Deployment
MLflow · Docker · FastAPI · CI/CD · Experiment Tracking · Inference Optimization
Cloud
AWS (EC2, S3, SageMaker) · GCP
Visualization
Tableau · Power BI · Matplotlib · Seaborn
Dev Tools
Git · GitHub · VS Code · Jupyter · Cursor
Currently Learning
Terraform (Infrastructure as Code) · Databricks (Data Analytics Platform) · BigQuery (Data Warehouse)

Let's Build Something

Have a project in mind, a role to fill, or just want to talk AI?
My inbox is always open — I'll get back to you within 24 hours.

karthik.ai