- Programming Languages: Python, R
- Machine Learning & Data Science: scikit-learn, TensorFlow, NLP (spaCy, NLTK), NumPy, Pandas, PyTorch, XGBoost, LightGBM, CatBoost, Feature Engineering, Model Optimization
- Database Management: MySQL, SQL, NoSQL, Snowflake
- Data Visualization & Analytics: Power BI, Tableau, Matplotlib, Seaborn
- MLOps & Deployment: Docker, Kubernetes, FastAPI, Flask, Model Monitoring (MLflow, TensorBoard)
- Big Data & Cloud: Google Cloud Platform (GCP), AWS, Apache Spark, Hadoop, Apache Kafka
- Spreadsheets: MS Excel, Google Sheets
-
Heuristics Pharma Perception – Data Analyst
January 2025 – Present | New York, NY
- Engineered a model-serving infrastructure using Kubernetes and FastAPI, reducing model inference time by 30% while enabling scalable microservices for model APIs.
- Developed an embeddings-based similarity model using FAISS and OpenAI embeddings, enhancing content-based retrieval and improving response accuracy by 15%.
- Built and fine-tuned NLP pipelines leveraging BERT models for automated entity recognition and classification in medical reports, resulting in a 20% increase in processing speed.
- Integrated a high-speed vector search pipeline with Neo4j and Elasticsearch to enable semantic question-answering over large datasets with latency under 300 ms.
- Automated model monitoring and retraining workflows using MLflow and Prefect, achieving continuous model improvement and reducing manual intervention by 70%.
- Orchestrated ETL workflows for unstructured data processing using Apache Airflow and deployed data transformation scripts across multiple data lakes and APIs.
-
Sriroz Consultants Private Limited – Business Analyst
November 2021 – July 2023 | Pune, India
- Redesigned SQL-based sales reports to integrate real-time customer purchase behavior analysis, reducing report generation time by 30% and enabling faster decision-making.
- Performed exploratory data analysis (EDA) on customer retention patterns using Python and MySQL, identifying key churn factors and suggesting strategies that reduced churn by 12%.
- Developed interactive Power BI dashboards that automated sales reporting and improved forecasting accuracy by 25% through data modeling and DAX optimization.
- Integrated Python-based CRM automation using Pandas and Selenium to track lead-to-conversion journeys, increasing conversion by 20% and adding $105K in revenue.
- Conducted competitive pricing analysis using web-scraped competitor data and market trends, which informed dynamic pricing strategies that boosted profit margins by 12%.
- Implemented customer segmentation models (K-means and decision trees) that improved targeted marketing campaigns, leading to a 22% increase in repeat purchases.
-
AI-Powered Patient Risk Prediction System
January 2025 – March 2025
- Developed a machine learning model for early prediction of patient readmission risk using XGBoost and Random Forest algorithms.
- Automated patient data ingestion from EHRs and wearables using Apache Airflow and stored the data in a Snowflake warehouse.
- Built and monitored live pipelines with Apache Spark Streaming and Kafka, enabling real-time risk dashboarding.
- Implemented BERT models to extract risk indicators from unstructured physician notes, improving feature richness by 30%.
- Deployed the containerized model to HIPAA-compliant Kubernetes infrastructure on AWS for scalability.
-
Emotional Chatbot for Multimodal Emotion Recognition
December 2024
- Built an emotion-recognition chatbot integrating a text-based NLP classifier (Random Forest) and a voice-based CNN model (Keras), reaching 90% accuracy.
- Designed a modular structure supporting APIs for music (Spotify), videos (YouTube), activities (Eventbrite), and podcasts (Listen Notes).
- Trained models on augmented datasets with oversampling and feature engineering, improving the F1-score to 0.88.
- Developed and deployed a UI using Flask and Bootstrap for real-time use in mental health and education domains.
- Learning SnowflakeDB
- Microsoft Project for the Web: Reporting with Power BI
- MySQL Data Analysis
- BCG - Data Science Job Simulation
- Goldman Sachs - Software Engineering Job Simulation
- Quantium - Data Analytics Job Simulation
- R for Data Science: Analysis and Visualization
- Excel Skills for Business: Intermediate I & II
-
Pace University, Seidenberg School of Computer Science and Information Systems
2023 – 2025 | New York
MS in Information Systems | Concentration: Information Systems | Honors: Masters
-
Savitribai Phule Pune University – Modern Institute of Business Studies
2020 – 2022 | Pune, India
MBA | Concentration: Business Analytics | Honors: Masters
-
Savitribai Phule Pune University – Dr. D. Y. Patil College
2017 – 2020 | Pune, India
BBA | Concentration: HR Management | Honors: Bachelors