Available for Research Collaborations

Data Science
& Research
Enthusiast

Second-year CSE student, Senior Team Lead, and aspiring PhD researcher. I turn raw data into analytical narratives, bridging the gap between industry-grade machine learning and rigorous academic inquiry.

8+ Projects
40+ Certifications
98.2% Best AUC Score
Payal Mishra
Data Scientist // Researcher // CSE 2nd Year

Experience &
Professional Path

Feb 2026
Mar 2026
Domain Senior Team Lead
UpToSkills
Headed department operations, overseeing multi-team coordination and strategic execution of data science projects. Directed cross-functional teams ensuring alignment with organizational objectives, timelines, and quality benchmarks.
Jan 2026
Feb 2026
Team Lead
UpToSkills
Led and mentored a team of data analyst interns, fostering collaboration and technical growth. Applied analytical thinking to real-world data science projects—performing EDA, data cleaning, and preprocessing to extract meaningful insights.
Jan 2026
Data Analyst Intern
SkillsCraft Technology
Analyzed datasets using Python and SQL to derive actionable insights, supporting business decision-making in a fast-paced tech training environment.
Dec 2025
Jan 2026
Data Analyst Intern
UpToSkills
Conducted data analysis using Python and SQL to generate actionable business insights. Applied statistical techniques to validate findings and support data-driven decisions.
Apr 2025
Sep 2025
Web Developer
Andaman Dream Yatra
Built and deployed a full website for a tours and travel agency in Sri Vijaya Puram. Currently maintaining the live site remotely.

Education

🏫
Bachelor of Technology
Computer Science Engineering
Dr. B.R. Ambedkar Institute of Technology
Nov 2024 — Present • Sri Vijaya Puram
🎓
Intermediate (PCM)
with Computer Science
St. Mary's Senior Secondary School
July 2007 — May 2022 • Sri Vijaya Puram

Selected Work &
Analytical Inquiries

01 // Machine Learning
Credit Card Fraud Detection System
A production-ready fraud detection pipeline achieving 98.21% AUC on Kaggle's 284K transaction dataset with a 0.17% fraud rate. Engineered to handle extreme class imbalance in real-world financial data.
Python Scikit-Learn Imbalanced-learn SMOTE Pandas Matplotlib
Research Impact
The core challenge was the 584:1 class imbalance ratio. By applying SMOTE oversampling and precision-recall optimization instead of standard accuracy metrics, the model achieves high sensitivity without compromising specificity—a critical tradeoff in financial fraud detection.
AUC Score
98.21%
on 284K Transactions
Class Imbalance Ratio
584:1
Solved via SMOTE
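The oversampling idea behind SMOTE can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the project's actual pipeline: the function name `smote_sketch`, the toy 2-D points, and the parameter values are all hypothetical. In practice the `SMOTE` class from imbalanced-learn (listed in the stack above) would normally be used instead.

```python
import math
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority points by SMOTE-style interpolation.

    Each synthetic point lies on the segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        others = [p for j, p in enumerate(minority) if j != i]
        # k nearest minority neighbours of the chosen sample
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


# Toy data for illustration only (not the Kaggle dataset).
rng = random.Random(42)
legit = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
fraud = [(3.0, 3.0), (3.2, 2.9), (2.8, 3.1), (3.1, 3.3)]

new_fraud = smote_sketch(fraud, n_new=len(legit) - len(fraud), k=3)
balanced_fraud = fraud + new_fraud
print(len(legit), len(balanced_fraud))  # both classes now have 200 points
```

Oversampling of this kind is usually applied to the training split only, so that evaluation metrics such as AUC are still measured on the original, imbalanced distribution.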
02 // Healthcare Tech
Child Vaccination Management System
A production-ready system compliant with India's National Immunization Schedule (NIS) 2025 for healthcare providers and parents. Streamlines schedule tracking and notification workflows.
Python SQL Healthcare Data NIS 2025
Analytical Insight
Modeled real-world immunization schedules as a state machine: each child is a node and each vaccine is a timed edge. This graph-theoretic approach enables bulk schedule generation and missed-dose detection at scale.
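A rough sketch of how timed doses could drive bulk schedule generation and missed-dose detection is shown here. Everything in it is an assumption made for illustration: the dose names, the day offsets, the grace period, and the function names are hypothetical placeholders and do not reproduce the actual NIS 2025 schedule or the system's real data model.

```python
from datetime import date, timedelta

# HYPOTHETICAL schedule: (dose name, days after birth).
# Placeholder offsets only, not the real NIS 2025 schedule.
EXAMPLE_SCHEDULE = [
    ("dose_A", 0),
    ("dose_B", 42),
    ("dose_C", 70),
    ("dose_D", 98),
]


def build_schedule(birth_date, schedule=EXAMPLE_SCHEDULE):
    """Return {dose: due_date} for one child."""
    return {dose: birth_date + timedelta(days=offset) for dose, offset in schedule}


def missed_doses(birth_date, given, today, grace_days=7, schedule=EXAMPLE_SCHEDULE):
    """List doses whose due date plus grace period has passed without being given."""
    due = build_schedule(birth_date, schedule)
    return [
        dose
        for dose, due_date in due.items()
        if dose not in given and today > due_date + timedelta(days=grace_days)
    ]


# Bulk check over several children (toy records).
children = {
    "child_1": {"born": date(2025, 1, 1), "given": {"dose_A", "dose_B"}},
    "child_2": {"born": date(2025, 3, 1), "given": {"dose_A"}},
}
today = date(2025, 4, 1)
report = {cid: missed_doses(c["born"], c["given"], today) for cid, c in children.items()}
print(report)
```

Keeping the schedule as data rather than code means a revised schedule only requires editing the table, not the detection logic.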
03 // Time Series Analytics
Time Series Sales Forecasting
Transforms raw transactional data into daily and monthly sales series, visualizing trends and creating a baseline 6-month forecast to support inventory and planning decisions. Fully implemented in Python.
Python Pandas Matplotlib Time Series Kaggle
Analytical Insight
Applied seasonal decomposition to isolate trend, seasonality, and residual noise components. The 6-month forecast baseline serves as a reproducible benchmark for evaluating more complex ARIMA and Prophet models.
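The decomposition and baseline described above can be sketched with a classical additive decomposition in plain Python. This is an illustrative sketch, not the project's code: the function names, the 12-month period, the toy series, and the choice to hold the last trend value flat for the forecast are all assumptions. A library routine such as `seasonal_decompose` from statsmodels would be the more usual tool.

```python
def centered_moving_average(series, period=12):
    """2 x period centered moving average; None where the window does not fit."""
    half = period // 2
    trend = [None] * len(series)
    for t in range(half, len(series) - half):
        window = series[t - half : t + half + 1]
        # Half weight on both end points so the window spans exactly one period.
        total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        trend[t] = total / period
    return trend


def seasonal_component(series, trend, period=12):
    """Average detrended value per position in the cycle, centered on zero."""
    buckets = [[] for _ in range(period)]
    for t, (y, tr) in enumerate(zip(series, trend)):
        if tr is not None:
            buckets[t % period].append(y - tr)
    means = [sum(b) / len(b) for b in buckets]  # needs at least two full periods
    centre = sum(means) / period
    return [m - centre for m in means]


def baseline_forecast(series, horizon=6, period=12):
    """Baseline: last available trend value (held flat) plus the seasonal term."""
    trend = centered_moving_average(series, period)
    seasonal = seasonal_component(series, trend, period)
    last_trend = [v for v in trend if v is not None][-1]
    n = len(series)
    return [last_trend + seasonal[(n + h) % period] for h in range(horizon)]


# Toy monthly series: linear trend plus a fixed seasonal pattern (illustrative only).
pattern = [-10, -8, -4, 0, 3, 6, 8, 7, 4, 0, -2, -4]
monthly_sales = [100 + 2 * t + pattern[t % 12] for t in range(36)]

trend = centered_moving_average(monthly_sales)
seasonal = seasonal_component(monthly_sales, trend)
residual = [
    y - tr - seasonal[t % 12]
    for t, (y, tr) in enumerate(zip(monthly_sales, trend))
    if tr is not None
]
forecast = baseline_forecast(monthly_sales, horizon=6)
print([round(v, 2) for v in forecast])
```

Because the forecast holds the trend flat, it will lag a series that keeps growing; that limitation is what makes it useful as a floor against which ARIMA or Prophet models can be compared.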
04 // Computer Vision
Facial Recognition Attendance System
Real-time attendance tracking via webcam using TensorFlow Lite and OpenCV, logging directly to CSV.
Python TensorFlow Lite OpenCV
Research Impact
Edge ML deployment: runs inference locally without cloud dependency, enabling offline use in low-connectivity educational settings.
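The CSV logging step could look roughly like the sketch below. It covers only the logging side and assumes the recognition model has already produced a name; the function name `log_attendance`, the column layout, and the once-per-day rule are hypothetical choices, not details taken from the project.

```python
import csv
import os
import tempfile
from datetime import datetime


def log_attendance(csv_path, name, when):
    """Append (name, date, time) unless `name` is already logged for that date.

    Returns True if a new row was written, False if it was a repeat sighting.
    """
    day = when.date().isoformat()
    if os.path.exists(csv_path):
        with open(csv_path, newline="") as f:
            for row in csv.reader(f):
                if row[:2] == [name, day]:
                    return False
    with open(csv_path, "a", newline="") as f:
        csv.writer(f).writerow([name, day, when.time().isoformat(timespec="seconds")])
    return True


# Simulated recognitions (names are placeholders for model output).
path = os.path.join(tempfile.mkdtemp(), "attendance.csv")
results = [
    log_attendance(path, "student_a", datetime(2026, 1, 5, 9, 0, 0)),
    log_attendance(path, "student_a", datetime(2026, 1, 5, 9, 0, 2)),  # repeat frame
    log_attendance(path, "student_b", datetime(2026, 1, 5, 9, 1, 0)),
]
print(results)
```

A webcam loop sees the same face in many consecutive frames, so some form of de-duplication like this is typically needed to keep one row per person per day.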
05 // Big Data
US Accidents Big Data Analysis
Analyzed 7.7 million traffic records to identify accident hotspots and peak risk times via geospatial modeling.
Python Geospatial Sampling
Scale
7.7M records processed with strategic sampling to balance computational cost and statistical representativeness.
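One way to sample a large dataset while keeping it representative is proportional stratified sampling, sketched below. The source does not say which sampling scheme the project used, so this is an assumed technique; the `state` field, the 10% fraction, and the function name are hypothetical.

```python
import random
from collections import Counter, defaultdict


def stratified_sample(records, key, fraction, seed=0):
    """Sample the same fraction from every stratum, keeping at least one record each."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for rec in records:
        strata[key(rec)].append(rec)
    sample = []
    for group in strata.values():
        n = max(1, round(len(group) * fraction))
        sample.extend(rng.sample(group, n))
    return sample


# Toy records standing in for accident rows (illustrative only).
records = (
    [{"state": "CA", "id": i} for i in range(600)]
    + [{"state": "TX", "id": i} for i in range(300)]
    + [{"state": "NY", "id": i} for i in range(100)]
)
sample = stratified_sample(records, key=lambda r: r["state"], fraction=0.1)
counts = Counter(r["state"] for r in sample)
print(dict(counts))
```

Sampling within each group preserves the group proportions of the full data, which a plain random sample only approximates.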
06 // Longitudinal Study
World Population Analysis
Analyzed World Bank data from 1960 to 2023, tracking global growth trends across 63 years of demographic data.
Python Pandas World Bank API
Research Impact
Automated cleaning workflows ensure reproducible pipelines—essential for academic-grade longitudinal research.
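A small cleaning step of the kind described might be sketched as follows. The placeholder strings treated as missing, the function names, and the toy values are assumptions for illustration and are not real World Bank figures or the project's actual workflow.

```python
def clean_series(raw, missing=("", "..")):
    """Convert {year string: value string} into a sorted {int year: float}, dropping gaps."""
    cleaned = {}
    for year, value in raw.items():
        if value is None or str(value).strip() in missing:
            continue
        cleaned[int(year)] = float(value)
    return dict(sorted(cleaned.items()))


def annual_growth_rate(series):
    """Compound annual growth rate between the first and last available years."""
    years = list(series)
    first, last = years[0], years[-1]
    return (series[last] / series[first]) ** (1 / (last - first)) - 1


# Toy input with missing entries (values are made up).
raw = {"1960": "100", "1961": "..", "1962": "121", "1963": ""}
series = clean_series(raw)
growth = annual_growth_rate(series)
print(series, round(growth, 4))
```

Keeping the cleaning rules in one function means the same transformation can be rerun unchanged whenever the source data is refreshed.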

The Binary
Profile

+
Core Strengths
Current Competencies
End-to-End ML Pipelines
From raw data ingestion through EDA, feature engineering, model training, and evaluation using Python, Pandas, and Scikit-Learn.
Team Leadership under Pressure
Progressed from intern to Senior Team Lead within 3 months, coordinating multiple data science teams simultaneously.
Statistical Thinking
Deep comfort with regression, classification, imbalanced datasets, and time-series forecasting with an emphasis on valid inference over raw accuracy.
Rapid Certification & Self-Learning
40+ certifications from Google, Deloitte, Meta, Yale, and Cisco—demonstrating disciplined, continuous skill acquisition.
Computer Vision Applications
Deployed edge ML models with TensorFlow Lite and OpenCV for real-world attendance and recognition systems.
Research Methodology
Structured problem framing, reproducible workflows, and academic-grade documentation across all projects.
-
Future Areas of Mastery
PhD-Level Growth Targets
Deep Learning Theory
Bridging applied ML experience with rigorous mathematical foundations in neural architectures, backpropagation theory, and optimization landscapes.
Academic Research Writing
Developing the discipline of peer-reviewed publication: hypothesis formulation, literature synthesis, and structured scientific argumentation.
Large-Scale Distributed Computing
Moving beyond single-machine pandas workflows toward Spark, Dask, and cloud-native data processing for truly large datasets.
Advanced NLP and LLM Architecture
Building theoretical depth in transformer architectures, attention mechanisms, and fine-tuning strategies beyond surface-level API usage.
Causal Inference
Mastering the distinction between correlation and causation through Bayesian networks, instrumental variables, and do-calculus—critical for PhD-level research.
Domain Specialization
Converging broad interdisciplinary skills (healthcare, finance, social data) into a focused research niche for dissertation-level contribution.

Certifications &
Recognition

Data Science League: 2nd Place, All India
Google Advanced Data Analytics
Deloitte Data Analytics Simulation
Google Data Science Foundations
Forage Data Science Simulation
Google Cybersecurity Professional
Meta JavaScript Programming
Google UX Design Professional
Google Generative AI
Yale Introduction to Psychology
Nuts and Bolts of Machine Learning
Regression Analysis (Google)
NPTEL Design Thinking
Cisco Introduction to Cybersecurity
Google AI Essentials
Accenture Web Analytics
Google Fundamentals of Digital Marketing
Microsoft Excel Specialist
Full Stack Development (Udemy)
Figma High-Fidelity Prototypes
Tally Prime and GST
Google Linux and SQL
Investment Risk Management
Building Dynamic UI (Google)

Let's build something
meaningful

Open to research collaborations, data science internships, and academic discussions. Currently laying the groundwork for PhD-level inquiry in machine learning and data-driven systems.