I'm a Data Scientist who enjoys working with numbers, developing models, and designing clear visuals that communicate insights effectively and help people make better decisions.
About Me
Hi, I am Anbu. I see data as something both mysterious and powerful. My journey began in 2021 during an internship at a startup, where I realized I was not just cleaning data and building dashboards.
I was solving real problems, finding patterns, and delivering insights that mattered. That experience shaped my decision to pursue a Master’s in Data Analytics at The George Washington University.
Since then, I have developed a strong foundation in Python, SQL, Tableau, Power BI, and AWS. I earned certifications in AWS Machine Learning Foundations and Stanford’s Machine Learning course, both of which solidified my understanding of core ML algorithms and of how to translate raw data into real insights. I also placed 2nd in GWU’s AI/ML Bracket Challenge (2024) with a score of 1210 points.
I have a strong appetite for learning and am always eager to explore new technologies. To me, data is like a puzzle, and I enjoy piecing together the elements to reveal the bigger picture through modeling, analysis, or clear visuals.
Outside of data, I enjoy reading, digital illustration, design, and music. I am always curious, always learning, and always ready to connect the dots.
Education
The George Washington University
Sri Ramachandra Engineering and Technology
Skills
Professional Experience
Data Analyst Intern
- Developed Python-based data preprocessing pipeline for 3+ years of e-commerce data (14K+ records), reducing processing time by 90% and improving data accuracy by 80%
- Executed K-Medoids-based customer segmentation, identifying actionable user clusters to optimize marketing strategies and improve customer retention
- Performed churn analysis using Random Forest, achieving 81% accuracy in predicting customer churn to support targeted retention strategies
Flutter App Developer
- Designed user-friendly interface for internship portal, increasing user engagement by 40%
- Integrated web services to streamline job postings and candidate data retrieval, reducing data processing time by 50%
- Conducted extensive testing and debugging to ensure seamless user experience and enhance platform reliability
Machine Learning Projects
Healthcare ACL Injury Prediction
A machine learning model to predict Anterior Cruciate Ligament (ACL) injuries using healthcare data, aiming for early diagnosis and prevention.
Sparks Foundation Prediction Tasks
A collection of prediction tasks from The Sparks Foundation GRIP, showcasing various supervised and unsupervised learning models.
ATIS Intent Classification
An NLP model to classify user intents from the Airline Travel Information System (ATIS) dataset, crucial for building effective chatbots.
ASL Recognition
A computer vision project to recognize American Sign Language gestures from images using deep learning models.
Crime Analytics
An analysis of crime data to identify patterns, hotspots, and trends, providing insights for law enforcement and public safety strategies.
SQL Projects
Nobel Prize Analysis
A SQL-based exploration of the Nobel Prize dataset to uncover trends, patterns, and interesting facts about the prestigious award and its laureates.
Flight Delays Analysis
Using SQL to query and analyze a large dataset of flights to identify the primary causes of delays and evaluate airline performance.
Data Visualization Projects
Certifications
My Project Blogs
Decoding Images: GPT-4’s Visual Brilliance
An exploration of GPT-4's image analysis capabilities, from understanding complex scenes to generating detailed descriptions.
Classify or Defy: Exploration of Image Classifier Limits
A deep dive into the boundaries of image classifiers, examining how they perform with ambiguous or out-of-distribution images.
Sky-High Solutions: YOLOv8 for Wind Turbine Detection
This article details the process of using YOLOv8 for detecting wind turbines, a key task in renewable energy monitoring.
Digital Diagnosis: Mapping Basal Cell Carcinoma
Using computer vision techniques to analyze and map the development of Basal Cell Carcinoma from medical imaging.
The Journey of Plastic: From Production to Ocean Pollution
A data-driven story visualizing the lifecycle of plastic and its impact on our oceans, using various data visualization techniques.
Get In Touch
Ready to collaborate on exciting data science projects? Let's connect and discuss how we can turn data into valuable insights.