Data Science Enthusiast | Applied Data Science & Analytics Student | Passionate about ML, AI, Data Engineering
I'm currently pursuing an M.Sc. in Applied Data Science and Analytics at SRH Hochschule Heidelberg, where I am honing my skills in data science, machine learning, and big data. With a solid foundation in programming, data engineering, and analytics, I'm passionate about creating data-driven solutions that provide actionable insights.
- Programming Languages: Python, SQL, R
- Data Science & Analytics: Machine Learning (ML), NLP, Neural Networks, Generative AI, Data Mining, Predictive Analytics, Regression, Classification, Clustering
- Data Engineering: ETL, Data Pipelines, Hadoop, Spark, Data Scraping, HDFS, BigQuery, MySQL, SQLite
- Cloud Technologies: Google Cloud Platform (GCP), Microsoft Azure
- Visualization & Reporting: Power BI, Tableau, SAS Visual Analytics, Excel
- Libraries & Frameworks: Pandas, Scikit-learn, TensorFlow, PyTorch, Scrapy, NLTK, spaCy
- Tools & Technologies: Git, Docker, Google Cloud SDK, API Integration, MS Office (Excel, Word, PowerPoint)
- Soft Skills: Leadership, Team Collaboration, Communication, Agile Methodology, Multi-tasking
- German Biography Generator: Developed a summarization tool using a large language model (LLM) to generate concise German biographies from various document formats (Word, CSV, PDF). Focused on coherence and exception handling. Tech Stack: NumPy, Pandas, Flask, API, NLP
- Hush Hush Recruiter Candidate Selection: Led a project to streamline candidate selection by extracting GitHub data via APIs, applying K-means clustering for filtering, and automating candidate communication through emails. Tech Stack: Scikit-learn, NumPy, Pandas, SQLite, Vercel
- Data Pipeline Project: Constructed a data pipeline integrating data fetching, processing, and storage in Google BigQuery, with visualizations in Tableau. Used Docker and the Google Cloud SDK for reproducible setup and portability. Tech Stack: Docker, MySQL, Google BigQuery, Tableau, Google Cloud SDK
- Integrated Data Pipeline (Hadoop, Scraping, DB, Testing): Developed a data scraping solution for an anime-related website, orchestrated scalable Hadoop infrastructure, and handled data analysis using HDFS, MySQL, and SQLite. Tech Stack: Docker, Hadoop, PySpark, Scrapy, MySQL, GCP
- Chatbot using RAG (Retrieval-Augmented Generation): Built a chatbot that combines the Llama 3 model with retrieval-augmented generation to answer student queries, orchestrating multiple modules.
- Prime Video Data Analysis with Power BI: Designed an interactive Power BI dashboard to analyze Prime Video content, offering insights for production strategies and enhancing audience engagement. Tech Stack: Power BI, Data Profiling, Data Cleaning
- Tech Mahindra, Bengaluru, India
Automation Testing (Mar 2022 – June 2023): Automated 7+ processes using Perl and Python, improving workflow efficiency by 18%. Optimized database queries, contributing to faster data retrieval times. Maintained dashboards for real-time tracking and reporting. Coordinated with project managers to meet development timelines and reduce manual effort by 40%.
Associate Software Engineer (Aug 2021 – Feb 2022): Resolved 30+ software bugs, improving system performance and reliability. Executed 40+ test scripts, achieving a 95% defect resolution success rate.
- M.Sc. in Applied Data Science & Analytics, SRH Hochschule Heidelberg (Oct 2023 – Present). Focus on Data Analytics, Data Mining, Big Data, SQL, Python, R
- Post Graduate Program in Data Science, Purdue University (Aug 2022 – Mar 2023). ML, Power BI, Python, R
Awards:
- "Pat on Back" Award at Tech Mahindra (2022)
- "Best Team Award" at Tech Mahindra (2022)
- Post Graduate Program in Data Science
- AZ-900: Microsoft Azure Fundamentals
- SQL - MySQL for Data Analytics
- Tableau Desktop 10 & SAS Certification
- LinkedIn: https://www.linkedin.com/in/rakesh-hs/
- Email: [email protected]