
Data Engineer Portfolio

Hello world! My name is Pierre-Alexandre, and I'm excited to share my data engineering portfolio, which is still very much a work in progress. This repository showcases my data engineering and data analytics skills, shares my projects, and tracks my progress. Within it, you'll find a catalog of projects from data engineering and analytics courses and group work, each covering essential techniques and skills.

Project 1: Apache Kafka streaming on Azure

  • Brief overview: The goal of this project was to learn how Apache Kafka streaming works. For training purposes, I used a free Azure B1s virtual machine and simulated the data stream to avoid memory problems on the instance. Each event in the simulated stream was a sample JSON record drawn from an existing dataset. The streamed data was uploaded to Azure Blob Storage and analyzed with Azure Data Factory; a minimal producer sketch follows this project's bullets.
  • Technology used: Azure (VM, Blob Storage, Azure Functions, Azure DevOps, Azure Data Lake Analytics, Azure Synapse), Kafka, Docker, Python
  • Outcome: Stream-processing of live flight data with Spark, uploaded to Azure Blob Storage for analytics in Azure Synapse
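
To illustrate how the stream was simulated, here is a minimal sketch in the spirit of the project, assuming a local Kafka broker, the kafka-python client, and a CSV of historical flight records; the flight-events topic, broker address, and file name are placeholders rather than the actual project configuration.

```python
# A minimal sketch of the stream simulation, assuming a local Kafka broker,
# the kafka-python client, and a CSV of historical flight records. The topic
# name, broker address, and file path are placeholders.
import time

import pandas as pd
from kafka import KafkaProducer

producer = KafkaProducer(bootstrap_servers="localhost:9092")

# Replay each row of the existing dataset as a single JSON event.
flights = pd.read_csv("flights_sample.csv")
for _, row in flights.iterrows():
    event = row.to_json().encode("utf-8")
    producer.send("flight-events", value=event)
    time.sleep(1)  # throttle the replay so the small B1s instance keeps up

producer.flush()
```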

Project 2: Airflow ETL pipeline on GCP

  • Brief overview: Apache Airflow was used to schedule and orchestrate an ETL pipeline from three public APIs into BigQuery, with transformations in dbt and results displayed in a Streamlit web app. Infrastructure as code with Pulumi was developed to deploy the GCP instances automatically; a minimal DAG sketch follows this project's bullets.
  • Technology used: GCP (VM, Buckets, BigQuery, Cloud Functions), Apache Airflow, dbt Core, Pulumi, Python
  • Outcome: Batch-processing of data from three APIs into BigQuery every 10 minutes, with traffic displayed on a Streamlit dashboard
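
The sketch below shows the scheduling pattern in simplified form, assuming Airflow 2.4+ with the TaskFlow API and the google-cloud-bigquery client; the example API URL, dataset, and table names are placeholders, and only one of the three APIs is shown.

```python
# A minimal sketch of the scheduling idea, assuming Airflow 2.4+ with the
# TaskFlow API and the google-cloud-bigquery client. The API URL, dataset,
# and table names below are placeholders, not the project's real configuration.
from datetime import datetime

import requests
from airflow.decorators import dag, task
from google.cloud import bigquery


@dag(schedule="*/10 * * * *", start_date=datetime(2024, 1, 1), catchup=False)
def api_to_bigquery():
    @task
    def extract() -> list[dict]:
        # One of the three public APIs; the other two follow the same pattern.
        resp = requests.get("https://example.com/api/traffic", timeout=30)
        resp.raise_for_status()
        return resp.json()

    @task
    def load(rows: list[dict]) -> None:
        # Append the raw records to a staging table; dbt handles the transformations.
        client = bigquery.Client()
        errors = client.insert_rows_json("my-project.raw.traffic_events", rows)
        if errors:
            raise RuntimeError(f"BigQuery insert failed: {errors}")

    load(extract())


api_to_bigquery()
```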

Project 3: Chatbot Chrome extension for Doctolib

  • Brief overview: In this group project, we used front-end languages to develop a chatbot as a Chrome extension for the Doctolib website. It answers basic user questions by matching keywords; a small sketch of that matching logic follows this project's bullets.
  • Technology used: JavaScript, CSS, HTML, Google Developer tools (web extensions)
  • Final result: A Chrome extension that launches only on the Doctolib website and injects HTML into the page for chatbot interaction
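
Since the portfolio's code examples are in Python, the sketch below only mirrors the keyword-matching idea behind the chatbot's replies; the real extension is written in JavaScript, and the keywords and canned answers shown here are purely illustrative.

```python
# The extension itself is written in JavaScript; this Python sketch only mirrors
# the keyword-matching idea behind its answers. The keywords and replies are
# illustrative, not the extension's actual content.
RESPONSES = {
    ("book", "appointment", "rendez-vous"): "Use the search bar to find a practitioner and pick a time slot.",
    ("cancel",): "Open 'My appointments' and select the booking you want to cancel.",
    ("hours", "open"): "Opening hours are shown on each practitioner's profile page.",
}


def answer(question: str) -> str:
    """Return the first canned reply whose keywords appear in the question."""
    text = question.lower()
    for keywords, reply in RESPONSES.items():
        if any(word in text for word in keywords):
            return reply
    return "Sorry, I didn't understand that. Could you rephrase your question?"


if __name__ == "__main__":
    print(answer("How do I book an appointment?"))
```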

Project 4: Predicting ICU patient survival

  • Brief overview: With my teammates, I predicted the chances of survival of patients entering a hospital intensive care unit; a minimal modelling sketch follows this project's bullets.
  • Methodology: data cleaning, data analysis, machine learning, visualization, drawing conclusions
  • Technology used: Python, pandas, scikit-learn, matplotlib, NumPy, LaTeX
  • Final results: analysis and visualizations
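
The sketch below gives a rough idea of the modelling step with scikit-learn; the dataset file, column names, and the choice of a random forest baseline are assumptions for illustration, not the exact pipeline the team used.

```python
# A minimal sketch of the modelling step, assuming a tabular dataset with numeric
# features and a binary 'survived' target. The file name, column names, and model
# choice are illustrative; the group project compared several approaches.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

df = pd.read_csv("icu_patients.csv")
X = df.drop(columns=["survived"])
y = df["survived"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Impute missing vitals/labs, then fit a baseline classifier.
model = make_pipeline(
    SimpleImputer(strategy="median"),
    RandomForestClassifier(random_state=42),
)
model.fit(X_train, y_train)

print("ROC AUC:", roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]))
```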

Others

My LinkedIn Profile
