Teraces12/IBM_Data_Science_Capstone_Project

title: Winning Space Race with Data Science
author: Dr LEBEDE Ngartera
date: 12--23--2023
output:
  html_document:
    toc: true
    number_sections: true
  pdf_document:
    toc: true


🚀 Applied Data Science Capstone

This Capstone is the 10th (final) course in the IBM Data Science Professional Certificate specialization. It brings together, in a single project, the material covered throughout the specialization.

📄 Project Background

SpaceX is the most successful company of the commercial space age, making space travel affordable. The company advertises Falcon 9 rocket launches on its website at a cost of 62 million dollars; other providers charge upward of 165 million dollars per launch. Much of the savings comes from SpaceX's ability to reuse the first stage. Therefore, if we can determine whether the first stage will land, we can determine the cost of a launch. Using public information and machine learning models, we predict whether SpaceX will reuse the first stage.

📄 Questions to be answered

  1. How do variables such as payload mass, launch site, number of flights, and orbit affect the success of the first stage landing?
  2. Does the rate of successful landings increase over the years?
  3. What is the best algorithm that can be used for binary classification in this case?

📄 Methodology

1. Data collection methodology

  1. Using the SpaceX REST API
  2. Using web scraping from Wikipedia
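
Once a response has been fetched, the nested launch records need to be flattened into rows before any analysis. The following is a minimal sketch of that step only, working offline on a made-up payload. The field names (`flight_number`, `date_utc`, `payload`, `landing`) and all values are assumptions for illustration, not the actual schema or data returned by the SpaceX API.

```python
import json

# Made-up sample payload. Field names and values are illustrative assumptions,
# not the real SpaceX API schema or real launch data.
SAMPLE_RESPONSE = """
[
  {"flight_number": 1, "date_utc": "2010-01-01T00:00:00Z",
   "payload": {"mass_kg": null, "orbit": "LEO"},
   "landing": {"success": false}},
  {"flight_number": 2, "date_utc": "2012-01-01T00:00:00Z",
   "payload": {"mass_kg": 525.0, "orbit": "LEO"},
   "landing": {"success": true}}
]
"""


def flatten_launches(raw_json):
    """Turn nested launch records into flat rows, one dict per launch."""
    rows = []
    for launch in json.loads(raw_json):
        rows.append({
            "flight_number": launch["flight_number"],
            # The year is the first four characters of the ISO timestamp.
            "year": int(launch["date_utc"][:4]),
            "payload_mass_kg": launch["payload"]["mass_kg"],
            "orbit": launch["payload"]["orbit"],
            "landed": launch["landing"]["success"],
        })
    return rows


rows = flatten_launches(SAMPLE_RESPONSE)
for row in rows:
    print(row)
```

Missing values (here the `null` payload mass) are kept as `None` at this stage so that they can be handled explicitly during data wrangling.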

2. Performed data wrangling

  1. Filtering the data
  2. Dealing with missing values
  3. Using one-hot encoding to prepare the data for binary classification
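
The wrangling steps above can be sketched in plain Python as follows. This is an illustration under stated assumptions: the rows are made up, the column names (`payload_mass_kg`, `orbit`, `landed`) are hypothetical, and replacing missing values with the column mean is one common choice, not necessarily the one used in the project notebooks.

```python
def fill_missing_with_mean(rows, column):
    """Replace None in `column` with the mean of the values that are present.

    Mean imputation is an assumption made for this sketch.
    """
    present = [r[column] for r in rows if r[column] is not None]
    mean = sum(present) / len(present)
    return [{**r, column: mean if r[column] is None else r[column]} for r in rows]


def one_hot_encode(rows, column):
    """Replace a categorical column with one 0/1 column per category."""
    categories = sorted({r[column] for r in rows})
    encoded = []
    for r in rows:
        new_row = {k: v for k, v in r.items() if k != column}
        for cat in categories:
            new_row[f"{column}_{cat}"] = 1 if r[column] == cat else 0
        encoded.append(new_row)
    return encoded


# Made-up rows for illustration only.
launches = [
    {"payload_mass_kg": 500.0, "orbit": "LEO", "landed": 0},
    {"payload_mass_kg": None, "orbit": "GTO", "landed": 1},
    {"payload_mass_kg": 1500.0, "orbit": "LEO", "landed": 1},
]

clean = one_hot_encode(fill_missing_with_mean(launches, "payload_mass_kg"), "orbit")
for row in clean:
    print(row)
```

After this step every feature is numeric, and the `landed` column serves as the binary class label.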

3. Performed exploratory data analysis (EDA) using visualization and SQL
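
As an example of the SQL side of the EDA, the query below computes the landing success rate per year, which relates to the second question above. It is a sketch using Python's built-in `sqlite3` module with an in-memory database; the table name `launches`, its columns, and the rows are all made up for illustration and do not come from the project's dataset.

```python
import sqlite3

# In-memory database with a hypothetical schema and made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE launches (year INTEGER, site TEXT, landed INTEGER)")
conn.executemany(
    "INSERT INTO launches VALUES (?, ?, ?)",
    [
        (2015, "Site A", 0),
        (2015, "Site B", 1),
        (2016, "Site A", 1),
        (2016, "Site A", 1),
        (2016, "Site B", 0),
        (2016, "Site B", 1),
    ],
)

# AVG over a 0/1 column gives the fraction of successful landings.
success_by_year = conn.execute(
    """
    SELECT year, AVG(landed) AS success_rate, COUNT(*) AS n_launches
    FROM launches
    GROUP BY year
    ORDER BY year
    """
).fetchall()

for year, rate, n in success_by_year:
    print(year, rate, n)

conn.close()
```

The same `GROUP BY` pattern applies to other categorical columns, for example grouping by `site` to compare launch sites.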

4. Performed interactive visual analytics using Folium and Plotly Dash

5. Performed predictive analysis using classification models

Building, tuning, and evaluating classification models to ensure the best results
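
The evaluation step can be illustrated with a small sketch that compares models by accuracy on a held-out test set and inspects the confusion matrix of the best one. It covers evaluation only, not model building or tuning. The model names and prediction lists are hypothetical placeholders, not results from this project.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def confusion_matrix(y_true, y_pred):
    """Counts of true/false positives/negatives for binary labels (0 or 1)."""
    counts = {"tp": 0, "tn": 0, "fp": 0, "fn": 0}
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            counts["tp"] += 1
        elif t == 0 and p == 0:
            counts["tn"] += 1
        elif t == 0 and p == 1:
            counts["fp"] += 1
        else:
            counts["fn"] += 1
    return counts


# Hypothetical test labels (1 = landed) and predictions from two placeholder models.
y_test = [1, 0, 1, 1, 0, 1]
predictions = {
    "model_a": [1, 0, 1, 0, 0, 1],
    "model_b": [1, 1, 1, 1, 1, 1],
}

scores = {name: accuracy(y_test, pred) for name, pred in predictions.items()}
best_model = max(scores, key=scores.get)

print(scores)
print(best_model, confusion_matrix(y_test, predictions[best_model]))
```

The confusion matrix is worth checking alongside accuracy: a model that always predicts a successful landing, like the second placeholder here, can still score reasonably well when most launches in the test set succeed.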