Emilio Conde Data Science Portfolio

Note: The output is in an HTML file. For accurate visualization, please download the file.

This project examines the key differences between businesses started by one person and those started by a team of co-founders. The goal is to figure out what makes a company successful and what might cause it to close or face acquisition.

  • Data was taken from the PitchBook platform and extended by web scraping entrepreneurs' LinkedIn profiles.
  • The analysis covers differences in funding rounds, funding location, skills across teams, and more.
  • The model was built using logistic regression.
  • 90%+ accuracy across both training and testing datasets.
  • Predictions were made on data from real companies.

Time to move across rounds:

  • Solo founders can take up to twice as many months as co-founder teams to secure a funding round.

Area Under the Curve for logistic regression model:

| AUC for testing dataset | AUC for training dataset |
| --- | --- |
| 94.1% | 93.9% |
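For readers unfamiliar with the metric, AUC can be read as the probability that the model scores a randomly chosen successful company higher than a randomly chosen unsuccessful one. The sketch below computes it by direct pairwise comparison. It is not the code used in this project; the function name `auc_score` and the toy labels and scores are illustrative assumptions.

```python
def auc_score(labels, scores):
    """Return the ROC AUC as P(random positive is scored above random negative).

    A tie between a positive and a negative counts as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data (made up): 1 = successful company, 0 = closed company.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.10, 0.40, 0.35, 0.80]
print(auc_score(toy_labels, toy_scores))  # 0.75 (3 of 4 positive/negative pairs ranked correctly)
```

The pairwise loop is quadratic, which is fine for illustration; a library routine would normally be used on a real dataset.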

Predictions on real world data:

| Chance of Company A being successful | Chance of Company B being successful |
| --- | --- |
| 87% | 9% |
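A logistic regression model turns a company's features into a chance of success by passing a weighted sum through the sigmoid function. The sketch below shows that step only. The feature names, weights, and intercept are hypothetical placeholders, not the fitted values from this project.

```python
import math


def success_probability(features, weights, intercept):
    """Map a feature dict to a probability with the logistic (sigmoid) function."""
    z = intercept + sum(weights[name] * value for name, value in features.items())
    # Numerically stable sigmoid: avoid exp() overflow for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


# Hypothetical weights, for illustration only.
example_weights = {"is_solo_founder": -0.8, "funding_rounds": 0.5}
example_company = {"is_solo_founder": 1.0, "funding_rounds": 3.0}
print(success_probability(example_company, example_weights, intercept=-0.2))
```

With a fitted model, the weights and intercept would come from training rather than being typed in by hand.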

This project significantly enhanced the efficiency of data scraping from company profiles, providing tangible benefits to Endeavor Mexico.

  • The code is ready to use and deploy on different computers. Note for first-time Selenium users: install the required libraries before running it to prevent errors.
  • The code only needs a .csv file listing the URLs to retrieve.
  • Added value: The scraper retrieves comprehensive employee demographic information, including locations, majors, skills, and occupations.
  • Executing all the provided code will yield a clean, ready-to-analyze data frame for any required analysis.
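As a rough illustration of the input step described above, the sketch below reads URLs from a .csv file using only the standard library. The assumption that URLs sit in the first column, the header-skipping rule, and the helper names `parse_urls` and `load_urls` are all hypothetical; the project's own scraper may read the file differently.

```python
import csv


def parse_urls(lines):
    """Collect URLs from the first column of CSV rows.

    Rows that are blank or whose first cell is not an http(s) URL
    (for example a header row) are skipped.
    """
    urls = []
    for row in csv.reader(lines):
        if not row:
            continue
        cell = row[0].strip()
        if cell.lower().startswith(("http://", "https://")):
            urls.append(cell)
    return urls


def load_urls(csv_path):
    """Read a CSV file from disk and return its list of URLs."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        return parse_urls(handle)


# Example with in-memory rows instead of a file on disk.
print(parse_urls(["url", "https://example.com/company/a", "", "https://example.com/company/b"]))
```

The resulting list can then be handed to the scraping loop one URL at a time.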

Data Frame output glimpse:

| Company Name | Industry | Location | Employees | N1_Country | Country1 | N1_School | School1 | N1_Major | Major1 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| STUCK? | Education | Al Rabie District, Riyadh | 8 | 2 | Egypt | 4 | University of Oxford | 3 | Teacher Education Multiple Levels |
| THIQAH | IT Services and IT Consulting | Al Sahafah Dist, Riyadh | 1100 | 916 | Saudi Arabia | 218 | King Saud University | 169 | Computer Science |
| MRSOOL مرسول | Technology, Information and Internet | Riyadh, RIYADH | 500 | 1343 | Saudi Arabia | 37 | King Saud University | 35 | |
| Dsquares | Technology, Information and Internet | Riyadh | 4 | 170 | Egypt | 30 | Cairo University | 32 | Computer Science |
| Soum | Internet Marketplace Platforms | Riyadh | 61 | 43 | Saudi Arabia | 11 | King Fahd University of Petroleum & Minerals | 6 | Industrial Engineering |
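The paired columns in this output (for example `N1_Country` and `Country1`) appear to hold the size and the name of the most common category among a company's employees. That reading is an assumption, not something the README states. Under that assumption, one way to derive such a pair from a list of scraped values is sketched below; `top_category` is a hypothetical helper name.

```python
from collections import Counter


def top_category(values):
    """Return (count, name) for the most frequent non-empty value.

    Returns (0, "") when there is nothing to count.
    """
    counts = Counter(v for v in values if v)
    if not counts:
        return 0, ""
    name, count = counts.most_common(1)[0]
    return count, name


# Toy input (made up): countries listed on three employee profiles.
print(top_category(["Egypt", "Saudi Arabia", "Egypt"]))  # (2, 'Egypt')
```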

Explore my portfolio of sports-related projects where I blend my passion for data analysis with my love for sports.

Tracking NBA Season Leaders Over a Decade:

  • I've delved into the past decade's box score data using SQLite for a deep-dive analysis.
  • The goal of this project was to do the data manipulation in SQL, with only minimal touch-up in R.
  • The resulting graph highlights the league's top performers across key stats such as 3PT%, PPG, assists, and rebounds.
  • SQL Queries & Data Source
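To give a flavor of this kind of query, the sketch below finds the points-per-game leader for each season in an in-memory SQLite database. The table name `box_scores`, its columns, and the toy rows are hypothetical and do not reflect the project's actual schema or data; the linked SQL queries remain the reference.

```python
import sqlite3


def build_toy_db():
    """Create an in-memory database with a hypothetical box_scores table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE box_scores (season INTEGER, player TEXT, pts INTEGER)")
    rows = [
        (2021, "Player A", 30), (2021, "Player A", 20),
        (2021, "Player B", 28), (2021, "Player B", 26),
        (2022, "Player A", 40), (2022, "Player A", 30),
        (2022, "Player B", 10), (2022, "Player B", 20),
    ]
    conn.executemany("INSERT INTO box_scores VALUES (?, ?, ?)", rows)
    return conn


def season_ppg_leaders(conn):
    """Return (season, player, ppg) for the top scorer of each season."""
    query = """
        WITH per_player AS (
            SELECT season, player, AVG(pts) AS ppg
            FROM box_scores
            GROUP BY season, player
        )
        SELECT p.season, p.player, p.ppg
        FROM per_player AS p
        WHERE p.ppg = (SELECT MAX(q.ppg) FROM per_player AS q WHERE q.season = p.season)
        ORDER BY p.season
    """
    return conn.execute(query).fetchall()


print(season_ppg_leaders(build_toy_db()))
# [(2021, 'Player B', 27.0), (2022, 'Player A', 35.0)]
```

The same pattern extends to assists or rebounds by swapping the aggregated column; a percentage stat like 3PT% would need made and attempted totals instead of a simple average.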

A Deep Dive Into Luka Doncic's 60-Point Game: Shot Chart Analysis:

  • Luka had a record-breaking game on December 27, 2022. What did his shot selection and shot making look like during that game?