Welcome to the NBA Spread Predictor! This project leverages machine learning to predict the point spread outcomes of NBA games based on historical data and various statistical features. The project showcases my skills in data science, machine learning, web development, and software engineering.
- Project Overview
- Features
- Technologies Used
- Setup and Installation
- Usage
- Project Structure
- Data Collection
- Machine Learning Model
- Streamlit Web App
- Future Improvements
- License
- Data Scraping: Collects data from various NBA-related websites.
- Data Preprocessing: Cleans and preprocesses the scraped data for analysis.
- Model Training: Trains machine learning models to predict game spreads.
- Interactive Web Interface: Provides an interactive web interface using Streamlit for users to input game details and get predictions.
- Deployment: Deployable on Streamlit Community Cloud for easy access.
- Python: The main programming language used.
- Pandas: For data manipulation and analysis.
- Scikit-learn: For machine learning model development.
- Jupyter Notebooks: For data exploration and prototyping.
- Streamlit: For building the web application.
- BeautifulSoup: For web scraping.
- Git: For version control.
Prerequisites:
- Python 3.8 or higher
- Git
- Clone the repository:
    git clone https://github.com/noahw8299/nba-spread-predictor.git
    cd nba-spread-predictor
- Set up a virtual environment and activate it:
    python3 -m venv nba_env
    source nba_env/bin/activate  # On Windows use `nba_env\Scripts\activate`
- Install the dependencies:
    pip install -r requirements.txt
To run the app on your own machine:
- Start the Streamlit app:
    streamlit run app/app.py
- Open your browser and go to http://localhost:8501 to interact with the app.
If you wish to see and interact with the app without setting it up locally, you can access it here.
nba-spread-predictor
├── app
│   ├── app.py              # Main application file for Streamlit
│   ├── predict_page.py     # Prediction page script
│   ├── predict_model.pkl   # Trained model for predictions
│   ├── model_scaler.pkl    # Scaler used for feature scaling
│   ├── images              # Directory for storing images
│   └── result.csv          # CSV file with historical game data
├── model
│   ├── predict             # Directory for prediction-related scripts and notebooks
│   └── scrape              # Directory for scraping-related scripts and notebooks
├── nba_env                 # Virtual environment directory (not tracked by git)
├── requirements.txt        # List of dependencies
└── README.md               # Project README file
The data is collected from various NBA-related websites using web scraping techniques. The scripts in the scrape directory are responsible for fetching and storing this data.
- scrape/get_data.ipynb: Notebook for scraping game data.
- scrape/parse_data.ipynb: Notebook for parsing and cleaning scraped data.
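As a rough illustration of the parsing step, the sketch below turns an HTML results table into one record per game and derives the home margin. It is not the notebooks' actual code: the sample markup, the column names, and the `TableParser` and `parse_games` helpers are all hypothetical, and it uses the standard library's `html.parser` as a dependency-free stand-in for BeautifulSoup.

```python
from html.parser import HTMLParser

# Hypothetical sample markup; real pages will be structured differently.
SAMPLE_HTML = """
<table id="games">
  <tr><th>Date</th><th>Away</th><th>Away PTS</th><th>Home</th><th>Home PTS</th></tr>
  <tr><td>2024-01-05</td><td>Celtics</td><td>112</td><td>Lakers</td><td>118</td></tr>
  <tr><td>2024-01-06</td><td>Knicks</td><td>101</td><td>Bulls</td><td>97</td></tr>
</table>
"""


class TableParser(HTMLParser):
    """Collect the text of every <th>/<td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # completed rows, each a list of cell strings
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text outside a cell (for example whitespace between tags) is ignored.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def parse_games(html):
    """Return one dict per game, with integer scores and a home margin."""
    parser = TableParser()
    parser.feed(html)
    header, *body = parser.rows  # first row holds the column names
    games = []
    for cells in body:
        game = dict(zip(header, cells))
        game["Away PTS"] = int(game["Away PTS"])
        game["Home PTS"] = int(game["Home PTS"])
        # Positive margin means the home team won by that many points.
        game["home_margin"] = game["Home PTS"] - game["Away PTS"]
        games.append(game)
    return games


games = parse_games(SAMPLE_HTML)
for game in games:
    print(game["Date"], game["Home"], game["home_margin"])
```

With BeautifulSoup installed, the same rows could be collected with far less code; the cleaning step (type conversion and the derived margin) would stay essentially the same.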
The machine learning models are developed using scikit-learn. The models are trained on historical NBA game data to predict future game spreads.
- model/predict/nba-predict-spread.ipynb: Notebook for model development and training.
- model/predict/nba-predict-v2.ipynb: Updated model training notebook.
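The general workflow (scale the features, fit a regressor, save both artifacts for the app) might look like the sketch below. Everything specific here is an assumption: the data is synthetic, the three features are made up, and `Ridge` is only a placeholder estimator, since the notebooks' actual model and feature set are not described in this README. The separate scaler and model objects mirror the `model_scaler.pkl` and `predict_model.pkl` files in the project structure; `pickle` is assumed from the file extension.

```python
import pickle

import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for historical games: 200 games, 3 made-up features.
X = rng.normal(size=(200, 3))
# Synthetic target: home margin as a noisy linear function of the features.
y = X @ np.array([4.0, -2.0, 1.0]) + rng.normal(scale=1.0, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Fit the scaler on training data only, then train on the scaled features.
scaler = StandardScaler().fit(X_train)
model = Ridge(alpha=1.0).fit(scaler.transform(X_train), y_train)

# R^2 on held-out games as a quick sanity check.
r2 = model.score(scaler.transform(X_test), y_test)
print(f"held-out R^2: {r2:.3f}")

# Serialize both artifacts; in-memory bytes here, files in a real project.
scaler_bytes = pickle.dumps(scaler)
model_bytes = pickle.dumps(model)

# Later (for example inside the web app): load both and predict one game.
loaded_scaler = pickle.loads(scaler_bytes)
loaded_model = pickle.loads(model_bytes)
new_game = np.array([[0.5, -1.0, 0.2]])
predicted_margin = loaded_model.predict(loaded_scaler.transform(new_game))[0]
print(f"predicted home margin: {predicted_margin:.1f}")
```

Because games are ordered in time, a chronological split (train on earlier seasons, evaluate on later ones) may give a more honest estimate than the random split used in this sketch.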
The web app is built using Streamlit, providing an interactive interface for users to input game details and view predictions.
- app/app.py: Main application file.
- app/predict_page.py: Script for the prediction page.
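One small piece of display logic a prediction page may need is turning a predicted margin into a familiar betting-style line. The helper below is a hypothetical sketch, not code from `predict_page.py`; it assumes the model outputs the home team's expected margin and follows the usual convention that the favorite is listed with a negative spread.

```python
def format_spread(home: str, away: str, predicted_home_margin: float) -> str:
    """Turn a predicted home margin into a betting-style line.

    The favorite is listed with a negative spread, so a predicted home
    win by 4.3 points is shown as "<home> -4.5".
    """
    # Round to the nearest half point, the usual granularity of spreads.
    line = round(predicted_home_margin * 2) / 2
    if line == 0:
        return f"{home} vs {away}: pick'em"
    favorite = home if line > 0 else away
    return f"{favorite} -{abs(line):.1f}"


print(format_spread("Lakers", "Celtics", 4.3))
print(format_spread("Lakers", "Celtics", -2.2))
```

In a Streamlit page, the returned string could be passed to something like `st.write` after the user submits the game details.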
- Model Improvement: Enhance the model accuracy by incorporating advanced machine learning techniques.
- Data Visualization: Add more visualizations to the web app to better represent the data and predictions.
- User Authentication: Implement user authentication for personalized predictions and history tracking.
This project is licensed under the MIT License. See the LICENSE file for details.