Autism Spectrum Disorder Prediction Using ML

Overview

This project aims to predict the risk of Autism Spectrum Disorder (ASD) using machine learning models. It involves multiple stages, including data preprocessing, model training, hyperparameter tuning, and deployment of the final model as a Streamlit application. The goal is to help identify potential signs of autism in children based on a questionnaire.

Project Files

app.py: The Streamlit application that allows users to input answers to a questionnaire and get a risk prediction for autism. The model used in this application is pre-trained and saved as trained_model.pkl.
main.py: The main script that includes data preprocessing, training, evaluation, and hyperparameter tuning of various machine learning models.
sql_questions.sql: Contains SQL queries used to perform data analysis on various aspects of the dataset, such as gender distribution and ethnicity among individuals with autism.
presentation.pdf: Presentation slides that summarize the project, including the overview of ASD, the models used, the average scores achieved, and future directions.
tableau.twb: Tableau workbook used to visualize and analyze data related to ASD screening.

Machine Learning Models

The models used in this project include:

Logistic Regression
Support Vector Classifier (SVC)
Naive Bayes (GaussianNB and MultinomialNB)
MLPClassifier (Neural Network)
SGDClassifier (Stochastic Gradient Descent)
KNeighborsClassifier (K-Nearest Neighbors)
Decision Tree Classifier
Random Forest Classifier (with hyperparameter tuning)
Gradient Boosting Classifier (with hyperparameter tuning)
LGBMClassifier
XGBoost Classifier
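
The list above notes that the Random Forest and Gradient Boosting classifiers were tuned. As a rough illustration of what such tuning can look like, here is a minimal sketch using scikit-learn's GridSearchCV. The search method, the parameter grid, and the synthetic data are all assumptions made for this example; the actual tuning code lives in main.py and may differ.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in data (assumption): NOT the project's questionnaire dataset.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Hypothetical search grid; the values tuned in main.py are not documented here.
param_grid = {
    "n_estimators": [50, 100],
    "max_depth": [3, None],
}

# Try every grid combination with 5-fold cross-validation, scored by ROC AUC.
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    scoring="roc_auc",
    cv=5,
)
search.fit(X, y)

best_params = search.best_params_  # combination with the highest mean ROC AUC
best_score = search.best_score_    # mean cross-validated ROC AUC of that combination
print(best_params, round(best_score, 4))
```

The same pattern would apply to GradientBoostingClassifier by swapping the estimator and the grid.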

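The models were evaluated using cross-validation, and the average ROC AUC score achieved was 0.9091. ROC AUC measures how well a model ranks positive cases above negative ones across all decision thresholds. It is not an accuracy figure, so a score of about 0.91 indicates strong separation between the classes rather than 90% of cases being classified correctly.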
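
To make the evaluation step concrete, the following sketch scores a few of the listed model types with 5-fold cross-validation and ROC AUC. It is an illustration under stated assumptions: the data is synthetic, the number of folds is a guess, and only three models are included, so the printed numbers are unrelated to the 0.9091 reported for the project.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption): NOT the project's questionnaire dataset.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# A small subset of the model types listed in this README.
models = {
    "LogisticRegression": LogisticRegression(max_iter=1000),
    "GaussianNB": GaussianNB(),
    "DecisionTreeClassifier": DecisionTreeClassifier(random_state=0),
}

mean_auc = {}
for name, model in models.items():
    # One ROC AUC value per fold; cv=5 is an assumed setting.
    scores = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
    mean_auc[name] = scores.mean()
    print(f"{name}: mean ROC AUC = {mean_auc[name]:.4f}")

# Average of the per-model means, analogous to a single summary score.
overall = sum(mean_auc.values()) / len(mean_auc)
print(f"Average across models: {overall:.4f}")
```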

Streamlit Application

The Streamlit app (app.py) allows users to interact with the model by answering questions related to social behavior, communication, and routine changes. Based on these answers, the app predicts the risk of autism and provides guidance on the next steps for assessment.

To run the app, use the following command:

streamlit run app.py

Make sure the pre-trained model (trained_model.pkl) is in the code directory before running the app.
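
For readers curious how a saved model can be turned into a risk estimate, here is a minimal sketch of the load-and-predict step. It assumes the model was saved with joblib (which is listed under Requirements) and that the questionnaire reduces to ten yes/no answers encoded as 0 or 1. The helper predict_risk and the stand-in model are hypothetical and are not taken from app.py.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.linear_model import LogisticRegression


def predict_risk(model, answers):
    """Return the predicted probability of the positive class for one set of answers."""
    features = np.asarray(answers, dtype=float).reshape(1, -1)
    return float(model.predict_proba(features)[0, 1])


# Stand-in model so the sketch runs on its own (assumption). In the project,
# the model would instead be loaded from the existing trained_model.pkl.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(200, 10))
y = (X.sum(axis=1) >= 6).astype(int)
model_path = os.path.join(tempfile.mkdtemp(), "trained_model.pkl")
joblib.dump(LogisticRegression(max_iter=1000).fit(X, y), model_path)

# Load the saved model and score one hypothetical set of answers.
model = joblib.load(model_path)
risk = predict_risk(model, [1, 1, 0, 1, 1, 1, 0, 1, 1, 1])
print(f"Estimated risk: {risk:.2f}")
```

In a Streamlit app, the answers list would typically be collected from input widgets before being passed to a function like this one.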

SQL Queries

The project also includes analysis performed using SQL, such as:

The number of users diagnosed with autism.
Gender distribution among individuals with autism.
The average age of individuals with autism versus those without.
Main ethnicities among individuals diagnosed with autism.
The link between family history of autism and diagnosis of ASD.

These queries help provide more insights into the dataset and support data-driven decision-making.
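
The kinds of queries listed above can be sketched with Python's built-in sqlite3 module. The table name screening, its columns, and the five rows below are invented for illustration only; the real queries and schema are in sql_questions.sql and may look different.

```python
import sqlite3

# In-memory database with a hypothetical schema and made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE screening ("
    "gender TEXT, age INTEGER, family_history INTEGER, asd INTEGER)"
)
rows = [
    ("m", 4, 1, 1),
    ("f", 6, 0, 1),
    ("m", 5, 0, 0),
    ("f", 9, 0, 0),
    ("m", 8, 1, 1),
]
conn.executemany("INSERT INTO screening VALUES (?, ?, ?, ?)", rows)

# Number of individuals flagged with ASD.
asd_count = conn.execute(
    "SELECT COUNT(*) FROM screening WHERE asd = 1"
).fetchone()[0]

# Gender distribution among individuals with ASD.
gender_counts = dict(conn.execute(
    "SELECT gender, COUNT(*) FROM screening WHERE asd = 1 GROUP BY gender"
).fetchall())

# Average age of individuals with ASD versus those without.
avg_age = dict(conn.execute(
    "SELECT asd, AVG(age) FROM screening GROUP BY asd"
).fetchall())

# Share of ASD cases with and without a family history of autism.
asd_rate_by_history = dict(conn.execute(
    "SELECT family_history, AVG(asd) FROM screening GROUP BY family_history"
).fetchall())

print(asd_count, gender_counts, avg_age, asd_rate_by_history)
conn.close()
```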

Data Visualization

The Tableau workbook (tableau.twb) contains visualizations that provide insights into the dataset, such as age distribution, gender distribution, and other factors related to autism diagnosis.

Presentation

The presentation.pdf file provides a summary of the project, including an introduction to ASD, the machine learning models used, the results achieved, and future steps. It serves as a comprehensive overview of the project's objectives and outcomes.

How to Use

Clone the repository to your local machine.
Ensure you have Python 3.7+ and the libraries listed under Requirements installed.
Run the Streamlit app (app.py) to interact with the prediction model.
Use the provided SQL queries to further analyze the dataset.
Explore the Tableau workbook for data visualizations.

Requirements

Python 3.7+
Streamlit
Joblib
Pandas
NumPy
scikit-learn
XGBoost
LightGBM
Tableau (for visualization)
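
A quick way to confirm that the Python packages above are available is to check whether each one can be found by the import system. The helper missing_packages below is a hypothetical convenience for this README, not part of the project code; the import names (for example sklearn for scikit-learn) are the standard ones for these libraries.

```python
from importlib.util import find_spec

# Display name -> import name for the Python packages listed above.
REQUIRED = {
    "Streamlit": "streamlit",
    "Joblib": "joblib",
    "Pandas": "pandas",
    "NumPy": "numpy",
    "scikit-learn": "sklearn",
    "XGBoost": "xgboost",
    "LightGBM": "lightgbm",
}


def missing_packages(required=None):
    """Return the display names of packages that cannot be found for import."""
    required = REQUIRED if required is None else required
    return [name for name, module in required.items() if find_spec(module) is None]


missing = missing_packages()
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required Python packages are installed.")
```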

Future Directions

The project can be further enhanced by:

Collecting more diverse datasets to improve model generalization.
Incorporating additional features such as genetic or medical history for better predictions.
Deploying the model as a web service for broader accessibility.

License

This project is licensed under the MIT License.
