This project analyzes the QSAR Biodegradation dataset with five classification models, each trained and evaluated using scikit-learn (`sklearn`) metrics:
- Logistic Regression
- K-Nearest Neighbors (KNN)
- Support Vector Classifier (SVC)
- Decision Tree Classifier
- Random Forest Classifier
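
The comparison of these five models can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the data is a synthetic stand-in produced by `make_classification` (only the sample count of 1055 mirrors the dataset), and the feature count, the 80/20 split, the scaling step, the hyperparameters, and the choice of accuracy and F1 as metrics are all assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Assumption: synthetic binary-classification data standing in for the
# real dataset; the feature count of 20 is arbitrary.
X, y = make_classification(n_samples=1055, n_features=20, random_state=0)

# Assumption: a stratified 80/20 train/test split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Distance- and margin-based models are wrapped in a scaling pipeline;
# tree-based models are left unscaled. Hyperparameters are illustrative.
models = {
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "SVC": make_pipeline(StandardScaler(), SVC()),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Fit each model and score it on the held-out test set.
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    results[name] = {
        "accuracy": accuracy_score(y_test, y_pred),
        "f1": f1_score(y_test, y_pred),
    }
    print(f"{name}: accuracy={results[name]['accuracy']:.3f}, "
          f"f1={results[name]['f1']:.3f}")
```

Keeping the models in a dictionary means every classifier is scored on the same split with the same metrics, so the numbers are directly comparable.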
The QSAR Biodegradation dataset was built by the Milano Chemometrics and QSAR Research Group. The data have been used to develop QSAR (Quantitative Structure-Activity Relationship) models for studying the relationship between the chemical structure of molecules and their biodegradation. Experimental biodegradation values for 1055 chemicals were collected from the webpage of the National Institute of Technology and Evaluation of Japan (NITE).