Implement Logistic Regression Model for Amazon Order Status Prediction #1266

Closed
wants to merge 1 commit into from

Conversation

Salma-Mamdoh

Have you read the Contributing Guidelines?

Yes

Description

This pull request implements a machine learning pipeline that predicts the status of Amazon orders using logistic regression. The main changes and additions are as follows:

  1. Data Loading and Initial Inspection:
     • Loaded the Amazon order dataset using pandas.
     • Displayed the first few rows to understand the dataset structure.

  2. Data Preprocessing:
     • Identified categorical columns and converted them to the 'category' data type.
     • Implemented a pipeline to handle missing values and one-hot encode categorical features.
     • Standardized numerical features using a separate pipeline.

  3. Feature Engineering:
     • Defined the feature set (X) by excluding the target variable (Status).
     • Split the dataset into training and testing sets (80% train, 20% test).

  4. Model Building:
     • Created a preprocessing pipeline combining categorical and numerical transformers.
     • Implemented a logistic regression model with an increased iteration limit (max_iter=10000).
     • Combined the preprocessing pipeline and logistic regression model into a single pipeline.

  5. Model Training and Evaluation:
     • Trained the logistic regression model on the training set.
     • Predicted the order statuses on the test set.
     • Evaluated the model using accuracy score, classification report, and confusion matrix.

  6. Cross-Validation:
     • Implemented K-Fold cross-validation (5 folds) to assess the model's robustness.
     • Calculated and reported cross-validation scores and mean accuracy.
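
For reviewers, steps 2 through 6 above can be sketched roughly as follows. This is a minimal illustration, not the script submitted in this PR: the synthetic DataFrame, every column name other than Status, the imputation strategies, and the random_state values are assumptions chosen so the snippet runs on its own.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import KFold, cross_val_score, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in for the Amazon order data (assumption: the real
# dataset has different columns; only "Status" comes from the PR text).
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "Category": rng.choice(["Set", "Kurta", "Top"], size=n),
    "Fulfilment": rng.choice(["Amazon", "Merchant"], size=n),
    "Qty": rng.integers(0, 4, size=n).astype(float),
    "Amount": rng.normal(600.0, 150.0, size=n),
})
df["Status"] = np.where(df["Qty"] > 0, "Shipped", "Cancelled")
df.loc[::25, "Amount"] = np.nan  # a few missing values to exercise imputation

categorical_cols = ["Category", "Fulfilment"]  # hypothetical column lists
numeric_cols = ["Qty", "Amount"]

# Feature set X excludes the target; 80/20 train/test split.
X = df.drop(columns="Status")
y = df["Status"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Separate pipelines for categorical and numerical features.
categorical_pipe = Pipeline([
    ("impute", SimpleImputer(strategy="most_frequent")),
    ("onehot", OneHotEncoder(handle_unknown="ignore")),
])
numeric_pipe = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])
preprocessor = ColumnTransformer([
    ("cat", categorical_pipe, categorical_cols),
    ("num", numeric_pipe, numeric_cols),
])

# Preprocessing and logistic regression combined into a single pipeline.
model = Pipeline([
    ("preprocess", preprocessor),
    ("classifier", LogisticRegression(max_iter=10000)),
])
model.fit(X_train, y_train)

# Evaluation on the held-out test set.
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
labels = model.named_steps["classifier"].classes_
cm = confusion_matrix(y_test, y_pred, labels=labels)
print(f"Test accuracy: {accuracy:.4f}")
print(classification_report(y_test, y_pred, zero_division=0))
print(cm)

# 5-fold cross-validation; the whole pipeline is refit inside each fold,
# so the imputers and scaler never see that fold's validation rows.
cv = KFold(n_splits=5, shuffle=True, random_state=42)
cv_scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")
print("CV scores:", cv_scores, "mean:", cv_scores.mean())
```

With the real dataset, df would presumably come from pd.read_csv and the two column lists from the actual schema. Keeping preprocessing inside the pipeline is what lets cross_val_score evaluate it without leaking test-fold statistics.
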
Results:

  • The model achieved an accuracy of 97.22% on the test set.
  • The detailed classification report and confusion matrix provided insight into the model's performance across the different statuses.
  • Cross-validation confirmed the model's robustness with a mean accuracy of 97.21%.

Future Enhancements:

  • Explore other machine learning algorithms to further improve accuracy.
  • Incorporate additional features or data sources.
  • Implement advanced techniques to handle imbalanced classes.
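
As one possible starting point for the imbalanced-class item, the sketch below compares an unweighted logistic regression with class_weight="balanced" under stratified 5-fold cross-validation. It is an illustration only: the data comes from make_classification (an assumption, not the Amazon dataset), and the choice of metrics is a suggestion, since plain accuracy can look high when one status dominates.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_validate

# Synthetic imbalanced data: roughly 95% of samples in one class (assumption).
X_imb, y_imb = make_classification(
    n_samples=1000,
    n_features=8,
    n_informative=4,
    weights=[0.95, 0.05],
    random_state=0,
)

# Stratified folds keep the class ratio roughly constant in every split.
cv_strat = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
metrics = ["accuracy", "balanced_accuracy", "f1_macro"]

results = {}
for name, class_weight in [("unweighted", None), ("balanced", "balanced")]:
    clf = LogisticRegression(max_iter=10000, class_weight=class_weight)
    scores = cross_validate(clf, X_imb, y_imb, cv=cv_strat, scoring=metrics)
    # Mean of each metric over the five folds.
    results[name] = {m: scores[f"test_{m}"].mean() for m in metrics}

for name, summary in results.items():
    print(name, {m: round(v, 4) for m, v in summary.items()})
```

Reporting balanced accuracy or macro F1 next to accuracy makes it easier to see whether the minority statuses are actually being predicted.
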

Files Modified:

  • Added the script for loading, preprocessing, training, and evaluating the model.

Please review the changes and provide feedback or approval for merging into the main branch.

Fixes #1229

Checklist

  • [T] I've read the contribution guidelines.
  • [T] I've checked the issue list before deciding what to submit.
  • [T] I've edited the README.md and link to my code.

Related Issues or Pull Requests

Fixes #1229

github-actions bot commented Jul 1, 2024

@Salma-Mamdoh

It's great having you contribute to this project.

Thank you for opening a Pull Request 🙌. Welcome to Project Guidance 💖. We will review everything and get back to you :)

@github-actions github-actions bot requested a review from Kushal997-das July 1, 2024 14:18
Owner

@Kushal997-das Kushal997-das left a comment

@Salma-Mamdoh Hi, please make the following changes:

  1. Add the project to the basic folder.
  2. Include your project under this link.
  3. Add a few project output pictures to the readme.md file.

@Kushal997-das Kushal997-das added the labels gssoc (This level is for GSSOC), Changes-required (Little bit changes required.), and level1 (Under level 1) on Jul 18, 2024
@Kushal997-das Kushal997-das added the labels Deadline-over. and No_Update (It's been so long you are not responding to this issue so we are going to close the issue soon.) and removed the labels Changes-required (Little bit changes required.) and level1 (Under level 1) on Jul 26, 2024
Labels
Deadline-over., gssoc (This level is for GSSOC), No_Update (It's been so long you are not responding to this issue so we are going to close the issue soon.)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add Amazon Sales Prediction Model To Medium Machine learning and Data Science
2 participants