The Industrial Copper Modeling project aims to address challenges in the copper industry related to sales and pricing. The project involves the development of machine learning models for predicting selling prices and classifying leads as either successful (WON) or unsuccessful (LOST).
- Python Scripting
- Data Preprocessing
- Exploratory Data Analysis (EDA)
- Machine Learning Regression
- Machine Learning Classification
- Streamlit for Web Application Development
Sales and pricing data in the copper industry is often skewed and noisy, which reduces the accuracy of manual price predictions. The project automates pricing decisions with machine learning regression models. It also develops a lead classification model that classifies each lead by its likelihood of conversion.
- Exploratory Data Analysis (EDA):
Explore skewness and outliers in the dataset. Visualize key insights from the data.
- Data Preprocessing:
Transform the data into a suitable format. Address skewness and outliers through normalization and outlier detection. Perform necessary cleaning steps.
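
The skewness and outlier handling described above can be sketched as follows. This is a minimal illustration, not the project's actual code: it assumes a log(1 + x) transform for right-skewed columns and IQR-based clipping for outliers, and the `quantity` values are hypothetical. It uses only the Python standard library so it runs without extra dependencies; a real pipeline would more likely apply the same ideas to pandas columns.

```python
import math
import statistics


def iqr_clip(values, k=1.5):
    """Clip values to the range [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [min(max(v, low), high) for v in values]


def log_transform(values):
    """Apply log(1 + x) to compress a right-skewed, non-negative column."""
    return [math.log1p(v) for v in values]


# Hypothetical column with one extreme outlier (not taken from the dataset).
quantity = [10.0, 12.0, 11.0, 13.0, 12.0, 11.0, 10.0, 5000.0]

clipped = iqr_clip(quantity)          # the outlier is pulled back to the upper fence
transformed = log_transform(clipped)  # remaining skew is compressed
```

Clipping before the log transform keeps every row in the dataset; dropping the flagged rows instead is an equally common choice when outliers are known to be data-entry errors.
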
- Machine Learning Regression:
Develop a regression model to predict the continuous variable 'Selling_Price'. Apply data normalization and feature scaling to the inputs before training.
- Machine Learning Classification:
Build a classification model to predict the status (WON or LOST) of leads. Use the 'STATUS' variable for training, considering 'WON' as success and 'LOST' as failure. Remove data points with STATUS values other than 'WON' or 'LOST'.
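
The label preparation described above can be illustrated with a short sketch. The rows and the extra status value (`"DRAFT"`) are hypothetical, and encoding WON as 1 and LOST as 0 is an assumed convention; only the 'STATUS' column name and the WON/LOST values come from the description.

```python
def prepare_labels(rows):
    """Keep only WON/LOST rows and encode the target as 1 (WON) or 0 (LOST)."""
    kept = [row for row in rows if row.get("STATUS") in ("WON", "LOST")]
    labels = [1 if row["STATUS"] == "WON" else 0 for row in kept]
    return kept, labels


# Hypothetical leads; "DRAFT" stands in for any status other than WON or LOST.
leads = [
    {"id": 1, "STATUS": "WON"},
    {"id": 2, "STATUS": "LOST"},
    {"id": 3, "STATUS": "DRAFT"},
    {"id": 4, "STATUS": "WON"},
]

kept, labels = prepare_labels(leads)
```

The resulting `labels` list can then serve as the binary target for whichever classifier is trained on the remaining rows.
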
- Streamlit Web Application:
Create a Streamlit web application for interacting with the models. Users enter a value for each input column, and the application predicts either the Selling_Price or the lead STATUS.