UofT-DSI | Scaling to production - Assignment 2 #53
What changes are you trying to make? (e.g. Adding or removing code, refactoring existing code, adding reports)
I focused on implementing a complete machine learning pipeline using the Adult dataset. I aimed to construct a robust preprocessing workflow, build a model pipeline with parameter tuning, evaluate its performance using cross-validation metrics, and assess its predictive power on the test dataset.
What did you learn from the changes you have made?
Was there another approach you were thinking about making? If so, what approach(es) were you thinking of?
I thought about incorporating feature selection techniques to improve model efficiency and interpretability.
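As a rough illustration of what that alternative could look like, here is a minimal sketch that assumes scikit-learn and uses synthetic data from `make_classification` rather than the Adult dataset. The choice of `SelectKBest` with `f_classif`, the candidate values of `k`, and the logistic regression model are all assumptions made for the example, not something taken from the notebook.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

# Synthetic stand-in data (assumption): 20 features, only 5 of them informative.
X, y = make_classification(
    n_samples=200, n_features=20, n_informative=5, random_state=0
)

# Feature selection lives inside the pipeline, so it is refit on each
# training fold and never sees the held-out fold.
pipe = Pipeline([
    ("select", SelectKBest(score_func=f_classif)),
    ("model", LogisticRegression(max_iter=1000)),
])

# Treat the number of kept features as a hyperparameter (candidate values
# are illustrative only).
search = GridSearchCV(pipe, {"select__k": [5, 10, 20]}, cv=5, scoring="accuracy")
search.fit(X, y)

best_k = search.best_params_["select__k"]
selected_mask = search.best_estimator_.named_steps["select"].get_support()
print("best k:", best_k)
print("kept feature indices:", selected_mask.nonzero()[0])
```

Tuning `k` alongside the model keeps the selection step honest under cross-validation, and the boolean mask from `get_support()` shows which columns were retained, which helps with interpretability.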
Were there any challenges? If so, what issue(s) did you face? How did you overcome it?
One challenge was ensuring the pipeline's robustness across different datasets or scenarios, particularly in handling missing data and categorical variables with varying cardinality. I addressed this by experimenting with different preprocessing strategies and validating the pipeline's performance across multiple folds in cross-validation.
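The approach described above might be sketched as follows, assuming scikit-learn and pandas. The toy frame, its column names (`age`, `hours_per_week`, `workclass`), the imputation strategies, and the classifier are hypothetical stand-ins, not the actual Adult data or the notebook's settings.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_validate
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data standing in for the Adult dataset.
rng = np.random.default_rng(0)
n = 60
X = pd.DataFrame({
    "age": rng.integers(18, 70, size=n).astype(float),
    "hours_per_week": rng.integers(10, 60, size=n).astype(float),
    "workclass": pd.Series(rng.choice(["Private", "Self-emp", "Gov"], size=n), dtype=object),
})
# Toy target: median split on age, computed before missing values are injected.
y = (X["age"] > X["age"].median()).astype(int)
X.loc[::7, "age"] = np.nan        # simulate missing numeric values
X.loc[::9, "workclass"] = np.nan  # simulate missing categorical values

numeric = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])
categorical = Pipeline([
    ("impute", SimpleImputer(strategy="most_frequent")),
    # Categories unseen during fitting are encoded as all zeros instead of raising.
    ("encode", OneHotEncoder(handle_unknown="ignore")),
])
preprocess = ColumnTransformer([
    ("num", numeric, ["age", "hours_per_week"]),
    ("cat", categorical, ["workclass"]),
])
pipe = Pipeline([
    ("preprocess", preprocess),
    ("model", LogisticRegression(max_iter=1000)),
])

# Imputation, scaling, and encoding are refit on each training fold.
scores = cross_validate(pipe, X, y, cv=5, scoring=["accuracy", "roc_auc"])
print(scores["test_accuracy"].mean(), scores["test_roc_auc"].mean())
```

Keeping the imputers and encoder inside the pipeline means each fold learns its own fill values and category set, so the cross-validation scores are not inflated by information from the held-out fold.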
How were these changes tested?
The changes were tested by running the notebook and verifying the output at each step.
A reference to a related issue in your repository (if applicable)
N/A
Checklist