UofT-DSI | Scaling to production - Assignment 2 #53

movcha · 2024-07-13T03:35:56Z

What changes are you trying to make? (e.g. Adding or removing code, refactoring existing code, adding reports)

I focused on implementing a complete machine learning pipeline using the Adult dataset. I aimed to construct a robust preprocessing workflow, build a model pipeline with parameter tuning, evaluate its performance using cross-validation metrics, and assess its predictive power on the test dataset.

What did you learn from the changes you have made?

Preprocessing techniques for numerical and categorical data, including imputation and scaling.
Constructing pipelines in scikit-learn to streamline model training and evaluation.
The significance of cross-validation in assessing model performance and tuning hyperparameters effectively.
The importance of setting a random state for reproducibility in machine learning experiments.
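The points above can be sketched as a single scikit-learn pipeline. This is a minimal illustration, not the notebook's actual code: the data is a small synthetic stand-in for the Adult dataset, the column names are hypothetical, and `RandomForestClassifier` with this parameter grid is an assumed choice, since the PR does not say which model or grid was used.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in for the Adult data (hypothetical columns and values).
rng = np.random.default_rng(42)
n = 200
X = pd.DataFrame({
    "age": rng.integers(18, 70, n).astype(float),
    "hours_per_week": rng.integers(10, 60, n).astype(float),
    "workclass": rng.choice(["Private", "State-gov", "Self-emp"], n),
})
X.loc[::15, "age"] = np.nan  # inject some missing numeric values
y = (X["hours_per_week"] > 40).astype(int)

numeric = ["age", "hours_per_week"]
categorical = ["workclass"]

# Numeric columns: median imputation, then scaling.
# Categorical columns: most-frequent imputation, then one-hot encoding.
preprocess = ColumnTransformer([
    ("num", Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ]), numeric),
    ("cat", Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),
        ("onehot", OneHotEncoder(handle_unknown="ignore")),
    ]), categorical),
])

# Assumed model; random_state is fixed so repeated runs give the same result.
model = Pipeline([
    ("preprocess", preprocess),
    ("clf", RandomForestClassifier(random_state=42)),
])

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# 5-fold cross-validation over a small, illustrative parameter grid.
search = GridSearchCV(
    model,
    param_grid={"clf__n_estimators": [25, 50], "clf__max_depth": [3, None]},
    cv=5,
    scoring="accuracy",
)
search.fit(X_train, y_train)

# Score the refit best estimator once on the held-out test split.
test_accuracy = search.score(X_test, y_test)
print(search.best_params_, round(test_accuracy, 3))
```

Because the imputers, scaler, and encoder live inside the pipeline, they are refit on the training folds only during cross-validation, which keeps validation data from leaking into the preprocessing statistics.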

Was there another approach you were thinking about making? If so, what approach(es) were you thinking of?

I thought about incorporating feature selection techniques to improve model efficiency and interpretability.
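One way this could look is a selection step placed inside the pipeline, so that features are chosen on the training folds only. The sketch below is an assumption about the approach, not something the PR implemented: it uses `SelectKBest` with the ANOVA F-score on synthetic data, and `k=5` and `LogisticRegression` are arbitrary illustrative choices.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data: 20 features, only 4 of which carry signal.
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=4,
    n_redundant=0, random_state=42,
)

pipe = Pipeline([
    ("scale", StandardScaler()),
    ("select", SelectKBest(score_func=f_classif, k=5)),  # keep the 5 top-scoring features
    ("clf", LogisticRegression(max_iter=1000)),
])

# Selection is refit inside each fold, so scores are not optimistically biased.
scores = cross_val_score(pipe, X, y, cv=5)

# Fit on all data to inspect which feature indices were kept.
pipe.fit(X, y)
kept = pipe.named_steps["select"].get_support(indices=True)
print(kept, scores.mean())
```

The value of `k` could itself be tuned by adding `select__k` to a grid search.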

Were there any challenges? If so, what issue(s) did you face? How did you overcome it?

One challenge was ensuring the pipeline's robustness across different datasets or scenarios, particularly in handling missing data and categorical variables with varying cardinality. I addressed this by experimenting with different preprocessing strategies and validating the pipeline's performance across multiple folds in cross-validation.

How were these changes tested?

The changes were tested by running the notebook and verifying the output at each step.

A reference to a related issue in your repository (if applicable)

N/A

Checklist

I can confirm that my changes are working as intended

UofT-DSI | Scaling to production - Assignment 1

github-actions · 2024-07-13T03:36:07Z

Hello, thank you for your contribution. If you are a participant, please close this pull request and open it in your own forked repository instead of here. Please read the instructions on your onboarding Assignment Submission Guide more carefully. If you are not a participant, please give us up to 72 hours to review your PR. Alternatively, you can reach out to us directly to expedite the review process.

“movcha” and others added 7 commits July 11, 2024 00:27

Assignment 1

2820543

worked on lab 1&2

45d2692

Merge branch 'main' into assignment-1

171da31

updated assignment-1

6026a08

Merge pull request #1 from movcha/assignment-1

0866097

UofT-DSI | Scaling to production - Assignment 1

files for main

f08f189

final assignment_2

4e5bac2

movcha closed this Jul 13, 2024
