Complete implementation of assignment_2 model pipeline and metrics #79
What changes are you trying to make? (e.g. Adding or removing code, refactoring existing code, adding reports)
I am adding performance metric calculations for the test data, including negative log loss, ROC AUC, accuracy, and balanced accuracy. I am also restructuring the code to display fold-level results in a more readable format, sorted by negative log loss.
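For concreteness, here is a minimal sketch of the kind of test-set metric calculation involved; the dataset and classifier below are toy stand-ins, not the actual pipeline in this PR:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, balanced_accuracy_score, log_loss, roc_auc_score
from sklearn.model_selection import train_test_split

# Toy stand-ins for the actual model and data used in this PR.
X, y = make_classification(n_samples=500, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

proba = model.predict_proba(X_test)   # shape (n_samples, 2) for a binary target
preds = model.predict(X_test)

test_metrics = {
    "neg_log_loss": -log_loss(y_test, proba),        # negated to match sklearn's scoring convention
    "roc_auc": roc_auc_score(y_test, proba[:, 1]),   # AUC needs the positive-class probabilities only
    "accuracy": accuracy_score(y_test, preds),
    "balanced_accuracy": balanced_accuracy_score(y_test, preds),
}
print(test_metrics)
```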
What did you learn from the changes you have made?
From these changes, I learned how to use cross_validate to extract and evaluate fold-level metrics across the different folds, and I gained a better understanding of how to compute model-performance metrics on test data.
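A hedged sketch of how fold-level scores can be pulled out of scikit-learn's cross_validate and sorted; the dataset and estimator here are again toy stand-ins, not the ones used in this PR:

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_validate

# Toy stand-ins for the actual model and training data in this PR.
X, y = make_classification(n_samples=500, random_state=0)
model = LogisticRegression(max_iter=1000)

scoring = ["neg_log_loss", "roc_auc", "accuracy", "balanced_accuracy"]
cv_results = cross_validate(model, X, y, cv=5, scoring=scoring)

# cross_validate returns one array per scorer under the "test_<metric>" keys,
# so collecting them into a DataFrame gives one row per fold.
fold_scores = pd.DataFrame({m: cv_results[f"test_{m}"] for m in scoring})
print(fold_scores.sort_values("neg_log_loss", ascending=False))
```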
Was there another approach you were thinking about making? If so, what approach(es) were you thinking of?
Yes, I considered using a separate function to handle the metric calculations and display the results for both training and testing, which would make the code more modular and reusable. That approach could improve readability, especially as the number of metrics grows.
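A rough sketch of what that modular approach could look like; the `report_metrics` name and signature are hypothetical, not code that exists in this PR:

```python
from sklearn.metrics import accuracy_score, balanced_accuracy_score, log_loss, roc_auc_score

def report_metrics(model, X, y, label):
    """Compute and print the four metrics for one split (train or test).

    Illustrative only: a hypothetical helper, assuming `model` is a fitted
    binary classifier with predict and predict_proba methods.
    """
    proba = model.predict_proba(X)
    preds = model.predict(X)
    metrics = {
        "neg_log_loss": -log_loss(y, proba),
        "roc_auc": roc_auc_score(y, proba[:, 1]),
        "accuracy": accuracy_score(y, preds),
        "balanced_accuracy": balanced_accuracy_score(y, preds),
    }
    print(f"--- {label} ---")
    for name, value in metrics.items():
        print(f"{name}: {value:.4f}")
    return metrics

# The same helper would then serve both splits, e.g.:
# report_metrics(model, X_train, y_train, "train")
# report_metrics(model, X_test, y_test, "test")
```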
Were there any challenges? If so, what issue(s) did you face? How did you overcome it?
Yes, I faced some challenges with the cross_validate function, especially in sorting its results and extracting fold-level metrics directly. There were also issues with handling the prediction-probability arrays for metrics like roc_auc and log_loss. I addressed these by adjusting how the predict_proba output is passed to the metric calculations and by breaking them down step by step.
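To illustrate the probability-array issue with a tiny hand-made example (not data from this PR): predict_proba returns one column per class, which log_loss accepts directly, while roc_auc_score for a binary target expects only the positive-class column.

```python
import numpy as np
from sklearn.metrics import log_loss, roc_auc_score

# Hand-made example of the shape issue (not data from this PR).
y_true = np.array([0, 1, 1, 0])
proba = np.array([[0.9, 0.1],   # predict_proba-style output: one column per class
                  [0.2, 0.8],
                  [0.3, 0.7],
                  [0.6, 0.4]])

print(log_loss(y_true, proba))              # accepts the full (n_samples, n_classes) matrix
print(roc_auc_score(y_true, proba[:, 1]))   # needs only the positive-class column
```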
How were these changes tested?
These changes were tested by running the pipeline on a sample of the test data and checking that each metric calculation matched the expected output. I also inspected the output of each intermediate step to confirm that the metrics align with known reference values for the model's performance.
A reference to a related issue in your repository (if applicable)
Refer to sections 3a, 3b, and 4 in the lab materials within the repository.
Checklist
I can confirm that my changes are working as intended
Code is free of syntax errors and runs without issues
Metrics display correctly for each fold and are sorted by neg_log_loss for clarity
Code is modular and well-documented for easy future adjustments
Output is formatted to improve readability and analysis