Submission: 5: Predicting Breast Cancer With Multiple Classification Algorithms #5
Data analysis review checklist
Reviewer: <GITHUB_USERNAME>
Conflict of interest
Code of Conduct
General checks
Documentation
Code quality
Reproducibility
Analysis report
Estimated hours spent reviewing: 2.5
Review Comments:
Please provide more detailed feedback here on what was done particularly well, and what could be improved. It is especially important to elaborate on items that you were not able to check off in the list above.
Overall, they clearly put a lot of work into the project. I'm especially impressed by the reliable workflow they created.
Attribution
This was derived from the JOSE review checklist and the ROpenSci review checklist.
Data analysis review checklist
Reviewer: @poddarswakhar
Conflict of interest
Code of Conduct
General checks
Documentation
Code quality
Reproducibility
Analysis report
Estimated hours spent reviewing: 1.2 Hours
Review Comments:
Please provide more detailed feedback here on what was done particularly well, and what could be improved. It is especially important to elaborate on items that you were not able to check off in the list above.
1.) Style guidelines: I believe there is room for improvement in how the script files follow the style guidelines. More specifically, brief comments explaining what each chunk of code does would make the code easier to understand and follow.
2.) Data: I couldn't find the source of the raw data. The Makefile appears to read the CSV from the directory and then run the analysis, which may not be fully transparent to some readers.
3.) Authors: I couldn't find the authors in the analysis file, so I couldn't check that box.
4.) Overall, well done. I really loved the analysis and the methodology used, and I loved the use of pipelines, which made some of the code simpler and avoided redundancy.
Attribution
This was derived from the JOSE review checklist and the ROpenSci review checklist.
Data analysis review checklist
Reviewer:
Conflict of interest
Code of Conduct
General checks
Documentation
Code quality
Reproducibility
Analysis report
Estimated hours spent reviewing: 1.5
Review Comments:
Please provide more detailed feedback here on what was done particularly well, and what could be improved. It is especially important to elaborate on items that you were not able to check off in the list above.
- Overall, great job! Your report was interesting to read and easy to follow.
Data analysis review checklist
Reviewer: nkoda
Conflict of interest
Code of Conduct
General checks
Documentation
Code quality
Reproducibility
Analysis report
Estimated hours spent reviewing: 1.5 Hours
Review Comments:
Please provide more detailed feedback here on what was done particularly well, and what could be improved. It is especially important to elaborate on items that you were not able to check off in the list above.
Attribution
This was derived from the JOSE review checklist and the ROpenSci review checklist.
Submitting authors: @edile47 @clichyclin @nhantien @ClaudioETC
Repository: https://github.com/DSCI-310/DSCI-310-Group-5
Abstract/executive summary:
The project seeks to solve the prediction problem of distinguishing benign from malignant tumors, which comes from the question "Is there a way to efficiently classify whether a tumor is malignant or benign with high accuracy, given a set of different features observed from the tumor in its development stage?". We addressed this problem using a predictive model. Our initial hypothesis was that it is possible to do so, yet the model would have a high error rate due to variations in tumor features. After performing EDA, including summary statistics, data cleaning and visualization, we were able to spot some clear distinctions between benign and malignant tumors in some features. We then tested multiple classification models and arrived at a K-Nearest-Neighbor model with tuned hyperparameters that achieved very good accuracy, recall, precision and F1 score.
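For readers less familiar with the four metrics named in the abstract, here is a minimal sketch of how they relate to each other for a binary benign/malignant problem. It is not taken from the project's code: the function name `classification_metrics`, the label encoding ("M" for malignant, "B" for benign) and the example labels are all assumptions made up for illustration, and the numbers are not the project's results.

```python
def classification_metrics(y_true, y_pred, positive="M"):
    """Compute accuracy, precision, recall and F1 for a binary problem.

    `positive` is the label treated as the positive class; "M" (malignant)
    is an assumed encoding, not necessarily the one used in the project.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    # Count the cells of the confusion matrix.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    # Guard every ratio against a zero denominator.
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Made-up labels for illustration only.
y_true = ["M", "M", "M", "B", "B", "B", "B", "B"]
y_pred = ["M", "M", "B", "B", "B", "B", "M", "B"]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

In a tumor-screening setting, recall on the malignant class is usually the metric to watch most closely, since a false negative (a malignant tumor predicted as benign) is typically the costlier error.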
Editor: @ttimbers
Reviewer: @TimothyZG @hmartin11 @poddarswakhar @nkoda