Capstone Project for Data Science
Our final project asks you to apply your skills to a business problem of your choice. The capstone is an opportunity for you to demonstrate your new skills and tackle a pressing issue relevant to your team, division, or organization. You’ll form a hypothesis, analyze internal data, and produce a working model, prototype, solution, or recommendation.
You will get structured guidance and designated time to work throughout the course. Final project deliverables include:
- Part 1: Project Proposal/Data: Select a data set and fill out the notebook provided in this repository. The goal of this portion is to get approval from the instructional team for your data. If you have any questions regarding the viability of your data, please reach out. If the team decides your dataset is outside the bounds of a sensible capstone, you will need to redo this part. See the included notebook for what a good data set looks like.
- Part 2: Exploratory Data Analysis (EDA): Share a summary of your initial analysis. Your notebook should contain a thorough analysis and description of your data as well as relevant graphics. The goal is for the reader to understand the shape and nature of the data you are working with.
- Part 3: Modeling/Results: Submit a cleanly formatted Jupyter notebook (or other files) documenting your code, processes, model, and results. The format of this notebook should resemble a medium-length memo.
- Part 4: Presentation: Present a summary of your business problem, approach, and recommendation to an audience of non-technical executive stakeholders.
Your project is meant to resemble an actual client deliverable. Your audience is a client. Parts 2 and 3 should be clean notebooks that you would feel comfortable presenting to a client. Part 4, the presentation, should resemble an actual client meeting in which you present your findings.
"Everyone is your client." -Dr. Albert Lee
- Check out our requirements doc for a detailed walkthrough of our final project deliverable requirements.
- If you're looking for final project ideas, Kaggle is an excellent resource for free and open data. As always, you may not use work data for this project! It is also wise not to use anything related to your current work projects.
For due dates, please see the Due Dates section of your Course Info repository.
Part 1 is on a 0/1 scale (it is complete or it is not). For all other deliverables (i.e., Parts 2, 3, and 4 above), requirements will be evaluated on a simple point scale of 0, 1, or 2, with a 3 available for bonus objectives. Additionally, instructors will provide you with feedback on required portions of your project.
Score | Expectations |
---|---|
0 | Incomplete. |
1 | Does not meet expectations. |
2 | Meets expectations, good job! |
3 | Surpasses our wildest expectations! |
Note: Scores of `2` mean that a requirement has been completely fulfilled, while `3` is typically reserved for bonus objectives.
Your final project is therefore out of a total of 7 points. A 5 or higher is required for passing this assessment, and passing this assessment is required for passing this course. Do not count on receiving 3s, as they are given subjectively to above-and-beyond analyses.
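The arithmetic behind the 7-point total and the passing mark of 5 can be sketched as below. This is only an illustration: the function and constant names are hypothetical, and it assumes the part scores are simply summed. How a bonus score of 3 counts toward the total is not spelled out here, so the sketch just adds it in.

```python
# Hypothetical names; the point values come from the scoring rules above.
PASSING_TOTAL = 5
# Part 1 is worth 1 point; Parts 2-4 are worth 2 each at "meets expectations".
NOMINAL_MAX = 1 + 2 + 2 + 2  # = 7


def total_score(part1: int, part2: int, part3: int, part4: int) -> int:
    """Sum the four part scores after checking each is on its allowed scale."""
    if part1 not in (0, 1):
        raise ValueError("Part 1 is scored 0 or 1")
    for score in (part2, part3, part4):
        if score not in (0, 1, 2, 3):
            raise ValueError("Parts 2-4 are scored 0, 1, 2, or 3")
    # Assumption: the final score is the plain sum of the part scores.
    return part1 + part2 + part3 + part4


def passes(total: int) -> bool:
    """A total of 5 or higher passes the assessment."""
    return total >= PASSING_TOTAL


if __name__ == "__main__":
    total = total_score(part1=1, part2=2, part3=1, part4=2)
    print(total, "of", NOMINAL_MAX, "passes:", passes(total))
```

For example, completing Part 1 and meeting expectations on two of the three remaining parts, with a 1 on the other, still yields a passing total under these assumptions.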
Parts 1, 2, and 3 are submitted the same way as our previous projects. You are to fork this repository, edit it, and push it to your own personal page.
- Part 1: You will fill out the supplied notebook in this repository.
- Part 2: You will begin from a blank notebook and include it in this repository.
- Part 3: You will begin from a blank notebook and include it in this repository.
- Part 4: Your presentation in the last week of class is your submission. You do not need to submit your slide deck.