comments and feedbacks #9

SaraMati · 2019-02-24T19:29:07Z

Thoughts from Shreejoy Tripathy February 23, 2019:

Here are some quick comments about the course syllabus you've outlined. Overall, I really like the idea of the course and would be happy to be involved, including as a co-instructor. Do you have an idea for the course title? Perhaps something like: "An introduction to data science"? In general, I'm even more convinced that as many scientists as possible should take a course like this.

I really like the headings under Programming, including basic python, data wrangling and tidying. One thing I would add is how to properly use and format spreadsheets.

For quantitative methods, I would need to see these spelled out more in terms of topics and lectures. I don't have a good sense for how many lectures you would devote to statistics, but I think it's very important. Probably at least 3-4. These would include topics like statistical philosophy, what a random variable is, distributions, etc. Also two-group comparisons (t-tests and non-parametric tests), regression, multivariate regression, possibly lasso, feature selection, and comparing models using ANOVA and AIC/BIC. I quite like this syllabus, and think it would be a good guideline for the statistics material: https://stat540-ubc.github.io/subpages/lectures.html

I think you should think very hard about who your prototypical student is. I personally think this course can be really great and effective with absolutely no time devoted to time series analysis. Obviously time series analysis is important, but I personally would prioritize the basics (so maybe 1 lecture only on time series). Similarly, [having] a class on basic data tidying, some plotting, and t-tests, and possibly ... with git/github ...

I absolutely LOVE the idea of the project. Essential for a course like this. In order to have the projects be effectively supervised/managed, it'll be important to get TAs (probably 1 TA for 3-4 projects) with regular update meetings.

Sean Hill, Feb 15 meeting

suggested we could also try offering the course through IMS, which spans many departments, and said he will support us in any way we need
offered the conference room on the 12th floor of the Krembil neuroinformatic institute as the location for the course
suggested one topic be "data modeling" or data schemas, based on https://github.com/BlueBrain/nexus/tree/master/src/main/paradox/docs/tutorial

Popovic, Jan 31 meeting
Other than full support, the main comment was to make sure we describe how the course is sustainable after the first round of instructors (us)

SaraMati · 2019-02-24T19:39:36Z

My comments on Shreejoy's email:
Regarding the spreadsheet lesson: I feel this is going backwards, unless he means having a lesson on how to properly store data in general. I had such content in mind under the tidy data title, along with how to clean up data, etc.

Regarding quantitative methods: he is not saying anything different from what we already had in mind. In general, I think we should include stats because our goal is for students to be able to produce a scientific report, and I think it is important to stress that a proper scientific conclusion should be based on proper stats. To be able to cover them all in the live participatory coding sessions, I think we can provide notes and resources about the concepts and focus on teaching the coding and applying it to examples. We should keep in mind that we are not a stats course or a machine learning course, so we are not there to teach the concepts, but how not to use them in a sloppy way: "how and where to apply which". Keeping our target population in mind helps: I'm doing this course for my first-year-grad me. We can think of students in CPIN (collaborative program in neuroscience), who come from different engineering fields, physiology, engineering science, psychology, etc. There are good statistics courses in all those departments, and in the end, one course can't teach all the methods they may need in the course projects. So the assumption is that they have either heard the concepts before or can read up on them on their own.

Well, collaborators such as postdocs with a physiology background can audit and won't need to do the project.
I don't insist on having time series, but overall we may benefit from reusing material from the Rcourse.

Regarding regular meetings with TAs: yes, we mentioned this in the end-of-year meeting for the Rcourse this year.

joelostblom · 2019-02-24T20:17:55Z

I haven't seen the email, but maybe you can include what we taught during the Python workshops last summer? So a more focused version of the data carpentry spreadsheet section covering what are good general data practices as you said and what spreadsheets are good for (e.g. data entry) and what their limits are (e.g. data analysis).

…
