comments and feedbacks #9

Open
SaraMati opened this issue Feb 24, 2019 · 2 comments

@SaraMati
Member

SaraMati commented Feb 24, 2019

Thoughts from Shreejoy Tripathy February 23, 2019:

Here are some quick comments about the course syllabus you've outlined. Overall, I really like the idea of the course and would be happy to be involved, including as a co-instructor. Do you have an idea for the course title? Perhaps something like: "An introduction to data science"? In general, I'm even more convinced that as many scientists as possible should take a course like this.

I really like the headings under Programming, including basic python, data wrangling and tidying. One thing I would add is how to properly use and format spreadsheets.

For quantitative methods, I would need to see these spelled out more in terms of topics and lectures. I don't have a good sense of how many lectures you would devote to statistics, but I think it's very important. Probably at least 3-4. These would include topics like statistical philosophy, what a random variable is, distributions, etc. Also two-group comparisons (t-tests and non-parametric tests), regression, multivariate regression, possibly lasso, feature selection, and comparing models using ANOVA and AIC/BIC. I quite like this syllabus, and think it would be a good guideline for the statistics material: https://stat540-ubc.github.io/subpages/lectures.html

I think you should think very hard about who your prototypical student is. I personally think this course can be really great and effective with absolutely no time devoted to time series analysis. Obviously time series analysis is important, but I personally would prioritize the basics (so maybe 1 lecture only on time series). Similarly, [having] a class on basic data tidying, some plotting, and t-tests, and possibly ... with git/github ...

I absolutely LOVE the idea of the project. Essential for a course like this. In order to have the projects be effectively supervised/managed, it'll be important to get TAs (probably 1 TA for 3-4 projects) with regular update meetings.

Sean Hill, Feb 15 meeting

  • suggested we could also try offering the course through IMS, which spans many departments, and said he will support us in any way we need
  • offered the conference room on the 12th floor at Krembil neuroinformatic institute as the location for the course
  • suggested one topic to be "data modeling" or data schemas based on https://github.com/BlueBrain/nexus/tree/master/src/main/paradox/docs/tutorial

Popovic, Jan 31 meeting
Other than full support, the main comment was to make sure we describe how the course stays sustainable after the first round of instructors (us).

@SaraMati
Member Author

SaraMati commented Feb 24, 2019

my comments to Shreejoy's email:
regarding the spreadsheet lesson: I feel this is going backwards, unless he means a lesson on how to properly store data in general. I had such content in mind under the tidy data title, along with how to clean up data, etc.

regarding quantitative methods: He is not saying anything different from what we already had in mind. In general, I think we should include stats because our goal is for the students to be able to produce a scientific report, and I think it is important to stress that a proper scientific conclusion should be based on proper stats. To be able to cover all of this in the live participatory coding sessions, I think we can provide notes and resources about the concepts and focus on teaching the coding and applying it to examples. We should keep in mind that we are not a stats course or a machine learning course, so we are not there to teach the concepts but how not to use them in a sloppy way: "how and where to apply which". Keeping our target population in mind helps: I'm doing this course for my first-year-grad self. We can think of students in CPIN (Collaborative Program in Neuroscience), who come from different engineering fields, physiology, engineering science, psychology, etc. There are good statistics courses in all those departments, and in the end, one course can't teach all the methods they may need in the course projects. So the assumption is that they have either heard the concepts before or can read up on them on their own.

Well, collaborators such as postdocs with a physiology background can audit and won't need to do the project.
I don't insist on having time series, but overall we may benefit from reusing material from the Rcourse.

regarding regular meetings with TAs: yes, we mentioned this in the end-of-year meeting for the Rcourse this year.

@joelostblom
Member

joelostblom commented Feb 24, 2019 via email

