Add few more parts to correlation lesson #3

Open
wants to merge 1 commit into
base: main
59 changes: 57 additions & 2 deletions lab01_correlation/lesson.yaml
@@ -25,7 +25,7 @@
Hint: Try copying the read.csv code from the text above.

- Class: cmd_question
Output: "Next, we shall load the gene expression dataset. rnaseq <- read_csv("data/pannets_expr_rnaseq.csv.gz")'"
Output: 'Next, we shall load the RNA-seq gene expression dataset. rnaseq <- read_csv("data/pannets_expr_rnaseq.csv.gz")'
CorrectAnswer: rnaseq <- read_csv("data/pannets_expr_rnaseq.csv.gz")
AnswerTests: omnitest(correctExpr = 'rnaseq <- read_csv("data/pannets_expr_rnaseq.csv.gz")')
Hint: Try copying the read.csv code from the text above.
@@ -36,7 +36,62 @@
CorrectAnswer: head(rnaseq)
AnswerTests: omnitest(correctVal = 'head(rnaseq)')
Hint: Check the output of 'head(rnaseq)'

- Class: cmd_question
Output: "Now reshape the RNA-seq expression data into long form with pivot_longer(). Set cols to -Gene, names_to to Tumour, and values_to to Expr, and store the result in rnaseq_long."
CorrectAnswer: rnaseq_long <- pivot_longer(rnaseq, cols = -Gene, names_to = "Tumour", values_to = "Expr")
AnswerTests: omnitest(correctExpr = 'rnaseq_long <- pivot_longer(rnaseq, cols = -Gene, names_to = "Tumour", values_to = "Expr")')

- Class: cmd_question
Output: "Use rnaseq_long to create a wide-form table with pivot_wider(). Set id_cols to Tumour, names_from to Gene, and values_from to Expr, and store the result in rnaseq_wide."
CorrectAnswer: rnaseq_wide <- pivot_wider(rnaseq_long, id_cols = Tumour, names_from = Gene, values_from = Expr)
AnswerTests: omnitest(correctExpr = 'rnaseq_wide <- pivot_wider(rnaseq_long, id_cols = Tumour, names_from = Gene, values_from = Expr)')
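
For reference, the two pivot steps above can be tried on a small made-up table before running them on the real data. This is only a sketch: the tumour IDs and expression values are invented for illustration, and it assumes the tidyr package is installed.

```r
library(tidyr)

# Toy stand-in for rnaseq: one row per gene, one column per tumour.
# All values are invented; the real data come from the lesson's CSV file.
toy <- data.frame(
  Gene = c("ACTB", "GAPDH", "TP53"),
  T1 = c(10.2, 9.8, 4.1),
  T2 = c(11.0, 9.5, 3.9)
)

# Long form: one row per (Gene, Tumour) pair.
toy_long <- pivot_longer(toy, cols = -Gene, names_to = "Tumour", values_to = "Expr")

# Wide form: one row per tumour, one column per gene.
toy_wide <- pivot_wider(toy_long, id_cols = Tumour, names_from = Gene, values_from = Expr)

print(toy_wide)
```

The net effect of the two steps is a transpose: genes start as rows and end up as columns, which is the shape needed to pull out a single gene with `$`.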


- Class: cmd_question
Output: 'Next, we shall load the microarray gene expression dataset. array <- read_csv("data/pannets_expr_array.csv.gz")'
CorrectAnswer: array <- read_csv("data/pannets_expr_array.csv.gz")
AnswerTests: omnitest(correctExpr = 'array <- read_csv("data/pannets_expr_array.csv.gz")')
Hint: Try copying the read_csv code from the previous commands.


- Class: cmd_question
Output: "Now reshape the microarray expression data into long form, using the same arguments as for the RNA-seq data. Store the result in array_long."
CorrectAnswer: array_long <- pivot_longer(array, cols = -Gene, names_to = "Tumour", values_to = "Expr")
AnswerTests: omnitest(correctExpr = 'array_long <- pivot_longer(array, cols = -Gene, names_to = "Tumour", values_to = "Expr")')
Hint: Look at the command we executed previously for the idea.

- Class: cmd_question
Output: "Now create a wide-form table from array_long. Store the result in array_wide."
CorrectAnswer: array_wide <- pivot_wider(array_long, id_cols = Tumour, names_from = Gene, values_from = Expr)
AnswerTests: omnitest(correctExpr = 'array_wide <- pivot_wider(array_long, id_cols = Tumour, names_from = Gene, values_from = Expr)')
Hint: Look at the command we have written to create a wide pivot table for RNASeq data.

- Class: text
Output: Now we compare ACTB gene expression between the RNA-seq and microarray data. To do this, we first create a data frame and then draw a scatterplot of RNA-seq expression against microarray expression.

- Class: cmd_question
Output: "Let's start by creating a data frame of ACTB expression per tumour. It needs three columns, tumour, rnaseq and array, filled from the Tumour and ACTB columns of rnaseq_wide and the ACTB column of array_wide. Store it in actb."
Contributor

Put an overview of our datasets here and review what we've done to prepare each of them, and how the mapping of values is going to occur for when we plot it.

CorrectAnswer: actb <- data.frame(tumour = rnaseq_wide$Tumour, rnaseq = rnaseq_wide$ACTB, array = array_wide$ACTB)
AnswerTests: omnitest(correctExpr = 'actb <- data.frame(tumour = rnaseq_wide$Tumour, rnaseq = rnaseq_wide$ACTB, array = array_wide$ACTB)')
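
One point relevant to how the values are mapped: the data.frame() call above pairs the RNA-seq and microarray values by row position, so it relies on both wide tables listing tumours in the same order. Whether that holds for the real files is not stated here. The sketch below, on invented toy tables, shows how the order could be checked and how base R's merge() pairs rows by tumour ID instead.

```r
# Toy stand-ins for rnaseq_wide and array_wide (values invented).
# The tumours are deliberately listed in a different order in each table.
rnaseq_toy <- data.frame(Tumour = c("T1", "T2", "T3"), ACTB = c(10.2, 11.0, 9.7))
array_toy  <- data.frame(Tumour = c("T3", "T1", "T2"), ACTB = c(8.9, 9.4, 10.1))

# Pairing columns by position is only valid if the tumour order matches.
same_order <- identical(rnaseq_toy$Tumour, array_toy$Tumour)
print(same_order)

# merge() pairs rows by the Tumour key rather than by position.
actb_toy <- merge(
  rnaseq_toy[, c("Tumour", "ACTB")],
  array_toy[, c("Tumour", "ACTB")],
  by = "Tumour",
  suffixes = c("_rnaseq", "_array")
)
names(actb_toy) <- c("tumour", "rnaseq", "array")
print(actb_toy)
```

If the order check returns TRUE on the real tables, the simpler data.frame() call gives the same pairing.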



- Class: cmd_question
Output: "Now let's create a scatter plot of RNA-seq expression (x) against microarray expression (y) for ACTB, using ggplot() with geom_point() and the title ACTB expression."
Contributor

Expand on the purpose of doing this and what the student should look for in the ggplot. What kind of correlations will they see? Maybe explain the basics of scatterplots and what they are best used for.

CorrectAnswer: ggplot(actb, aes(x = rnaseq, y = array)) + geom_point() + labs(title = "ACTB expression")
AnswerTests: omnitest(correctExpr = 'ggplot(actb, aes(x = rnaseq, y = array)) + geom_point() + labs(title = "ACTB expression")')
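
As a rough illustration of what to look for in the plot, here is the same kind of scatterplot on invented values (not taken from the lesson data), with axis labels added. It assumes the ggplot2 package is installed.

```r
library(ggplot2)

# Invented ACTB-like values for five tumours; not taken from the lesson data.
actb_toy <- data.frame(
  rnaseq = c(9.7, 10.2, 10.6, 11.0, 11.5),
  array  = c(8.9, 9.4, 9.6, 10.1, 10.3)
)

# One point per tumour: x is the RNA-seq value, y is the microarray value.
p <- ggplot(actb_toy, aes(x = rnaseq, y = array)) +
  geom_point() +
  labs(
    title = "ACTB expression (toy data)",
    x = "RNA-seq expression",
    y = "Microarray expression"
  )

print(p)
```

In a plot like this, points lying close to an upward-sloping line suggest the two platforms agree on which tumours express the gene more strongly, while a shapeless cloud suggests little agreement.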

- Class: text
Output: To calculate the correlation between two variables, we use the cor() function. It takes the two variables to compare and, through its method argument, the type of correlation to compute.

- Class: cmd_question
Output: "Let us first calculate the correlation between the RNA-seq and microarray values using the Pearson method."
Contributor

Explain Pearson vs Spearman correlation in more detail, what are advantages to each method? Have the student compare the outputs to both and how it may affect their interpretation of the correlation.

CorrectAnswer: cor(actb$rnaseq, actb$array, method = "pearson")
AnswerTests: omnitest(correctExpr = 'cor(actb$rnaseq, actb$array, method = "pearson")')

- Class: cmd_question
Output: "Now calculate the correlation between the RNA-seq and microarray values using the Spearman method."
CorrectAnswer: cor(actb$rnaseq, actb$array, method = "spearman")
AnswerTests: omnitest(correctExpr = 'cor(actb$rnaseq, actb$array, method = "spearman")')
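
To make the difference between the two methods concrete, here is a small base R sketch on invented numbers (not the ACTB data). Pearson measures linear association on the raw values, while Spearman is Pearson applied to the ranks, so it responds only to the ordering of the values.

```r
# Invented values: y increases with x every time, but not along a straight line.
x <- c(1, 2, 3, 4, 5, 6)
y <- exp(x)

# Pearson measures linear association on the raw values.
pearson <- cor(x, y, method = "pearson")

# Spearman is Pearson computed on the ranks, so a strictly increasing
# relationship gives a coefficient of 1 even when it is not linear.
spearman <- cor(x, y, method = "spearman")

print(c(pearson = pearson, spearman = spearman))
```

Here Spearman is 1 and Pearson is lower, because the relationship is monotonic but curved. A large gap between the two on real data is a hint that the relationship is non-linear or that a few extreme values are influencing the Pearson coefficient.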