Improve minor inconsistencies. #142

Open

mkcor wants to merge 4 commits into master

Conversation

mkcor (Contributor) commented Oct 15, 2018:

Hello @education-stan !

I've been studying this example as I'm planning to draw on it and cite it at an upcoming seminar I'll be giving (maybe I should post about it on Discourse, in the spirit of this recent comment). I have emailed the author, @sylee30, with a question I may have figured out in the meantime...

Anyway, I'm suggesting the following edits after running the R code and considering that 'knowledge' is used in an abstract way (knowledge is represented by attribute/skill mastery).

Best,
Marianne

bob-carpenter (Contributor) left a comment:

Thanks for looking this stuff over! The code changes are good, but I'm reluctant to change the original authors' language on mastery vs. knowledge given the definitions.

@@ -41,9 +41,9 @@ In educational measurement, cognitive diagnosis models (CDMs) have been used to

The *deterministic inputs, noisy "and" gate* (DINA) model [@Junker2001] is a popular conjunctive CDM, which assumes that a respondent must have mastered all required attributes in order to correctly respond to an item on an assessment.

To estimate respondents' knowledge of attributes, we need information about which attributes are required for each item. For this, we use a Q-matrix, an $I \times K$ matrix where $q_{ik}=1$ if item $i$ requires attribute $k$ and 0 if not. $I$ is the number of items and $K$ is the number of attributes in the assessment.
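
(A toy sketch in R for concreteness; the Q-matrix values and the respondent profile below are illustrative only, not taken from the case study.)

# Toy Q-matrix: I = 3 items, K = 2 attributes (illustrative values).
# Q[i, k] = 1 means item i requires attribute k.
Q <- matrix(c(1, 0,
              0, 1,
              1, 1),
            nrow = 3, byrow = TRUE,
            dimnames = list(paste0("item", 1:3), paste0("attr", 1:2)))

# Under the DINA model's conjunctive assumption, respondent j can answer
# item i correctly (absent slipping and guessing) only if they have
# mastered every attribute that item i requires:
alpha_j <- c(1, 0)  # example attribute profile: only attribute 1 mastered
eta_j <- apply(Q, 1, function(q) as.integer(all(alpha_j[q == 1] == 1)))
eta_j  # item1: 1, item2: 0, item3: 0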
bob-carpenter (Contributor):

Seung Yeon Lee's original text says:

The presence and absence of attributes are referred to as “mastery” and “non-mastery” respectively. A respondent’s knowledge is represented by a binary vector, referred to as “attribute profile”, to indicate which attributes have been mastered or have not.

So I think either "knowledge" or "mastery" would work here (knowledge being the set of attributes mastered), so I'd be reluctant to change this.

mkcor (Contributor, Author):

Right, so attributes are 'mastered' (not 'known'). The high-level concept of knowledge is characterized by (low-level) attribute mastery. I'm going by the author's terminology. It's not a big deal; it's just that using another term to mean the same thing, or using metonymy (as is the case here), can create unnecessary extra cognitive load, if not confusion. And you don't want that in materials that are already difficult to digest. But, if you insist, I will, of course, revert this change.

bob-carpenter (Contributor):

I also find this terminology confusing. It's doubly confusing because it's trying to overload two commonly used nouns with technical meanings. The original author of the DINA model introduced the terminology, it looks like. We're often walking the line between our general preferences for naming, etc., and that of a subfield we work with. Andrew's particularly prone to try to change entrenched terminology.


-A binary latent variable $\alpha_{jk}$ indicates respondent $j$'s knowledge of attribute $k$, where $\alpha_{jk}=1$ if respondent $j$ has mastered attribute $k$ and 0 if he or she has not. Then, an underlying attribute profile of respondent $j$, $\boldsymbol{\alpha_j}$, is a binary vector of length $K$ that indicates whether or not the respondent has mastered each of the $K$ attributes.
+A binary latent variable $\alpha_{jk}$ indicates respondent $j$'s mastery of attribute $k$, where $\alpha_{jk}=1$ if respondent $j$ has mastered attribute $k$ and 0 if he or she has not. Then, an underlying attribute profile of respondent $j$, $\boldsymbol{\alpha_j}$, is a binary vector of length $K$ that indicates whether or not the respondent has mastered each of the $K$ attributes.
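
(As a toy illustration, not from the case study: with $K=3$ attributes, a respondent who has mastered attributes 1 and 3 but not attribute 2 has attribute profile $\boldsymbol{\alpha_j}=(1,0,1)$, one of the $2^K=8$ possible profiles, i.e., latent classes.)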
bob-carpenter (Contributor):

[doesn't need to be fixed for this PR]

Lines should be at most 80 characters where possible; otherwise we get a mess in Git when edits touch 300-character lines.

mkcor (Contributor, Author):

I completely agree. The command-line diff also shows me that there is trailing whitespace, so I'll fix both formatting issues at the same time.

<!-- This comment causes section to be numbered -->
bob-carpenter (Contributor):

[not required for this PR]

Are these lines different? You can also just specify whether sections are numbered in the top-level YAML.
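
(For reference, assuming the case study is an R Markdown document, the top-level option in question would be set in the YAML header; a minimal sketch, with the output format an assumption:)

---
output:
  html_document:
    number_sections: true
---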

mkcor (Contributor, Author):

Hi @bob-carpenter ! Thanks for reviewing.
I didn't mean to submit this change specifically; it's the text editor (in this case, GitHub's editor, but Vim would do the same thing) which adds a "newline at end of file" whenever it's missing.
Indeed, that's what the command-line diff looks like:

-<!-- This comment causes section to be numbered -->
\ No newline at end of file
+<!-- This comment causes section to be numbered -->

I would think it's a healthy change although very minor.

bob-carpenter (Contributor):

Ah, missed that---thanks!

@@ -110,9 +115,9 @@ $$
\mathrm{Pr}(\alpha_{jk}=1 \, | \, \boldsymbol{y}_j)=\sum_{c=1}^{C}\mathrm{Pr}(\boldsymbol{\alpha_j}=\boldsymbol{\alpha_c} \, | \, \boldsymbol{y}_j)\times\alpha_{ck}.
$$

-Instead of conditioning on the parameters $\nu_c,s_i,g_i$ to obtain $\mathrm{Pr}(\boldsymbol{\alpha_j}=\boldsymbol{\alpha_c}|\boldsymbol{Y}_j=\boldsymbol{y}_j)$, we want to derive the posterior probabilities, averaged over the posterior distribution of the parameters. This is achieved by evaluating the expressions above for posterior draws of the parameters and averaging these over the MCMC iterations. Let the vector of all parameters be denoted $\boldsymbol{\theta}$ and let the posterior draw in iteration $s$ be denoted $\boldsymbol{\theta}^{(s)}$. Then we estimate the posterior probability, not conditioning on the parameters, as
+Instead of conditioning on the parameters $\nu_c,s_i,g_i$ to obtain $\mathrm{Pr}(\boldsymbol{\alpha_j}=\boldsymbol{\alpha_c}|\boldsymbol{Y}_j=\boldsymbol{y}_j)$, we want to derive the posterior probabilities, averaged over the posterior distribution of the parameters. This is achieved by evaluating the expressions above for posterior draws of the parameters and averaging these over the MCMC iterations. Let the vector of all parameters be denoted $\boldsymbol{\theta}$ and let the posterior draw in iteration $t$ be denoted $\boldsymbol{\theta}^{(t)}$. Then we estimate the posterior probability, not conditioning on the parameters, as
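
(For reference, the estimator this sentence leads into is the usual average over posterior draws; reconstructed here from context, with $T$ denoting the number of MCMC iterations:)

$$
\widehat{\mathrm{Pr}}(\boldsymbol{\alpha_j}=\boldsymbol{\alpha_c} \, | \, \boldsymbol{y}_j) = \frac{1}{T}\sum_{t=1}^{T}\mathrm{Pr}(\boldsymbol{\alpha_j}=\boldsymbol{\alpha_c} \, | \, \boldsymbol{y}_j, \boldsymbol{\theta}^{(t)}).
$$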
mkcor (Contributor, Author):

@bob-carpenter Here, the change is only about using $t$ rather than $s$ to denote iteration, because $s$ has been used for 'slipping' in the rest of the text.
