The other day, a user of the conn_icc.py function reported obtaining negative ICC values in a case where the data shouldn’t have produced wildly different results. This prompted the following questions:
Is the error in how PyReliMRI estimates ICCs?
Is it an issue with how the data are input or due to problems in the data itself?
Before diving into time-consuming sanity checks, the first thing to check is whether the lists provided are in the correct order. Specifically, the multi-session lists you provide to PyReliMRI's ICC estimates require that the subjects, sessions, and runs are in the exact same order. For example:
My colleagues reminded me that it is always important to include an assert statement to confirm that the elements of your paths align. Assuming they do because of glob, or something else, sometimes fails. If the lists are out of order, you will get incorrect estimates.
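As a rough illustration of such a check, here is a minimal sketch that pulls a subject label out of each path and asserts that every session list has the same subject order. The helper names (`subject_ids`, `assert_same_order`), the BIDS-like `sub-<label>` pattern, and the file paths are all assumptions made for this example; they are not part of PyReliMRI, so adapt the pattern to your own naming rules.

```python
import re


def subject_ids(paths, pattern=r"sub-([A-Za-z0-9]+)"):
    """Extract a subject label from each path (the 'sub-<label>' pattern is an assumption)."""
    ids = []
    for path in paths:
        match = re.search(pattern, str(path))
        if match is None:
            raise ValueError(f"No subject label found in: {path}")
        ids.append(match.group(1))
    return ids


def assert_same_order(*session_lists):
    """Assert that every session list has the same subject order as the first list."""
    reference = subject_ids(session_lists[0])
    for i, paths in enumerate(session_lists[1:], start=2):
        current = subject_ids(paths)
        assert current == reference, (
            f"Session {i} subject order differs from session 1: {current} vs {reference}"
        )


# Hypothetical file lists, one per session
ses1 = ["data/sub-01_ses-1_run-1.nii.gz", "data/sub-02_ses-1_run-1.nii.gz"]
ses2 = ["data/sub-01_ses-2_run-1.nii.gz", "data/sub-02_ses-2_run-1.nii.gz"]

assert_same_order(ses1, ses2)  # passes silently when the order matches
```

Sorting each list before the check (for example with `sorted(glob(...))`) is a common first step, since `glob` does not guarantee any particular order.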
Once you confirm the lists are ordered correctly, the next step is to determine whether there is another issue. PyReliMRI computes ICC estimates by decomposing variances across a long dataframe (sessions stacked for subjects). Although the lme.mixedlm() function in statsmodels could have been used, it can be slower and encounters convergence issues when ICC values are near zero or negative.
Note that negative ICC values are common in some datasets. This can happen when the within-subject variance exceeds the between-subject variance, resulting in a negative numerator (since the formula subtracts variances). See Liljequist et al. (2019, page 7): "...A negative ICC(1) is simply a bad (unfortunate) estimate".
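To make the negative numerator concrete, here is a small sketch that computes ICC(1) from the standard one-way ANOVA mean squares, (MSB - MSW) / (MSB + (k - 1) * MSW). The function `icc1_from_wide` and the toy scores are made up for illustration; this is not PyReliMRI's implementation, just the textbook formula applied to a subjects-by-sessions array.

```python
import numpy as np


def icc1_from_wide(scores):
    """ICC(1) from a (subjects x sessions) array using one-way ANOVA mean squares."""
    scores = np.asarray(scores, dtype=float)
    n_subj, n_sess = scores.shape
    grand_mean = scores.mean()
    subj_means = scores.mean(axis=1)
    # Between-subject mean square
    msb = n_sess * np.sum((subj_means - grand_mean) ** 2) / (n_subj - 1)
    # Within-subject mean square
    msw = np.sum((scores - subj_means[:, None]) ** 2) / (n_subj * (n_sess - 1))
    return (msb - msw) / (msb + (n_sess - 1) * msw)


# Hypothetical scores: subjects keep their rank across sessions
consistent = [[1.0, 1.1], [2.0, 2.1], [3.0, 2.9]]
# Hypothetical scores: subject means are identical, sessions disagree
inconsistent = [[1.0, 3.0], [3.0, 1.0], [2.0, 2.0]]

print("consistent ICC(1):", icc1_from_wide(consistent))
print("inconsistent ICC(1):", icc1_from_wide(inconsistent))
```

In the second case the subject means are all equal, so MSB is zero while MSW is positive, and the estimate is negative even though nothing is wrong with the computation.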
To confirm that PyReliMRI's estimates align with common approaches, we compared PyReliMRI's ICC(1), ICC(2,1), and ICC(3,1) estimates against both the psych package in R and the pingouin package in Python. We also run a test asserting that PyReliMRI's ICC(3,1) estimates match the lme.mixedlm() estimates. You can see the test script here.
To perform your own sanity checks, you can use the following approach. Extract a single voxel, ROI, or edge from your subjects and sessions, and stack the values into a long dataframe, with:
Column 1: "subidr" (subjects or targets)
Column 2: "sess" (sessions, measurement occasions, or raters)
Column 3: "vals" (estimated values, scores, or ratings)
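If your extracted values start out as one row per subject and one column per session, one way to stack them into this layout is with pandas `melt`. The values, subject labels, and session labels below are hypothetical placeholders; only the three column names come from the description above.

```python
import numpy as np
import pandas as pd

# Hypothetical values for one voxel/ROI/edge: rows are subjects, columns are sessions
values = np.array([[0.42, 0.45],
                   [0.31, 0.28],
                   [0.55, 0.60]])
sub_ids = ["sub-01", "sub-02", "sub-03"]
sessions = ["ses-1", "ses-2"]

wide_df = pd.DataFrame(values, index=sub_ids, columns=sessions)

# Stack sessions so each row is one subject-session observation
long_df = (
    wide_df.rename_axis("subidr")
    .reset_index()
    .melt(id_vars="subidr", var_name="sess", value_name="vals")
)
print(long_df)
```

Each subject should appear once per session in the result; a quick `long_df.groupby("subidr").size()` is an easy way to confirm that before estimating anything.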
Example:
import numpy as np
import pandas as pd
from pingouin import intraclass_corr
from pyrelimri.icc import sumsq_icc
import statsmodels.formula.api as lme

# Load your data with stacked values for single example from subjects/sessions
long_df = pd.read_csv('path_to_your_data.csv')

# Fit the linear mixed effect model for ICC(3,1)
lmmod = lme.mixedlm("vals ~ sess", long_df, groups=long_df["subidr"], re_formula="~1")
lmmod = lmmod.fit()

# Extract the variance components from the model
lmmod_btwnvar = lmmod.cov_re.iloc[0, 0]
lmmod_wthnvar = lmmod.scale
lmmod_icc3 = lmmod_btwnvar / (lmmod_btwnvar + lmmod_wthnvar)

# Run the ICC(3,1) using PyReliMRI for the same data
icc3_pyrelimri = sumsq_icc(df_long=long_df, sub_var='subidr',
                           sess_var='sess', value_var='vals', icc_type='icc_3')
iccmod_btwnvar = icc3_pyrelimri[3]
iccmod_withinvar = icc3_pyrelimri[4]
iccmod_icc3 = icc3_pyrelimri[0]

# Compare the results
lm_out = np.array([lmmod_btwnvar, lmmod_wthnvar, lmmod_icc3])
pyreli_out = np.array([iccmod_btwnvar, iccmod_withinvar, iccmod_icc3])
print(lm_out, pyreli_out)

# Loop over the three ICC components and compare with `pingouin` package
pyreli_iccs = []
for icc_type in ['icc_1', 'icc_2', 'icc_3']:
    est_icc = sumsq_icc(df_long=long_df, sub_var='subidr',
                        sess_var='sess', value_var='vals', icc_type=icc_type)[0]
    pyreli_iccs.append(est_icc)

# Print results for comparison
print("PyReliMRI ICCs:", pyreli_iccs)
print("Pingouin ICCs:", intraclass_corr(data=long_df, targets='subidr', raters='sess',
                                        ratings='vals')['ICC'][0:3].to_frame().T)
In the discussion with the user, they discovered that the error was that their paths were not ordered correctly. Phew! To date, I have avoided implementing ordering checks because I know the provided files may follow a range of naming rules. I may work this into the workflow as an option in the future.
I hope this is useful to a future user. If you run into errors, definitely post them in issues. Nothing is ever perfect, and things can always be improved.