Saucier self-rating / PDA matrix #19

Open
jwzimmer-zz opened this issue Nov 8, 2021 · 1 comment

@jwzimmer-zz

Data: https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/GHYMEV

Some values were missing; I filled them with the overall mean of the remaining scores (the mean of the column means).

First dimension
image

Second dimension
image

Third dimension
image

Fourth dimension
image

Fifth dimension
image

Sixth dimension
image

Seventh dimension
image

Eighth dimension
image

Ninth dimension
image

Tenth dimension
image

Eleventh dimension
image

Twelfth dimension
image

Thirteenth dimension
image

Fourteenth dimension
image

Fifteenth dimension
image

Sixteenth dimension
image

Seventeenth dimension
image

Eighteenth dimension
image

Nineteenth dimension
image

Twentieth dimension
image

The sigma values
image

First 20 sigma values
image

First 10 sigma values
image

So they drop off quickly!
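One way to put a number on "drop off quickly" is the cumulative share of the total sum of squares carried by the leading singular values. The following is a minimal sketch on synthetic data, assuming NumPy; the matrix is a made-up stand-in, not the Saucier data, and `explained_share` is a hypothetical helper name.

```python
import numpy as np

def explained_share(sigma):
    """Cumulative fraction of the total sum of squares captured by the leading singular values."""
    sq = np.asarray(sigma, dtype=float) ** 2
    return np.cumsum(sq) / sq.sum()

# Synthetic stand-in: 200 "participants" x 50 "terms", rank-3 structure plus a little noise.
rng = np.random.default_rng(0)
low_rank = rng.normal(size=(200, 3)) @ rng.normal(size=(3, 50))
X = low_rank + 0.1 * rng.normal(size=(200, 50))
X = X - X.mean()  # remove the grand mean, as in the steps in this issue

sigma = np.linalg.svd(X, compute_uv=False)  # singular values, largest first
share = explained_share(sigma)
print(share[:5])
```

If the curve is already close to 1 after a handful of components, the remaining dimensions carry little of the structure.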

The first 3 columns of U
image

The first 3 columns of U for the first 20 terms
image

To reproduce:
Download the file as .tab from https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/GHYMEV and load it as a dataframe:
trysauce = pd.read_csv("525_PDA-1.tab",sep="\t")
Drop the column that has the IDs of the participants:
trysauce2 = trysauce.drop('ID',axis=1)
Subtract the mean:
trysauce3 = trysauce2 - trysauce2.mean().mean()
Find missing values:
trysauce3.isnull().any()
Fill missing values with the mean (note: I should have used the new mean, computed after the mean was subtracted, rather than the original mean):
trysauce4 = trysauce3.fillna(trysauce2.mean().mean())
Run SVD:
t3df, t3u, t3d, t3v, t3sig, t3x, t3remakex = runSVD(trysauce4)
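The `runSVD` helper is not shown in this issue, so the steps above cannot be run as pasted. Here is a self-contained sketch of the same pipeline on a tiny made-up table, assuming pandas and NumPy. `run_svd_sketch` is a hypothetical stand-in that returns only four values; it is not the actual `runSVD`. This version fills with the post-centering mean, as the note above suggests.

```python
import numpy as np
import pandas as pd

def run_svd_sketch(df):
    """Hypothetical stand-in for runSVD: thin SVD of the data matrix.

    Returns U, the singular values, V transposed, and the reconstructed matrix.
    """
    x = df.to_numpy(dtype=float)
    u, sigma, vt = np.linalg.svd(x, full_matrices=False)
    remade = (u * sigma) @ vt  # same as u @ diag(sigma) @ vt
    return u, sigma, vt, remade

# Tiny made-up table standing in for the .tab file (ID column plus ratings).
raw = pd.DataFrame({
    "ID": [1, 2, 3, 4],
    "a": [1.0, 2.0, np.nan, 4.0],
    "b": [2.0, np.nan, 6.0, 8.0],
    "c": [5.0, 3.0, 1.0, 7.0],
})
ratings = raw.drop("ID", axis=1)
grand_mean = ratings.mean().mean()  # mean of the column means, NaNs skipped
centered = ratings - grand_mean
filled = centered.fillna(centered.mean().mean())  # fill with the post-centering mean

u, sigma, vt, remade = run_svd_sketch(filled)
print(sigma)
```

Multiplying the three factors back together should reproduce the filled matrix, which is a quick sanity check on any SVD wrapper.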

The mean of the df after the original mean was removed:

m = trysauce3.mean().mean()  # -9.42272714880909e-15
trysauce5 = trysauce3.fillna(m)
t3df, t3u, t3d, t3v, t3sig, t3x, t3remakex = runSVD(trysauce5)

Rerunning SVD using that value doesn't really seem to make any difference (yay), so I'm not going to redo everything:
image
image
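Beyond comparing plots, the effect of the fill value can be measured by comparing the singular values from the two fills directly. This is a rough sketch on made-up ratings, assuming pandas and NumPy; the 1-7 scale and the 2% missing rate are assumptions for illustration only. How large the difference turns out depends on how much data is missing, so this shows how to measure it, not what the answer is for the Saucier data.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
# Made-up ratings on a 1-7 scale with roughly 2% of entries missing.
data = rng.integers(1, 8, size=(300, 40)).astype(float)
data[rng.random(data.shape) < 0.02] = np.nan
ratings = pd.DataFrame(data)

original_mean = ratings.mean().mean()
centered = ratings - original_mean

fill_a = centered.fillna(original_mean)           # fill with the original (pre-centering) mean
fill_b = centered.fillna(centered.mean().mean())  # fill with the post-centering mean (about 0)

sig_a = np.linalg.svd(fill_a.to_numpy(), compute_uv=False)
sig_b = np.linalg.svd(fill_b.to_numpy(), compute_uv=False)

# Largest relative change in any singular value between the two fills.
rel_change = np.max(np.abs(sig_a - sig_b) / sig_b)
print(rel_change)
```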

@jwzimmer-zz

To do:
