Extract all rows for multi-record forms #106

Open
tashrifbillah opened this issue Jul 1, 2022 · 1 comment

@tashrifbillah

Lochness's current design, which extracts only the latest row for each visit of an RPMS form, won't work for multi-record forms. We need to make this conditional on the existence of a `Row#` column: if `Row#` is present, extract all rows of each visit.

Take PrescientStudy_Prescient_family_interview_for_genetic_studies_figs_child_01.07.2022.csv, for example. We created this record together recently. The subject has two children, hence two rows under visit 1. We need to extract both rows.

@tashrifbillah

This can probably be solved by just adding an `or` to the `if len(table) == 1:` check in the current code (quoted after the proposed line):

if len(table) == 1 or 'Row#' in subject_df.columns:

if 'visit' in subject_df.columns:
    for unique_visit, table in subject_df.groupby('visit'):
        if len(table) == 1:
            pass
        else:
            most_recent_row_index = pd.to_datetime(
                table['LastModifiedDate']).idxmax()
            non_recent_row_index = [x for x in table.index
                                    if x != most_recent_row_index]
            print(f'RPMS export has duplicated rows for {measure}')
            subject_df.drop(non_recent_row_index, inplace=True)
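
For illustration, here is a minimal, self-contained sketch of how the guarded de-duplication might behave. It is not the actual Lochness code: the function name `keep_latest_unless_multirecord`, the toy DataFrames, and the ISO date strings are all assumptions made up for this example. It hoists the `'Row#'` check out of the loop, which should have the same effect as adding the `or` to the inner `if`, since the check does not depend on the visit.

```python
import pandas as pd


def keep_latest_unless_multirecord(subject_df: pd.DataFrame,
                                   measure: str) -> pd.DataFrame:
    """Hypothetical helper (not part of Lochness).

    Keeps only the most recently modified row per visit, unless the
    form has a 'Row#' column, in which case every row is kept.
    """
    # Multi-record forms: each Row# is a distinct record, so keep all rows.
    if 'visit' not in subject_df.columns or 'Row#' in subject_df.columns:
        return subject_df

    stale = []
    for _, table in subject_df.groupby('visit'):
        if len(table) == 1:
            continue
        latest = pd.to_datetime(table['LastModifiedDate']).idxmax()
        stale.extend(i for i in table.index if i != latest)

    if stale:
        print(f'RPMS export has duplicated rows for {measure}')
    return subject_df.drop(index=stale)


# Toy multi-record form: two children recorded under visit 1 (made-up data).
figs_child = pd.DataFrame({
    'visit': [1, 1],
    'Row#': [1, 2],
    'LastModifiedDate': ['2022-07-01 10:00:00', '2022-07-01 10:05:00'],
})

# Toy single-record form with a stale duplicate for visit 1 (made-up data).
single_record = pd.DataFrame({
    'visit': [1, 1, 2],
    'LastModifiedDate': ['2022-06-01', '2022-07-01', '2022-07-01'],
})

# Both child rows survive because 'Row#' is present.
kept_children = keep_latest_unless_multirecord(figs_child, 'figs_child')

# Only the stale visit-1 row is dropped when there is no 'Row#' column.
kept_single = keep_latest_unless_multirecord(single_record, 'other_form')
```

One caveat with this approach: when `Row#` is present, nothing is dropped at all, so any stale re-export of the same `Row#` within a visit would also be kept. Whether RPMS ever produces such rows is not established in this thread.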
