This repository has been archived by the owner on Jul 3, 2023. It is now read-only.

Addition #24

Open
jeroenvermunt opened this issue Jun 1, 2022 · 3 comments

Comments

@jeroenvermunt

Would there be interest in a method that pulls responses in the format below?

{
 'question name':'answer'
}

This seems much easier to work with when handling Typeform responses in the backend.
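
As a rough illustration of the proposed format, here is a minimal sketch that turns one response into a `{question title: answer}` dictionary. It works on plain dictionaries and assumes the form definition has a `fields` list with `id` and `title`, and that each answer has a `field.id`, a `type`, and a value stored under a key named after that type (the shape used by the code later in this thread). The function names `extract_answer_value` and `answers_by_question` are hypothetical, not part of any SDK, and sub-questions nested inside groups are not resolved here.

```python
from typing import Any, Dict


def extract_answer_value(answer: Dict[str, Any]) -> Any:
    """Return the plain value of one answer object."""
    answer_type = answer["type"]
    if answer_type == "choice":
        choice = answer["choice"]
        # A free-text "other" option carries no label.
        return choice.get("label", choice.get("other"))
    if answer_type == "choices":
        choices = answer["choices"]
        values = list(choices.get("labels", []))
        if "other" in choices:
            values.append(choices["other"])
        return values
    # Assumption: for all other types the value sits under a key named after the type.
    return answer[answer_type]


def answers_by_question(form: Dict[str, Any], response: Dict[str, Any]) -> Dict[str, Any]:
    """Map question titles to answers for a single response."""
    titles = {field["id"]: field["title"] for field in form["fields"]}
    result = {}
    for answer in response.get("answers") or []:
        field_id = answer["field"]["id"]
        # Fall back to the raw id when the title is unknown (e.g. a nested group field).
        result[titles.get(field_id, field_id)] = extract_answer_value(answer)
    return result
```

Note that two questions with identical titles would collide in this format, which is one reason to keep the ids available as well.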

@henrihapponen

I'd also like to see this feature, or something like an option to pull response answers directly as a pandas DataFrame or a list of dictionaries.
It would still have question IDs as column names, but you could then provide a dictionary that maps each question ID to a column name describing the question.
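
The ID-to-name mapping idea can be sketched for the list-of-dictionaries case with the standard library alone. This is only an illustration: `rename_question_ids` is a hypothetical helper, and the ids in the usage example are made up. With a DataFrame, `df.rename(columns=id_to_name)` would play the same role.

```python
from typing import Any, Dict, List


def rename_question_ids(
    rows: List[Dict[str, Any]], id_to_name: Dict[str, str]
) -> List[Dict[str, Any]]:
    """Return copies of the rows with question-id keys replaced by descriptive names.

    Keys without an entry in id_to_name are kept unchanged.
    """
    return [
        {id_to_name.get(key, key): value for key, value in row.items()}
        for row in rows
    ]


# Usage with made-up ids:
rows = [{"abc123": "Ada", "def456": 36}]
renamed = rename_question_ids(rows, {"abc123": "name", "def456": "age"})
```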

@jeroenvermunt
Author

It's not quite a full feature yet, but I wrote a function that maps question IDs to question titles (and choice IDs to choice labels). I use it myself, and it could be useful as a starting point:

from pprint import pprint


def show_question_names(forms, form_id):
    '''Print every question with its id, plus the choices of multiple choice questions and the sub-questions of groups.'''
    question_dict = {}

    form = forms.get(form_id)

    for field in form['fields']:

        # store id and question
        question_dict[field['id']] = field['title']

        # store the choices of multiple choice questions
        if field['type'] == 'multiple_choice':
            question_dict[field['id'] + '_choices'] = [
                {choice['id']: choice['label']} for choice in field['properties']['choices']
            ]

        # store the sub-questions of questions belonging to a group
        if field['type'] == 'group':
            question_dict[field['id'] + '_sub_questions'] = [
                {sub_question['id']: sub_question['title']} for sub_question in field['properties']['fields']
            ]

    pprint(question_dict)

@henrihapponen

Looks good, @jeroenvermunt!
I had to make something similar myself. My use case is to periodically pull responses with many fields (30-100) and append them to a table in a database. This function gets the response answers as a DataFrame (though I can't promise it's perfect):

from typing import Dict, List

import pandas as pd

# Note: `typeform` below is assumed to be an already-initialised Typeform client defined elsewhere.


def get_response_answers_df(form_uid: str, field_id_map: Dict, target_fields: List[str],
                            since: int, until: int):
    """
    Get response answers as a pandas DataFrame. Note that page size is 1000 (maximum).
    Args:
        form_uid (str): The uid of the Typeform form to get responses for.
        field_id_map (dict): The field-id map of the form's fields, mapping ids to new field names.
        target_fields (list): The fields in the target table.
        since (int): Days (in integers) to get responses since.
        until (int): Days (in integers) to get responses until.
    Returns:
        df (DataFrame): The response answers as a DataFrame.
    """

    responses = typeform.responses.list(uid=form_uid, since=since, until=until, pageSize=1000)
    df_response_items = pd.json_normalize(responses['items'])
    df = df_response_items['answers']
    df = pd.json_normalize(df)

    response_ids = df_response_items['response_id'].tolist()
    landed_at_list = df_response_items['landed_at'].tolist()
    submitted_at_list = df_response_items['submitted_at'].tolist()

    # Rename columns to make them numbered
    for i in range(0, len(df.columns)):
        df.rename(columns={i: f'answer{i+1}'}, inplace=True)

    # Parse answers from nested structure to a flat table
    new_table = []

    for index, row in df.iterrows():
        row_dict = {}
        for i in range(1, len(df.columns) + 1):
            if row[f'answer{i}'] is not None:
                field_type = row[f'answer{i}']['type']

                try:
                    field_id = row[f'answer{i}']['field.id']

                    if field_type == 'choice':
                        # If single-option choice
                        try:
                            row_dict[field_id] = row[f'answer{i}']['choice.label']
                        except KeyError:
                            row_dict[field_id] = row[f'answer{i}']['choice.other']
                    elif field_type == 'choices':
                        # If multiple-option choice
                        try:
                            row_dict[field_id] = ';'.join(row[f'answer{i}']['choices.labels'])
                        except KeyError:
                            row_dict[field_id] = row[f'answer{i}']['choices.other']
                    else:
                        # In other cases, the response value should match the field id
                        row_dict[field_id] = row[f'answer{i}'][field_type]
                except KeyError:
                    # Skip answers that lack the expected keys
                    pass

        new_table.append(row_dict)

    # Re-assign df to new table
    df = pd.DataFrame(new_table)

    # For target columns that have an id but don't exist in this response pull, add empty values
    for column in field_id_map.keys():
        if column not in df.columns:
            df[column] = ''

    # Then rename all columns
    df.rename(columns=field_id_map, inplace=True)

    # Add response_id (name as typeform_id) and submitted_at as columns
    df['typeform_id'] = response_ids
    df['form_start_datetime'] = landed_at_list
    df['form_submit_datetime'] = submitted_at_list

    # Then for all target columns that don't exist in this response pull, add empty values
    for column in target_fields:
        if column not in df.columns:
            df[column] = ''

    # Re-order columns according to target_fields (and add any field_id_map fields that are not in target fields)
    all_fields = target_fields + list(set(field_id_map.values()) - set(target_fields))
    df = df[all_fields]

    # Add potential hidden fields
    hidden_fields = []
    for field in df_response_items.columns:
        if 'hidden.' in field:
            hidden_fields.append(field)

    df_hidden_fields = df_response_items[hidden_fields]
    df = pd.concat([df, df_hidden_fields], axis=1)

    return df
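
For the periodic "pull and append to a database table" use case mentioned above, one possible approach is to key the table on the response id so that re-pulling overlapping time windows does not create duplicates. The sketch below uses SQLite from the standard library purely as a stand-in. The table name `responses`, the helper `append_new_responses`, and the requirement that `typeform_id` carries a unique constraint are all assumptions made for illustration, not something taken from the function above.

```python
import sqlite3
from typing import Any, Dict, List


def append_new_responses(
    conn: sqlite3.Connection, rows: List[Dict[str, Any]], columns: List[str]
) -> int:
    """Insert rows whose typeform_id is not stored yet; return how many were added.

    Assumes a table named `responses` whose `typeform_id` column is unique.
    Column names are interpolated into the SQL, so they must come from trusted code.
    """
    column_sql = ", ".join(f'"{column}"' for column in columns)
    placeholders = ", ".join("?" for _ in columns)
    sql = f"INSERT OR IGNORE INTO responses ({column_sql}) VALUES ({placeholders})"
    before = conn.total_changes
    with conn:  # commits on success, rolls back on error
        conn.executemany(sql, [tuple(row.get(column) for column in columns) for row in rows])
    # Rows skipped by OR IGNORE do not count as changes.
    return conn.total_changes - before
```

With a DataFrame like the one returned above, `df.to_dict('records')` would give rows in the shape this helper expects.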
