Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Source SurveyMonkey: Tables Have No Relatability, Impossible to Join #30323

Closed
1 task
ryancodepath opened this issue Sep 11, 2023 · 9 comments
Closed
1 task

Comments

@ryancodepath
Copy link

Connector Name

surveymonkey

Connector Version

0.2.2

What step the error happened?

Other

Revelant information

No foreign keys on tables. Impossible to relate tables being made.

Relevant log output

No response

Contribute

  • Yes, I want to contribute
@ryancodepath ryancodepath added area/connectors Connector related issues needs-triage type/bug Something isn't working labels Sep 11, 2023
@marcosmarxm marcosmarxm changed the title SurveyMonkey Tables Have No Relatability, Impossible to Join Source SurveyMonkey: Tables Have No Relatability, Impossible to Join Sep 15, 2023
@bmcgilli bmcgilli moved this to To Be Claimed in Hacktoberfest 2023 Sep 28, 2023
@domzae
Copy link
Contributor

domzae commented Oct 2, 2023

This is not true, they definitely can be related - but it is admittedly quite complex to do so, requiring many joins and un-nesting json structures.

@KimuchuJr
Copy link

Hello.

I would want to give it a try.

@justinbchau
Copy link
Collaborator

@KimuchuJr I have already assigned you an issue so I will leave this one open to give others a fair chance! If you raise a PR for your issue and this is still open, I can assign.

@topefolorunso
Copy link
Collaborator

@ryancodepath Can you be more specific with what you were trying to do? What tables are you trying to relate? And what do you think an ideal solution would be?

@ryancodepath
Copy link
Author

@topefolorunso thanks for responding back. At this moment, we've moved on to paying surveymonkey for their connector because this isn't working. Without spending too much time back tracking: there is no way to relate the questions, nor answers, nor pages, back to a survey. Ideally, there would be a survey ID that relates to a pages ID, that relates to a questions ID, that relates to an answers ID, etc. I read their api docs extensively and it appeared that there was a central, pivotal table that wasn't being captured by airbyte that whomever was on the office hours call noticed. I can't remember which table it was, but it had the link to the different tables.

@domzae
Copy link
Contributor

domzae commented Oct 16, 2023

  • survey_responses has a FK survey_id to connect to surveys
  • survey_responses has pages.questions.answers, all nested in JSON
  • survey_questions has the questions which can be linked with idsurvey_responses.pages.questions.id
  • survey_questions has the predefined answers which can be linked with choice.idpages.questions.answers.choice_id in survey_responses

It is far more complicated than it should be, but I don't think there's much Airbyte can do about that, as the SurveyMonkey API is what it is, and adding transformations will make the the sync not 1:1 with the API.

@topefolorunso
Copy link
Collaborator

@ryancodepath Please consider @domzae's comment. To unnest the complicated relationships in the nested data, you can run transformations in your destination warehouse or database with dbt. Airbyte has provisions for that here. You can also consider transformation with SQL or transformation with Airbyte.

@topefolorunso
Copy link
Collaborator

That said @justinbchau, I think we can close this issue as there is no problem to fix here

@justinbchau justinbchau moved this from To Be Claimed to Closed in Hacktoberfest 2023 Oct 18, 2023
@github-project-automation github-project-automation bot moved this from Closed to Done in Hacktoberfest 2023 Oct 18, 2023
@domzae
Copy link
Contributor

domzae commented Oct 20, 2023

I think I see what the problem is, there are 3 streams (Surveys, SurveyPages, SurveyQuestions) which all derive from the same endpoint (/surveys/{id}/details).
SurveyPages is the field pages in the response, and SurveyQuestions is the field pages.questions in the response.

  • A quick fix would be to add a survey_id from the stream_slice.
  • A workaround (without needing a change to the connector) is to grab the survey_id from the href field
  • A bigger job:
    1. Migrate the connector to Low Code CDK
    2. Don't unnest SurveyPages and SurveyQuestions into their own streams (this would be a breaking change)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
No open projects
Development

No branches or pull requests

8 participants