Specify custom (non-default) schemas for streams? #163

Open
bdewilde opened this issue Jul 19, 2023 · 4 comments

Comments

@bdewilde

I'm currently using the "transferwise" target-postgres variant's schema_mapping setting to load data from different source dbs into their own schemas. This has been useful to keep data well-organized in the target postgres db.

I'm interested in switching to this variant, but don't see this option mentioned in the docs or in plans for future support. Would it be possible to add this functionality?

@visch
Member

visch commented Sep 27, 2023

@bdewilde clarified that what he wants here is to be able to push all data from a tap like tap-mysql to one schema. Right now the target strips the database name automatically: for a stream named databasename-streamname, the databasename- prefix is filtered out.

Code from the SDK that does this is here https://github.com/meltano/sdk/blob/main/singer_sdk/sinks/sql.py#L73-L81
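
To make the behavior described above concrete, here is a rough sketch of how a stream name could be split into a schema and a table. It is a simplified illustration, not the SDK's actual code: the function name `split_stream_name` is hypothetical, and the `default_target_schema` parameter is an assumption about how a target-wide schema setting would take precedence.

```python
from typing import Optional, Tuple


def split_stream_name(
    stream_name: str,
    default_target_schema: Optional[str] = None,
) -> Tuple[Optional[str], str]:
    """Return (schema, table) for a stream name like 'databasename-streamname'.

    Hypothetical helper for illustration only.
    """
    parts = stream_name.split("-")
    # The last part is always treated as the table name.
    table = parts[-1]

    # Assumption: a configured default schema wins, so the
    # 'databasename-' prefix is effectively discarded.
    if default_target_schema:
        return default_target_schema, table

    # Two- or three-part names: use the second-to-last part as the schema.
    if len(parts) in (2, 3):
        return parts[-2], table

    # No schema could be detected from the stream name.
    return None, table
```

Under these assumptions, every stream from a tap like tap-mysql lands in the same schema whenever a default is configured, which is why a per-source mapping cannot be expressed today.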

@edgarrmondragon
Member

Requested in Slack: https://meltano.slack.com/archives/C01TCRBBJD7/p1699541259229359

@florian-ernst-alan

Indeed that would be great. We have around 800 tables in our backend DB (spread over 10 to 15 schemas) and being able to map source_schema to target_schema would be a blessing.

Right now, our workaround is to split the tap in several sub-taps, all needing to be individually scheduled - it's a pain, but I didn't find a better way. Toy example:

plugins:
  extractors:
  - name: tap-postgres--prod
    inherit_from: tap-postgres
    config:
      host: xx
      port: 5432
      database: xx
    load_schema: backend
    select:
      - "*.*"
      # Ignored tables
      - "!public-activity.*"
      - "!public-pg_*.*"
      # Ignored schemas
      - "!auth-*.*"
      - "!misc-*.*"
    metadata:
      "*":
        replication-method: LOG_BASED
  - name: tap-postgres--prod-auth
    inherit_from: tap-postgres--prod
    load_schema: backend_auth
    select:
      - "auth-*.*"
    metadata:
      "*":
        replication-method: LOG_BASED
  - name: tap-postgres--prod-misc
    inherit_from: tap-postgres--prod
    load_schema: backend_misc
    select:
      - "misc-*.*"
    metadata:
      "*":
        replication-method: LOG_BASED
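
For comparison, the source_schema to target_schema mapping requested in this issue could look roughly like the sketch below. Nothing here exists in the target today: `resolve_target`, `SCHEMA_MAPPING`, and `DEFAULT_SCHEMA` are hypothetical names, the schema values are taken from the toy example above, and the table names in the usage lines are made up.

```python
from typing import Dict, Tuple

# Hypothetical mapping, mirroring the three sub-taps in the toy example.
SCHEMA_MAPPING: Dict[str, str] = {
    "auth": "backend_auth",
    "misc": "backend_misc",
}
DEFAULT_SCHEMA = "backend"


def resolve_target(
    stream_name: str,
    mapping: Dict[str, str] = SCHEMA_MAPPING,
    default_schema: str = DEFAULT_SCHEMA,
) -> Tuple[str, str]:
    """Map a 'source_schema-table' stream name to (target_schema, table)."""
    source_schema, sep, table = stream_name.partition("-")
    if not sep:
        # No schema prefix in the stream name: fall back to the default.
        return default_schema, stream_name
    # Unmapped source schemas also fall back to the default schema.
    return mapping.get(source_schema, default_schema), table


# Usage with made-up table names:
print(resolve_target("auth-users"))     # ('backend_auth', 'users')
print(resolve_target("public-orders"))  # ('backend', 'orders')
```

With a lookup like this inside the target, a single tap could feed all schemas and the per-schema sub-taps would no longer need separate schedules.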
