[Salesforce] field explosion drops data, but does not fail sync #2968

Open
seanstory opened this issue Nov 15, 2024 · 0 comments
Bug Description

By default, the Salesforce connector pulls a huge number of fields. If you are not using sync rules or ingest processors to drop data, this (combined with our default mappings, which add multiple sub-fields for text fields) can result in a mapping explosion that hits the default limit of 1000 total fields. Once that limit is reached, syncs start producing a lot of ERROR logs like:

[FMWK][20:05:52][ERROR] [Connector id: x9RTLJMBuWS4tuU-hcGI, index name: salesforce-3, Sync job id: oK9pMZMBAL1Ae5hBUPh1] operation index failed for doc a13b0000000jp6EAAQ, {'type': 'document_parsing_exception', 'reason': '[1:119] failed to parse: Limit of total fields [1000] has been exceeded while adding new fields [2]', 'caused_by': {'type': 'illegal_argument_exception', 'reason': 'Limit of total fields [1000] has been exceeded while adding new fields [2]'}}
[FMWK][20:05:52][ERROR] [Connector id: x9RTLJMBuWS4tuU-hcGI, index name: salesforce-3, Sync job id: oK9pMZMBAL1Ae5hBUPh1] operation index failed for doc aLi4M0000004C93SAE, {'type': 'document_parsing_exception', 'reason': '[1:193] failed to parse: Limit of total fields [1000] has been exceeded while adding new fields [1]', 'caused_by': {'type': 'illegal_argument_exception', 'reason': 'Limit of total fields [1000] has been exceeded while adding new fields [1]'}}
[FMWK][20:05:52][ERROR] [Connector id: x9RTLJMBuWS4tuU-hcGI, index name: salesforce-3, Sync job id: oK9pMZMBAL1Ae5hBUPh1] operation index failed for doc a2s61000000SRmrAAG, {'type': 'document_parsing_exception', 'reason': '[1:119] failed to parse: Limit of total fields [1000] has been exceeded while adding new fields [2]', 'caused_by': {'type': 'illegal_argument_exception', 'reason': 'Limit of total fields [1000] has been exceeded while adding new fields [2]'}}
[FMWK][20:05:52][ERROR] [Connector id: x9RTLJMBuWS4tuU-hcGI, index name: salesforce-3, Sync job id: oK9pMZMBAL1Ae5hBUPh1] operation index failed for doc a1K8X00000G6UbIUAV, {'type': 'document_parsing_exception', 'reason': '[1:548] failed to parse: Limit of total fields [1000] has been exceeded while adding new fields [2]', 'caused_by': {'type': 'illegal_argument_exception', 'reason': 'Limit of total fields [1000] has been exceeded while adding new fields [2]'}}

However, these errors don't cause the sync to fail, so customers who aren't monitoring their logs may not realize that these documents were dropped during the sync, or that their index is perilously close to the field limit.
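As a stopgap (not a fix), an operator who notices this can raise the index's total-fields limit so documents stop being rejected mid-sync. A minimal sketch with the Python Elasticsearch client, assuming the `salesforce-3` index from the logs above and an arbitrary new limit of 2000:

```python
# Hypothetical stopgap: raise the total-fields limit on the affected
# connector index. Index name is taken from the logs above; the new
# limit of 2000 and the cluster URL are assumptions.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

es.indices.put_settings(
    index="salesforce-3",
    settings={"index.mapping.total_fields.limit": 2000},
)
```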

Relatedly, this old attempt to port the "monitor" could be used to fix this, by detecting when some threshold of _bulk errors is exceeded: #2671
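A rough sketch of the kind of check such a monitor could make: count per-document _bulk failures during a sync and fail the job once a threshold is exceeded. The class and method names below are illustrative, not the connectors framework API.

```python
# Hypothetical "monitor"-style check: track per-document _bulk failures
# during a sync and fail the job if a threshold is exceeded.
class BulkErrorMonitor:
    def __init__(self, max_errors=0):
        self.max_errors = max_errors
        self.error_count = 0

    def record_failure(self, doc_id, reason):
        # Called whenever a _bulk item response reports an error, e.g. a
        # document_parsing_exception from the total-fields limit.
        self.error_count += 1
        if self.error_count > self.max_errors:
            raise RuntimeError(
                f"Too many dropped documents ({self.error_count}); "
                f"last failure: {doc_id}: {reason}"
            )
```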

To Reproduce

  1. Create a Salesforce connector
  2. Don't set up any sync rules or customize mappings
  3. Run a full sync
  4. Watch the logs (a field-count check like the sketch below can confirm the mapping explosion)
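To confirm the explosion while reproducing, you can count the mapped fields in the connector index and compare the total to the default limit of 1000. This is a hedged sketch; the index name, cluster URL, and the exact counting rule are assumptions.

```python
# Hypothetical check: count mapped fields (including multi-field sub-fields
# such as .keyword) in the connector index and compare to the 1000 default.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")
mapping = es.indices.get_mapping(index="salesforce-3")

def count_fields(properties):
    total = 0
    for field in properties.values():
        total += 1
        # multi-fields added by default mappings (e.g. .keyword)
        total += len(field.get("fields", {}))
        # object fields contribute their own nested properties
        total += count_fields(field.get("properties", {}))
    return total

props = mapping["salesforce-3"]["mappings"].get("properties", {})
print(f"{count_fields(props)} mapped fields (default limit is 1000)")
```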

Expected behavior

The sync should fail if we're dropping data, so that the end user is aware that there's an issue.

Environment

9.0.0-SNAPSHOT

Additional context

Related: https://github.com/elastic/sdh-search/issues/1510
