Outlook Connector Crashes when an SMTP address does not have a mailbox associated #2931

Open
josephschultz-expedient opened this issue Oct 30, 2024 · 4 comments · Fixed by #2967
Labels
bug Something isn't working community-driven

Comments

@josephschultz-expedient

Bug Description

While deploying the Outlook Connector for Exchange Online, I found that the process crashes about halfway through the sync operation with this error:

TransportError: No valid version headers found in response (ErrorNonExistentMailbox('The SMTP address has no mailbox associated with it.'))

This is a small dev environment with only about 15 users, and the only SMTP object without a mailbox would be a Microsoft 365 Group email address.

This environment was configured as per the Elastic documentation. The only caveat is that I had to add the Microsoft Graph User.Read.All permission in order to retrieve the user list. Without it, the request returned a 403 error:

aiohttp.client_exceptions.ClientResponseError: 403, message='Forbidden', url='https://graph.microsoft.com/v1.0/users?$top=999'

To Reproduce

Steps to reproduce the behavior:

  1. Verify that the Exchange Online environment has at least one Microsoft 365 Group Email created
  2. When creating the new App in Azure, assign both the Exchange Online full_access_as_app and Microsoft Graph User.Read.All permissions
  3. Deploy as per the documentation

Expected behavior

Unless there is a different way to retrieve the SMTP addresses or the user listing, the ideal behavior would be to skip SMTP addresses that have no mailbox associated.

Environment

  • Elasticsearch 8.15.1
  • Elastic Connector 8.15.1 running in a Docker 26.0 environment
  • Office365 / Exchange Online

Error Logs

[FMWK][18:26:01][ERROR] [Connector id: VOzQ1JIB5RYrKcWA6hjz, index name: client-outlook-connector, Sync job id: zdas3pIBsfHVd-ovtKc9] Extractor failed with an error: No valid version headers found in response (ErrorNonExistentMailbox('The SMTP address has no mailbox associated with it.'))
[FMWK][18:26:01][CRITICAL] [Connector id: VOzQ1JIB5RYrKcWA6hjz, index name: client-outlook-connector, Sync job id: zdas3pIBsfHVd-ovtKc9] Document extractor failed
Traceback (most recent call last):
  File "/app/lib/python3.10/site-packages/exchangelib/version.py", line 202, in guess
    list(ConvertId(protocol=protocol).call([AlternateId(id="DUMMY", format=EWS_ID, mailbox="DUMMY")], ENTRY_ID))
  File "/app/lib/python3.10/site-packages/exchangelib/services/common.py", line 216, in _elems_to_objs
    for elem in elems:
  File "/app/lib/python3.10/site-packages/exchangelib/services/common.py", line 278, in _chunked_get_elements
    yield from self._get_elements(payload=payload_func(chunk, **kwargs))
  File "/app/lib/python3.10/site-packages/exchangelib/services/common.py", line 299, in _get_elements
    yield from self._response_generator(payload=payload)
  File "/app/lib/python3.10/site-packages/exchangelib/services/common.py", line 262, in _response_generator
    response = self._get_response_xml(payload=payload)
  File "/app/lib/python3.10/site-packages/exchangelib/services/common.py", line 408, in _get_response_xml
    return self._get_soap_messages(body=body, **parse_opts)
  File "/app/lib/python3.10/site-packages/exchangelib/services/common.py", line 496, in _get_soap_messages
    self._raise_soap_errors(fault=fault)  # Will throw SOAPError or custom EWS error
  File "/app/lib/python3.10/site-packages/exchangelib/services/common.py", line 536, in _raise_soap_errors
    raise vars(errors)[code](msg)
exchangelib.errors.ErrorNonExistentMailbox: The SMTP address has no mailbox associated with it.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/app/connectors/es/sink.py", line 492, in run
    await self.get_docs(generator, skip_unchanged_documents=True)
  File "/app/connectors/es/sink.py", line 541, in get_docs
    async for count, doc in aenumerate(generator):
  File "/app/connectors/utils.py", line 856, in aenumerate
    async for elem in asequence:
  File "/app/connectors/logger.py", line 247, in __anext__
    return await self.gen.__anext__()
  File "/app/connectors/es/sink.py", line 523, in _decorate_with_metrics_span
    async for doc in generator:
  File "/app/connectors/sync_job_runner.py", line 454, in prepare_docs
    async for doc, lazy_download, operation in self.generator():
  File "/app/connectors/sync_job_runner.py", line 505, in generator
    async for doc, lazy_download in self.data_provider.get_docs(
  File "/app/connectors/sources/outlook.py", line 1057, in get_docs
    async for account in self.client._get_user_instance.get_user_accounts():
  File "/app/connectors/sources/outlook.py", line 450, in get_user_accounts
    user_account = Account(
  File "/app/lib/python3.10/site-packages/exchangelib/account.py", line 205, in __init__
    self.version = self.protocol.version.copy()
  File "/app/lib/python3.10/site-packages/exchangelib/protocol.py", line 480, in version
    self.config.version = Version.guess(self, api_version_hint=self.api_version_hint)
  File "/app/lib/python3.10/site-packages/exchangelib/version.py", line 206, in guess
    raise TransportError(f"No valid version headers found in response ({e!r})")
exchangelib.errors.TransportError: No valid version headers found in response (ErrorNonExistentMailbox('The SMTP address has no mailbox associated with it.'))
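
The traceback above shows implicit exception chaining: the TransportError is raised while ErrorNonExistentMailbox is being handled, so the original error should be reachable through `__context__`. Below is a minimal sketch of how a caller might recognise that case. It uses stand-in exception classes rather than the real exchangelib ones, and the helper name `find_in_chain` is hypothetical:

```python
class ErrorNonExistentMailbox(Exception):
    """Stand-in for exchangelib.errors.ErrorNonExistentMailbox."""


class TransportError(Exception):
    """Stand-in for exchangelib.errors.TransportError."""


def find_in_chain(exc, exc_type):
    """Return the first exception of exc_type in exc's cause/context chain, else None."""
    seen = set()
    while exc is not None and id(exc) not in seen:
        if isinstance(exc, exc_type):
            return exc
        seen.add(id(exc))
        # `raise ... from` sets __cause__; a raise inside `except` sets __context__.
        exc = exc.__cause__ or exc.__context__
    return None


def guess_version():
    """Mimics the shape of the failure in the traceback: one error wrapped in another."""
    try:
        raise ErrorNonExistentMailbox("The SMTP address has no mailbox associated with it.")
    except ErrorNonExistentMailbox as e:
        raise TransportError(f"No valid version headers found in response ({e!r})")


try:
    guess_version()
except TransportError as err:
    # Non-None here means the transport failure was really a missing mailbox.
    missing_mailbox = find_in_chain(err, ErrorNonExistentMailbox)
```

A check like this would let the sync treat "no mailbox" as a skippable condition while still failing on genuine transport problems.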
@seanstory
Member

Hi, @josephschultz-expedient! Thanks for filing.

I think #2967 will fix this. Can you give that branch/diff a try, and see if it resolves your issue?

@eddxavier-elastic

eddxavier-elastic commented Dec 11, 2024

Hi team (@seanstory @navarone-feekery), I'm experiencing the same behavior with a managed connector from an 8.16.0 deployment, for outlook_server:

enterprise_search.connectors][critical] Document extractor failed
  File "/usr/share/enterprise-search/lib/python3.10/site-packages/connectors/es/sink.py", line 490, in run
    await self.get_docs(generator)
  File "/usr/share/enterprise-search/lib/python3.10/site-packages/connectors/es/sink.py", line 542, in get_docs
    async for count, doc in aenumerate(generator):
  File "/usr/share/enterprise-search/lib/python3.10/site-packages/connectors/utils.py", line 856, in aenumerate
    async for elem in asequence:
  File "/usr/share/enterprise-search/lib/python3.10/site-packages/connectors/logger.py", line 244, in __anext__
    return await self.gen.__anext__()
  File "/usr/share/enterprise-search/lib/python3.10/site-packages/connectors/es/sink.py", line 524, in _decorate_with_metrics_span
    async for doc in generator:
  File "/usr/share/enterprise-search/lib/python3.10/site-packages/connectors/sync_job_runner.py", line 457, in prepare_docs
    async for doc, lazy_download, operation in self.generator():
  File "/usr/share/enterprise-search/lib/python3.10/site-packages/connectors/sync_job_runner.py", line 493, in generator
    async for doc, lazy_download in self.data_provider.get_docs(
  File "/usr/share/enterprise-search/lib/python3.10/site-packages/connectors/sources/outlook.py", line 1074, in get_docs
    async for account in self.client._get_user_instance.get_user_accounts():
  File "/usr/share/enterprise-search/lib/python3.10/site-packages/connectors/sources/outlook.py", line 343, in get_user_accounts
    user_account = Account(
  File "/usr/share/enterprise-search/lib/python3.10/site-packages/exchangelib/account.py", line 137, in __init__
    raise ValueError(f"primary_smtp_address {primary_smtp_address!r} is not an email address")

This is a sample Exchange deployment with very few test users, and all of them have PrimarySmtpAddress set:

[PS] C:\Windows\system32>Get-Mailbox | Select-Object DisplayName, PrimarySmtpAddress

DisplayName              PrimarySmtpAddress
-----------              ------------------
eduardo_xavier           eduardo_xavier@edu.lab
exavier                  exavier@edu.lab
Discovery Search Mailbox DiscoverySearchMailbox{D919BA05-46A6-415f-80AD-7E09334BB852}@edu.lab
User1                    user1@edu.lab
User2                    user2@edu.lab
User7                    User7@edu.lab
User8                    User8@edu.lab
Pele                     pele@edu.lab
Zico                     zico@edu.lab
Ronaldo                  ronaldo@edu.lab
Romario                  romario@edu.lab
Rivaldo                  rivaldo@edu.lab
Ronaldinho               ronaldinho@edu.lab
Kaka                     kaka@edu.lab
Neymar                   neymar@edu.lab
Garrincha                garrincha@edu.lab
Socrates                 socrates@edu.lab
Navarone Feekery         nfeekery@edu.lab

Maybe we can check here whether the user has the attribute "msExchHomeServerName" set before collecting their "mail", something like this (for outlook_server only, since this is an on-prem AD-based deployment):

    if user.get("attributes", {}).get("msExchHomeServerName") is not None:
        ...  # get the user's mail attribute
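
A slightly fuller sketch of that check, assuming each directory entry is a dict with an `attributes` mapping that holds `msExchHomeServerName` and `mail`. The entry shape, the sample values, and the function name `mailbox_addresses` are assumptions for illustration, not the connector's actual data model:

```python
def mailbox_addresses(users):
    """Yield the mail attribute of users that appear to have an Exchange mailbox."""
    for user in users:
        attributes = user.get("attributes") or {}
        # Users without msExchHomeServerName are assumed to have no mailbox.
        if attributes.get("msExchHomeServerName") is None:
            continue
        mail = attributes.get("mail")
        # Skip empty or malformed values instead of passing them on.
        if isinstance(mail, str) and "@" in mail:
            yield mail


# Hypothetical directory entries; the server name value is made up.
users = [
    {"attributes": {"msExchHomeServerName": "/o=Lab/cn=EX01", "mail": "user1@edu.lab"}},
    {"attributes": {"mail": "nomailbox@edu.lab"}},  # no Exchange home server
    {"attributes": {"msExchHomeServerName": "/o=Lab/cn=EX01", "mail": None}},  # missing mail
    {},  # no attributes at all
]
addresses = list(mailbox_addresses(users))
```

The extra `"@"` test is there because the ValueError in the traceback complains that the value "is not an email address", which suggests an empty or malformed `mail` value reached `Account(...)`.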

@seanstory seanstory reopened this Dec 11, 2024
@seanstory
Member

This error is slightly different than the original one, but I think both are showing:

  1. we need to have better logging in place to help identify these edge cases
  2. we need to catch errors for specific users and move past them, rather than crashing on them.
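
Both points could be combined roughly as follows. This is a sketch only, written synchronously for brevity: `build_account` stands in for whatever creates the per-user account object, and the names `iter_accounts`, `fake_build_account`, and the logger name are hypothetical.

```python
import logging

logger = logging.getLogger("outlook_sketch")


def iter_accounts(addresses, build_account):
    """Yield one account per address, logging and skipping addresses that fail."""
    skipped = []
    for address in addresses:
        try:
            account = build_account(address)
        except Exception as exc:  # broad on purpose in this sketch; narrow it in real code
            # Name the offending address so the problem user can be identified.
            logger.warning("Skipping %r: %s: %s", address, type(exc).__name__, exc)
            skipped.append(address)
            continue
        yield account
    if skipped:
        logger.warning("Skipped %d address(es): %s", len(skipped), ", ".join(map(str, skipped)))


def fake_build_account(address):
    """Illustrative factory that rejects values without an '@'."""
    if not isinstance(address, str) or "@" not in address:
        raise ValueError(f"primary_smtp_address {address!r} is not an email address")
    return {"primary_smtp_address": address}


accounts = list(iter_accounts(["user1@edu.lab", None, "pele@edu.lab"], fake_build_account))
```

Only the account construction sits inside the `try`, so errors raised later by the code consuming the generator are not silently swallowed.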

@eddxavier-elastic

In this particular case, if the affected user were identified, we could just add the missing attribute, but I can't tell which user has the issue, at least among the enabled ones. In prod environments we might also see enabled users that for some reason don't have a mailbox and therefore lack the mail attribute.

In any case, I can repro this all day long, so let me know if I can help.

4 participants