
Conversation

dependabot[bot]
Contributor

@dependabot dependabot bot commented on behalf of github Aug 27, 2025

Bumps unstructured from 0.10.27 to 0.18.14.

Release notes

Sourced from unstructured's releases.

0.18.14

Enhancements

  • Speed up function sentence_count by 59% (codeflash)

  • Speed up function check_for_nltk_package by 111% (codeflash)

  • Speed up function under_non_alpha_ratio by 76% (codeflash)

Features

Fixes

0.18.13

Fixes

  • Parse a wider variety of date formats in email headers: The partition_email function is now more robust to non-standard date formats, including ISO-8601 dates with "Z" suffixes. This prevents ValueError exceptions when partitioning emails with these date formats.

0.18.12

What's Changed

  • Prevent large file content in encoding exceptions: Replace UnicodeDecodeError with UnprocessableEntityError in encoding detection to avoid storing entire file content in exception objects, which can cause issues in logging and error reporting systems when processing large files.

Full Changelog: Unstructured-IO/unstructured@0.18.11...0.18.12

0.18.11

What's Changed

Full Changelog: Unstructured-IO/unstructured@0.18.10...0.18.11

0.18.10

Enhancements

Features

  • Add OCR_AGENT_CACHE_SIZE environment variable: Added configurable cache size for OCR agents to control memory usage.
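
A minimal sketch of how an environment variable such as OCR_AGENT_CACHE_SIZE could bound a cache of loaded agents. This is not unstructured's implementation: `read_cache_size`, `make_cached_loader`, and the fallback value of 1 are hypothetical names and choices used only for illustration.

```python
import os
from functools import lru_cache


def read_cache_size(default: int = 1) -> int:
    """Read OCR_AGENT_CACHE_SIZE, falling back to `default` (an assumed value)."""
    raw = os.environ.get("OCR_AGENT_CACHE_SIZE", "")
    try:
        size = int(raw)
    except ValueError:
        # Unset or non-numeric values fall back to the default.
        return default
    return size if size > 0 else default


def make_cached_loader(loader, size: int):
    """Wrap a hypothetical agent loader in an LRU cache holding at most `size` entries."""
    return lru_cache(maxsize=size)(loader)
```

A smaller cache size keeps fewer loaded agents alive at once, trading repeated loads for lower memory usage.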

... (truncated)

Changelog

Sourced from unstructured's changelog.

0.18.14

Enhancements

  • Speed up function sentence_count by 59% (codeflash)

  • Speed up function check_for_nltk_package by 111% (codeflash)

  • Speed up function under_non_alpha_ratio by 76% (codeflash)

Features

Fixes

0.18.13

Enhancements

Features

Fixes

  • Parse a wider variety of date formats in email headers: The partition_email function is now more robust to non-standard date formats, including ISO-8601 dates with "Z" suffixes. This prevents ValueError exceptions when partitioning emails with these date formats.
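
To illustrate the kind of fallback this fix describes, here is a standard-library sketch that tries the RFC 2822 form first and then ISO-8601 with a "Z" suffix. It is not the code in partition_email; `parse_email_date` is a hypothetical helper.

```python
from datetime import datetime
from email.utils import parsedate_to_datetime


def parse_email_date(value: str):
    """Return a datetime for an RFC 2822 or ISO-8601 date string, or None."""
    text = value.strip()
    try:
        # Standard email form, e.g. "Wed, 27 Aug 2025 10:15:00 +0000".
        return parsedate_to_datetime(text)
    except (TypeError, ValueError):
        # Older Python versions raise TypeError, newer ones ValueError.
        pass
    # datetime.fromisoformat rejects a trailing "Z" before Python 3.11,
    # so rewrite it as an explicit UTC offset.
    if text.endswith(("Z", "z")):
        text = text[:-1] + "+00:00"
    try:
        return datetime.fromisoformat(text)
    except ValueError:
        return None
```

Returning None rather than raising lets the caller keep partitioning an email whose Date header cannot be interpreted.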

0.18.12

Enhancements

Features

Fixes

  • Prevent large file content in encoding exceptions: Replace UnicodeDecodeError with UnprocessableEntityError in encoding detection to avoid storing entire file content in exception objects, which can cause issues in logging and error reporting systems when processing large files.
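
The underlying issue is that a Python UnicodeDecodeError keeps the full input bytes in its `object` attribute. The sketch below shows one way to surface a small error instead. The `UnprocessableEntityError` class defined here is a stand-in for the real one in unstructured, and `decode_or_raise` is a hypothetical helper.

```python
class UnprocessableEntityError(Exception):
    """Stand-in for illustration; the real class is defined by unstructured."""


def decode_or_raise(data: bytes, encoding: str = "utf-8") -> str:
    """Decode `data`, raising a compact error that does not carry the payload."""
    message = None
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        # exc.object references the whole input; keep only small scalar details.
        message = f"cannot decode as {exc.encoding}: {exc.reason} at byte {exc.start}"
    # Raising outside the except block leaves __context__ unset, so the
    # original exception (and the bytes it holds) is not chained to this one.
    raise UnprocessableEntityError(message)
```

Note that `raise ... from None` inside the except block would only hide the original exception in tracebacks; it would still be reachable through `__context__`.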

0.18.11

Enhancements

  • Standardized on charset-normalizer library for encoding detection: Previously we had both chardet and charset-normalizer as dependencies. We are dropping chardet and only using charset-normalizer.

Features

  • Type-aware <input> mapping in HTML transformations: Bare <input> elements are now classified by their type attribute (checkbox → Checkbox, radio → RadioButton, others → FormFieldValue).
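
The mapping described above can be sketched with the standard-library HTML parser. Only the three class names come from the changelog; `classify_input`, `InputClassifier`, and `classify_inputs` are hypothetical helpers, and the real transformation in unstructured may work differently.

```python
from html.parser import HTMLParser

# checkbox -> Checkbox, radio -> RadioButton, anything else -> FormFieldValue
INPUT_TYPE_TO_CLASS = {"checkbox": "Checkbox", "radio": "RadioButton"}


def classify_input(type_attr):
    """Map an <input> type attribute (possibly missing) to an element class name."""
    key = (type_attr or "").strip().lower()
    return INPUT_TYPE_TO_CLASS.get(key, "FormFieldValue")


class InputClassifier(HTMLParser):
    """Collect the class name for every <input> element encountered."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        if tag == "input":
            self.found.append(classify_input(dict(attrs).get("type")))


def classify_inputs(html: str):
    parser = InputClassifier()
    parser.feed(html)
    parser.close()
    return parser.found
```

An <input> with no type attribute falls into the "others" branch in this sketch, which is an assumption about how missing types are treated.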

... (truncated)

Commits
  • fed8942 manual fix for open CVEs (#4085)
  • 51425dd ⚡️ Speed up function sentence_count by 59% (#4080)
  • 57cadf8 ⚡️ Speed up function check_for_nltk_package by 111% (#4081)
  • cc635c9 ⚡️ Speed up function under_non_alpha_ratio by 76% (#4079)
  • 76d7a5c Chore/change lang detection logging level to avoid warning log spamming (#4078)
  • 0d20f6a email date format flexibility (#4072)
  • b8c14a7 fix: replace UnicodeDecodeError to prevent large payload logging (#4071)
  • 591729c bump version and release (#4070)
  • d83df42 chore: switch to charset normalizer (#4060)
  • 5368197 feat: map <input> tags by type + add coverage (#4068)
  • Additional commits viewable in compare view


Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot merge will merge this PR after your CI passes on it
  • @dependabot squash and merge will squash and merge this PR after your CI passes on it
  • @dependabot cancel merge will cancel a previously requested merge and block automerging
  • @dependabot reopen will reopen this PR if it is closed
  • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • @dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

@dependabot dependabot bot added the chore label Aug 27, 2025
@github-actions github-actions bot added the dependencies Pull requests that update a dependency file label Aug 27, 2025

👋 Greetings, Airbyte Team Member!

Here are some helpful tips and reminders for your convenience.

Testing This CDK Version

You can test this version of the CDK using the following:

# Run the CLI from this branch:
uvx 'git+https://github.com/airbytehq/airbyte-python-cdk.git@dependabot/pip/unstructured-0.18.14#egg=airbyte-python-cdk[dev]' --help

# Update a connector to use the CDK from this branch ref:
cd airbyte-integrations/connectors/source-example
poe use-cdk-branch dependabot/pip/unstructured-0.18.14

Helpful Resources

PR Slash Commands

Airbyte Maintainers can execute the following slash commands on your PR:

  • /autofix - Fixes most formatting and linting issues
  • /poetry-lock - Updates poetry.lock file
  • /test - Runs connector tests with the updated CDK
  • /poe build - Regenerates git-committed build artifacts, such as the Pydantic models generated from the manifest JSON schema in YAML.
  • /poe <command> - Runs any poe command in the CDK environment


@dependabot dependabot bot force-pushed the dependabot/pip/unstructured-0.18.14 branch 4 times, most recently from d4733b3 to f363879 Compare August 29, 2025 22:56
@dependabot dependabot bot force-pushed the dependabot/pip/unstructured-0.18.14 branch from f363879 to 9d6e594 Compare September 10, 2025 18:40
Bumps [unstructured](https://github.com/Unstructured-IO/unstructured) from 0.10.27 to 0.18.14.
- [Release notes](https://github.com/Unstructured-IO/unstructured/releases)
- [Changelog](https://github.com/Unstructured-IO/unstructured/blob/main/CHANGELOG.md)
- [Commits](Unstructured-IO/unstructured@0.10.27...unstructured_0.18.14)

---
updated-dependencies:
- dependency-name: unstructured
  dependency-version: 0.18.14
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
@dependabot dependabot bot force-pushed the dependabot/pip/unstructured-0.18.14 branch from 9d6e594 to 99f251f Compare September 17, 2025 07:12
Contributor Author

dependabot bot commented on behalf of github Sep 17, 2025

Superseded by #767.

@dependabot dependabot bot closed this Sep 17, 2025
@dependabot dependabot bot deleted the dependabot/pip/unstructured-0.18.14 branch September 17, 2025 16:18

Labels

chore dependencies Pull requests that update a dependency file
