Releases: py-pdf/pypdf

Version 5.1.0, 2024-10-27

27 Oct 19:46
5.1.0
9f647e6

What's new

New Features (ENH)

  • Add layout_mode_font_height_weight argument to PageObject.extract_text() (#2920) by @hpierre001

Bug Fixes (BUG)

  • Fix font specifier for FreeText annotation (#2893) by @ssjkamei
  • Fix missing line breaks caused by incorrect calculation of text leading (#2890) by @ssjkamei
  • Improve handling of spaces in text extraction (#2882) by @ssjkamei

Robustness (ROB)

Documentation (DOC)

Developer Experience (DEV)

Maintenance (MAINT)

Testing (TST)

Code Style (STY)

Full Changelog

Version 5.0.1, 2024-09-29

29 Sep 09:55
ab21802

New Features (ENH)

  • Add full parameter to PdfWriter constructor (#2865)

Bug Fixes (BUG)

  • Update pyproject.toml with minimum Python version of 3.8 (#2859)
  • Cope with unbalanced delimiters in dictionary object (#2878)
  • Cope with encoding with too many differences (#2873)
  • Missing spaces in extract_text() method (#1328) (#2868)
  • Tolerate truncated files and no warning when jumping startxref (#2855)

Robustness (ROB)

  • Repair PDF with invalid Root object (#2880)
  • Continue parsing dictionary object when error is detected (#2872)
  • Merge documents with invalid pages in named destinations (#2857)
  • Tolerate comments in arrays (#2856)

Developer Experience (DEV)

  • Use latest Python version for benchmarking (#2879)

Maintenance (MAINT)

  • Add tests to source distributions (#2874)
  • Refactor _update_field_annotation (#2862)

Full Changelog

Version 5.0.0, 2024-09-17

17 Sep 17:29
637bc44

This version drops support for Python 3.7 (not maintained since July 2023), PdfMerger (use PdfWriter instead) and AnnotationBuilder (use annotations instead).

Deprecations (DEP)

  • Remove the deprecated PdfMerger and AnnotationBuilder classes and other deprecations cleanup (#2813)
  • Drop Python 3.7 support (#2793)

New Features (ENH)

  • Add capability to remove /Info from PDF (#2820)
  • Add incremental capability to PdfWriter (#2811)
  • Add UniGB-UTF16 encodings (#2819)
  • Accept utf strings for metadata (#2802)
  • Report PdfReadError instead of RecursionError (#2800)
  • Compress PDF files merging identical objects (#2795)

Bug Fixes (BUG)

  • Fix sheared image (#2801)

Robustness (ROB)

  • Robustify .set_data() (#2821)
  • Raise PdfReadError when missing /Root in trailer (#2808)
  • Fix extract_text() issues on damaged PDFs (#2760)
  • Handle images with empty data when processing an image from bytes (#2786)

Developer Experience (DEV)

  • Fix coverage uploads (#2832)
  • Test against Python 3.13 (#2776)

Full Changelog

Version 4.3.1, 2024-07-21

21 Jul 19:35
4.3.1
8f62120

Bug Fixes (BUG)

  • Cope with Matrix entry in field annotations (#2736)

Robustness (ROB)

  • Cope with fields with upside down box/rectangle (#2729)

Maintenance (MAINT)

  • Add deprecate_with_replacement to StreamObject.initializeFromD… (#2728)
  • Deal with cryptography>=43 moving ARC4 (#2765)

Full Changelog

Version 4.3.0, 2024-07-14

14 Jul 19:51
d3ef5e5

What's new

New Features (ENH)

Bug Fixes (BUG)

Documentation (DOC)

  • Various improvements on docstrings and examples by @j-t-1

Robustness (ROB)

Maintenance (MAINT)

  • Deprecate interiour_color with replacement interior_color (#2706) by @j-t-1
  • Add deprecate_with_replacement to PdfWriter.find_bookmark (#2674) by @j-t-1

Code Style (STY)

  • Change Link to be a non-markup annotation (#2714) by @j-t-1

Full Changelog

Version 4.2.0, 2024-04-07

07 Apr 15:38
4.2.0
2ac88e6

What's new

New Features (ENH)

Bug Fixes (BUG)

Robustness (ROB)

Documentation (DOC)

Developer Experience (DEV)

Maintenance (MAINT)

Testing (TST)

Full Changelog

Version 4.1.0, 2024-03-03

03 Mar 11:50
4.1.0
6cf47c5

What's new

Generating name objects (NameObject) without a leading slash is considered deprecated now. Previously, just a plain warning would be logged, leading to possibly invalid PDF files. According to our deprecation policy, this will log a DeprecationWarning for now.

New Features (ENH)

Bug Fixes (BUG)

Robustness (ROB)

Documentation (DOC)

Developer Experience (DEV)

Maintenance (MAINT)

Testing (TST)

Full Changelog

Version 4.0.2, 2024-02-18

18 Feb 15:45
4.0.2
cc306ad

What's new

Bug Fixes (BUG)

Documentation (DOC)

Developer Experience (DEV)

Testing (TST)

Full Changelog

Version 4.0.1, 2024-01-28

28 Jan 15:08
4.0.1
7579329

What's new

Bug Fixes (BUG)

Testing (TST)

Full Changelog

Version 4.0.0, 2024-01-19

19 Jan 13:28
4.0.0
26b9a97

What's new

pypdf==4.0.0 is a big milestone forward:

  • We finally have a layout-mode text extraction. This enables users who want to detect / extract tables with heuristics to give it a try.
  • We deprecated a lot of the old PyPDF2 API that was either not following PEP8 naming styles or was not using a property. Users coming from PyPDF2 might want to switch to pypdf<4.0.0 first to get helpful error messages that show the new API in their specific cases.

A big 'Thank you!' to the whole pypdf community for your work. Thanks to you, pypdf is better than ever.

Kudos to @shartzog who added the layout-mode with his first contribution!

Deprecations (DEP)

New Features (ENH)

Bug Fixes (BUG)

Documentation (DOC)

Developer Experience (DEV)

Maintenance (MAINT)

Testing (TST)

Code Style (STY)

Full Changelog