Releases · py-pdf/pypdf

27 Oct 19:46

github-actions

5.1.0

9f647e6

Version 5.1.0, 2024-10-27 Latest

What's new

New Features (ENH)

Add layout_mode_font_height_weight argument to PageObject.extract_text() (#2920) by @hpierre001

Bug Fixes (BUG)

Fix font specifier for FreeText annotation (#2893) by @ssjkamei
Line breaks are not generated due to incorrect calculation of text leading (#2890) by @ssjkamei
Improve handling of spaces in text extraction (#2882) by @ssjkamei

Robustness (ROB)

Soft failure for flate encode image mode 1 with wrong LUT size (#2900) by @stefan6419846

Documentation (DOC)

Use latest package versions (#2907) by @stefan6419846
Correct example of reading FileAttachment annotation (#2906) by @j-t-1

Developer Experience (DEV)

Update pinned requirements (#2918) by @stefan6419846
Make make_release.py compatible with Windows environment (#2894) by @pubpub-zz

Maintenance (MAINT)

Remove references to outdated Python versions (#2919) by @stefan6419846
Generalize the method of obtaining space_code (#2891) by @ssjkamei
Unnecessary character mapping process (#2888) by @ssjkamei
New LZW decoding implementation (#2887) by @MartinThoma

Testing (TST)

Add LzwCodec for encoding (#2883) by @MartinThoma

Code Style (STY)

Capitalize error messages (#2903) by @j-t-1
Modify error messages in PdfWriter (#2902) by @j-t-1

Contributors

MartinThoma, pubpub-zz, and 4 other contributors

29 Sep 09:55

pubpub-zz

5.0.1

ab21802

Version 5.0.1, 2024-09-29

New Features (ENH)

Add full parameter to PdfWriter constructor (#2865)

Bug Fixes (BUG)

Update pyproject.toml with minimum Python version of 3.8 (#2859)
Cope with unbalanced delimiters in dictionary object (#2878)
Cope with encoding with too many differences (#2873)
Missing spaces in extract_text() method (#1328) (#2868)
Tolerate truncated files and no warning when jumping startxref (#2855)

Robustness (ROB)

Repair PDF with invalid Root object (#2880)
Continue parsing dictionary object when error is detected (#2872)
Merge documents with invalid pages in named destinations (#2857)
Tolerate comments in arrays (#2856)

Developer Experience (DEV)

Use latest Python version for benchmarking (#2879)

Maintenance (MAINT)

Add tests to source distributions (#2874)
Refactor _update_field_annotation (#2862)

17 Sep 17:29

pubpub-zz

5.0.0

637bc44

Version 5.0.0, 2024-09-17

This version drops support for Python 3.7 (not maintained since July 2023), PdfMerger (use PdfWriter instead) and AnnotationBuilder (use annotations instead).

Deprecations (DEP)

Remove the deprecated PdfMerger and AnnotationBuilder classes and other deprecations cleanup (#2813)
Drop Python 3.7 support (#2793)

New Features (ENH)

Add capability to remove /Info from PDF (#2820)
Add incremental capability to PdfWriter (#2811)
Add UniGB-UTF16 encodings (#2819)
Accept utf strings for metadata (#2802)
Report PdfReadError instead of RecursionError (#2800)
Compress PDF files merging identical objects (#2795)

Bug Fixes (BUG)

Fix sheared image (#2801)

Robustness (ROB)

Robustify .set_data() (#2821)
Raise PdfReadError when missing /Root in trailer (#2808)
Fix extract_text() issues on damaged PDFs (#2760)
Handle images with empty data when processing an image from bytes (#2786)

Developer Experience (DEV)

Fix coverage uploads (#2832)
Test against Python 3.13 (#2776)

21 Jul 19:35

github-actions

4.3.1

8f62120

Version 4.3.1, 2024-07-21

Bug Fixes (BUG)

Cope with Matrix entry in field annotations (#2736)

Robustness (ROB)

Cope with fields with upside down box/rectangle (#2729)

Maintenance (MAINT)

Add deprecate_with_replacement to StreamObject.initializeFromD… (#2728)
Deal with cryptography>=43 moving ARC4 (#2765)

14 Jul 19:51

github-actions

4.3.0

d3ef5e5

Version 4.3.0, 2024-07-14

What's new

New Features (ENH)

Accept ETen-B5 and UniCNS-UTF16 encodings (#2721) by @pubpub-zz
Add decode_as_image() to ContentStreams (#2615) by @pubpub-zz
Add context manager for PdfReader (#2666) by @tibor-reiss
Add capability to set font and size in fields (#2636) by @pubpub-zz
Allow to pass input file without named argument (#2576) by @pubpub-zz

Bug Fixes (BUG)

Fix deprecation for Ressources when using old constants (#2705) by @stefan6419846
Fix images issue 4 bits encoding and LUT starting with UTF16_BOM (#2675) by @pubpub-zz
Reading large compressed images takes huge time to process (#2644) by @snanda85
Highlighted Text Cannot Be Printed (#2604) by @Nifury
Fix UnboundLocalError on malformed pdf (#2619) by @farjasju

Documentation (DOC)

Various improvements on docstrings and examples by @j-t-1

Robustness (ROB)

Cope with missing Standard 14 fonts in fields (#2677) by @pubpub-zz
Improve inline image extraction (#2622) by @pubpub-zz
Cope with loops in Fields tree (#2656) by @pubpub-zz
Discard /I in choice fields for compatibility with Acrobat (#2614) by @pubpub-zz
Cope with some issues in pillow (#2595) by @pubpub-zz
Cope with some image extraction issues (#2591) by @pubpub-zz

Maintenance (MAINT)

Deprecate interiour_color with replacement interior_color (#2706) by @j-t-1
Add deprecate_with_replacement to PdfWriter.find_bookmark (#2674) by @j-t-1

Code Style (STY)

Change Link to be a non-markup annotation (#2714) by @j-t-1

Contributors

snanda85, pubpub-zz, and 5 other contributors

07 Apr 15:38

stefan6419846

4.2.0

2ac88e6

Version 4.2.0, 2024-04-07

What's new

New Features (ENH)

Allow multiple charsets for NameObject.read_from_stream (#2585) by @pubpub-zz
Add support for /Kids in page labels (#2562) by @stefan6419846
Allow to update fields on many pages (#2571) by @pubpub-zz
Tolerate PDF with invalid xref pointed objects (#2335) by @pubpub-zz
Add Enforce from PDF2.0 in viewer_preferences (#2511) by @pubpub-zz
Add += and -= operators to ArrayObject (#2510) by @pubpub-zz

Bug Fixes (BUG)

Fix merge_page sometimes generating unknown operator 'QQ' (#2588) by @rfotino
Fix fields update where annotations are kids of field (#2570) by @pubpub-zz
Process CMYK images without a filter correctly (#2557) by @pubpub-zz
Extract text in layout mode without finding resources (#2555) by @pubpub-zz
Prevent recursive loop in some PDF files (#2505) by @pubpub-zz

Robustness (ROB)

Tolerate "truncated" xref (#2580) by @pubpub-zz
Replace error by warning for EOD in RunLengthDecode/ASCIIHexDecode (#2334) by @pubpub-zz
Rebuild xref table if one entry is invalid (#2528) by @pubpub-zz
Robustify stream extraction (#2526) by @pubpub-zz

Documentation (DOC)

Update release process for latest changes (#2564) by @stefan6419846
Encryption/decryption: Clone document instead of copying all pages (#2546) by @redfast00
Minor improvements (#2542) by @j-t-1
Update annotation list (#2534) by @j-t-1
Update references and formatting (#2529) by @j-t-1
Correct threads reference, plus minor changes (#2521) by @j-t-1
Minor readability increases (#2515) by @j-t-1
Simplify PaperSize examples (#2504) by @j-t-1
Minor improvements (#2501) by @j-t-1

Developer Experience (DEV)

Remove unused dependencies (#2572) by @stefan6419846
Remove page labels PR link from message (#2561) by @stefan6419846
Fix changelog generator regarding whitespace and handling of "Other" group (#2492) by @stefan6419846
Add REL to known PR prefixes (#2554) by @stefan6419846
Release using the REL commit instead of git tag (#2500) by @MartinThoma
Unify code between PdfReader and PdfWriter (#2497) by @pubpub-zz
Bump softprops/action-gh-release from 1 to 2 (#2514) by @dependabot[bot]

Maintenance (MAINT)

Ressources → Resources (and internal name childs) (#2550) by @pubpub-zz
Fix typos found by codespell (#2549) by @stefan6419846
Update Read the Docs configuration (#2538) by @j-t-1
Add root_object, _info and _ID to PdfReader (#2495) by @pubpub-zz

Testing (TST)

Allow loading truncated images if required (#2586) by @stefan6419846
Fix download issues from #2562 (#2578) by @pubpub-zz
Improve test_get_contents_from_nullobject to show real use-case (#2524) by @stefan6419846
Add missing test annotations (#2507) by @stefan6419846

Contributors

MartinThoma, pubpub-zz, and 5 other contributors

03 Mar 11:50

github-actions

4.1.0

6cf47c5

Version 4.1.0, 2024-03-03

What's new

Generating name objects (NameObject) without a leading slash is considered deprecated now. Previously, just a plain warning would be logged, leading to possibly invalid PDF files. According to our deprecation policy, this will log a DeprecationWarning for now.

New Features (ENH)

Add get_pages_from_field (#2494) by @pubpub-zz
Add reattach_fields function (#2480) by @pubpub-zz
Automatic access to pointed object for IndirectObject (#2464) by @pubpub-zz

Bug Fixes (BUG)

Missing error on name without leading / (#2387) by @Rak424
encode_pdfdocencoding() always returns bytes (#2440) by @sbourlon
BI in text content identified as image tag (#2459) by @pubpub-zz

Robustness (ROB)

Missing basefont entry in type 3 font (#2469) by @pubpub-zz

Documentation (DOC)

Amend robustness documentation (#2479) by @j-t-1

Developer Experience (DEV)

Fix changelog for UTF-8 characters (#2462) by @stefan6419846

Maintenance (MAINT)

Add _get_page_number_from_indirect in writer (#2493) by @pubpub-zz
Remove user assignment for feature requests (#2483) by @stefan6419846
Remove reference to old 2.0.0 branch (#2482) by @stefan6419846

Testing (TST)

Fix benchmark failures (#2481) by @stefan6419846
Resolve file naming conflict in test_iss1767 (#2445) by @sbourlon

Contributors

pubpub-zz, sbourlon, and 3 other contributors

18 Feb 15:45

github-actions

4.0.2

cc306ad

Version 4.0.2, 2024-02-18

What's new

Bug Fixes (BUG)

Use NumberObject for /Border elements of annotations (#2451) by @rsinger417

Documentation (DOC)

Document easier way to update metadata (#2454) by @stefan6419846
Typo Polyline → PolyLine in adding-pdf-annotations.md (#2426) by @CWKSC

Developer Experience (DEV)

Bump codecov/codecov-action from 3 to 4 (#2430) by @dependabot[bot]

Testing (TST)

Avoid catching not emitted warnings (#2429) by @stefan6419846

Contributors

dependabot, CWKSC, and 2 other contributors

28 Jan 15:08

github-actions

4.0.1

7579329

Version 4.0.1, 2024-01-28

What's new

Bug Fixes (BUG)

Fix ZeroDivisionError in layout mode text extraction (#2417) by @shartzog

Testing (TST)

Skip tests using fpdf2 if it's not installed (#2419) by @MartinThoma

Contributors

MartinThoma and shartzog

19 Jan 13:28

github-actions

4.0.0

26b9a97

Version 4.0.0, 2024-01-19

What's new

pypdf==4.0.0 is a big milestone forward:

We finally have a layout-mode text extraction. This enables users who want to detect / extract tables with heuristics to give it a try.
We deprecated a lot of the old PyPDF2 API that was either not following PEP8 naming styles or was not using a property. Users coming from PyPDF2 might want to switch to pypdf<4.0.0 first to get helpful error messages that show the new API in their specific cases.

A big 'Thank you!' to the whole pypdf community for your work. Thanks to you, pypdf is better than ever.

Kudos to @shartzog who added the layout-mode with his first contribution!

Deprecations (DEP)

Drop Python 3.6 support (#2369) by @MartinThoma
Remove deprecated code (#2367) by @MartinThoma
Remove deprecated XMP properties (#2386) by @stefan6419846

New Features (ENH)

Add "layout" mode for text extraction (#2388) by @shartzog
Add Jupyter Notebook integration for PdfReader (#2375) by @MartinThoma
Improve/rewrite PDF permission retrieval (#2400) by @stefan6419846

Bug Fixes (BUG)

PdfWriter.add_uri was setting the wrong type (#2406) by @pmiller66
Add support for GBK2K cmaps (#2385) by @stefan6419846

Documentation (DOC)

Add pmiller66 for #2406 as a contributor by @MartinThoma
Add missing expand parameter (#2393) by @Atomnp
Resolve build warnings (#2380) by @stefan6419846
Fix testing prerequisites (#2381) by @stefan6419846
Improve formatting of contributors page (#2383) by @stefan6419846
Add Tobeabellwether as a contributor for #2341 by @MartinThoma

Developer Experience (DEV)

Make dependabot aware of our PR prefixes (#2415) by @stefan6419846
Fail on Sphinx issues (#2405) by @stefan6419846
Move title check to own workflow (#2384) by @MasterOdin
Write to temporary files instead of the working directory (#2379) by @stefan6419846
Ensure that the PR titles have the correct format (#2378) by @stefan6419846

Maintenance (MAINT)

Return None instead of -1 when page is not attached (#2376) by @MartinThoma
Complete FileSpecificationDictionaryEntries constants (#2416) by @MartinThoma
Replace warning with logging.error (#2377) by @MartinThoma

Testing (TST)

Add missing pytest.mark.samples annotations (#2412) by @kitterma
Correctly close temporary files (#2396) by @stefan6419846
Fix side effect #2379 (#2395) by @pubpub-zz
Add test for layout extraction mode (#2390) by @MartinThoma

Code Style (STY)

Use the UserAccessPermissions enum (#2398) by @MartinThoma
Run black (#2370) by @MartinThoma

Contributors

MartinThoma, MasterOdin, and 6 other contributors
