[BUG] Panic on extracting images #566

crashingfish · 2024-09-12T01:38:28Z

Description

ExtractPageImages occasionally panics. It does not happen on every call, only rarely.

Expected Behavior

ExtractPageImages should not panic. It should return either an error or the page images.
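
Until the library returns an error here, a caller can turn the panic into an error with `recover`. This is a minimal sketch, not part of unipdf: `safeExtract` is a hypothetical helper, and the closure passed to it stands in for the code that calls ExtractPageImages. The panic is simulated with a plain slice access that mirrors the reported index and length.

```go
package main

import (
	"fmt"
)

// safeExtract runs fn and converts any panic raised inside it into an error.
// It is a hypothetical wrapper; fn stands in for a closure that calls
// ExtractPageImages.
func safeExtract(fn func() error) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("image extraction panicked: %v", r)
		}
	}()
	return fn()
}

func main() {
	// Simulate the reported failure: an index past the end of a slice.
	symbols := make([]int, 1032)
	idx := 1256
	err := safeExtract(func() error {
		_ = symbols[idx] // runtime panic: index out of range
		return nil
	})
	fmt.Println(err)
}
```

This only keeps the calling process alive. The page that triggers the panic still yields no images.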

Actual Behavior

Calling ExtractPageImages with an empty ImageExtractOptions panics.

Attachments

I cannot share the PDF images due to compliance restrictions. Code to reproduce the issue:

page, _ := pdfReader.GetPage(1)
pextract, _ := extractor.New(page)
pimages, err := pextract.ExtractPageImages(&extractor.ImageExtractOptions{})

Stack trace

`
panic: runtime error: index out of range [1256] with length 1032

goroutine 8 [running]:
github.com/unidoc/unipdf/v3/internal/jbig2/document/segments.(*TextRegion).decodeIb(0x14002156000, 0x0?, 0x4e8)
/path/go/pkg/mod/github.com/unidoc/unipdf/[email protected]/internal/jbig2/document/segments/segments.go:320 +0x734
github.com/unidoc/unipdf/v3/internal/jbig2/document/segments.(*TextRegion).decodeSymbolInstances(0x14002156000)
/path/go/pkg/mod/github.com/unidoc/unipdf/[email protected]/internal/jbig2/document/segments/segments.go:77 +0x1dc
github.com/unidoc/unipdf/v3/internal/jbig2/document/segments.(*TextRegion).GetRegionBitmap(0x14002156000)
/path/go/pkg/mod/github.com/unidoc/unipdf/[email protected]/internal/jbig2/document/segments/segments.go:283 +0x13c
github.com/unidoc/unipdf/v3/internal/jbig2/document.(*Page).createNormalPage(0x140003b29a0, 0x140003623c0)
/path/go/pkg/mod/github.com/unidoc/unipdf/[email protected]/internal/jbig2/document/document.go:79 +0x220
github.com/unidoc/unipdf/v3/internal/jbig2/document.(*Page).createPage(0x14000430280?, 0x1?)
/path/go/pkg/mod/github.com/unidoc/unipdf/[email protected]/internal/jbig2/document/document.go:70 +0x38
github.com/unidoc/unipdf/v3/internal/jbig2/document.(*Page).composePageBitmap(0x140003b29a0)
/path/go/pkg/mod/github.com/unidoc/unipdf/[email protected]/internal/jbig2/document/document.go:115 +0x4c
github.com/unidoc/unipdf/v3/internal/jbig2/document.(*Page).GetBitmap(0x140003b29a0)
/path/go/pkg/mod/github.com/unidoc/unipdf/[email protected]/internal/jbig2/document/document.go:112 +0xe8
github.com/unidoc/unipdf/v3/internal/jbig2/decoder.(*Decoder).decodePage(0x140007a10e0, 0x30e9?)
/path/go/pkg/mod/github.com/unidoc/unipdf/[email protected]/internal/jbig2/decoder/decoder.go:21 +0x1d0
github.com/unidoc/unipdf/v3/internal/jbig2/decoder.(*Decoder).DecodeNextPage(...)
/path/go/pkg/mod/github.com/unidoc/unipdf/[email protected]/internal/jbig2/decoder/decoder.go:19
github.com/unidoc/unipdf/v3/internal/jbig2.DecodeBytes({0x140003d5500, 0x30e9, 0x30e9}, {0x0?, 0x14000018c18?}, {0x14000018c10?, 0x10?, 0x103462140?})
/path/go/pkg/mod/github.com/unidoc/unipdf/[email protected]/internal/jbig2/jbig2.go:16 +0x98
github.com/unidoc/unipdf/v3/core.(*JBIG2Encoder).DecodeBytes(...)
/path/go/pkg/mod/github.com/unidoc/unipdf/[email protected]/core/core.go:1345
github.com/unidoc/unipdf/v3/core.(*JBIG2Encoder).DecodeStream(0x140000be370?, 0xf?)
/path/go/pkg/mod/github.com/unidoc/unipdf/[email protected]/core/core.go:1732 +0x50
github.com/unidoc/unipdf/v3/core.DecodeStream(0x140000b0000?)
/path/go/pkg/mod/github.com/unidoc/unipdf/[email protected]/core/core.go:91 +0x198
github.com/unidoc/unipdf/v3/model.(*XObjectImage).ToImage(0x14000244580)
/path/go/pkg/mod/github.com/unidoc/unipdf/[email protected]/model/model.go:4153 +0x130
github.com/unidoc/unipdf/v3/extractor._aegba({0x10357f018?, 0x140000be370?}, {0x10357ba68, 0x14002154c74})
/path/go/pkg/mod/github.com/unidoc/unipdf/[email protected]/extractor/extractor.go:508 +0x3c
github.com/unidoc/unipdf/v3/extractor.(*imageExtractContext).extractXObjectImage(0x1400007ee60, 0x1400040a7e0, {{0x103586360, 0x103dcd060}, {0x103586360, 0x103dcd060}, {0x1034ae220, 0x1400086a1d8}, {0x1034ae220, 0x1400086a1e0}, ...}, ...)
/path/go/pkg/mod/github.com/unidoc/unipdf/[email protected]/extractor/extractor.go:788 +0x188
github.com/unidoc/unipdf/v3/extractor.(*imageExtractContext).processOperand(0x1400007ee60, 0x14000019268?, {{0x103586360, 0x103dcd060}, {0x103586360, 0x103dcd060}, {0x1034ae220, 0x1400086a1d8}, {0x1034ae220, 0x1400086a1e0}, ...}, ...)
/path/go/pkg/mod/github.com/unidoc/unipdf/[email protected]/extractor/extractor.go:125 +0x20c
github.com/unidoc/unipdf/v3/contentstream.(*ContentStreamProcessor).Process(0x14000019670, 0x3c?)
/path/go/pkg/mod/github.com/unidoc/unipdf/[email protected]/contentstream/contentstream.go:623 +0xa84
github.com/unidoc/unipdf/v3/extractor.(*imageExtractContext).extractContentStreamImages(0x1400007ee60, {0x14000414100?, 0x1?}, 0x1034aed20?)
/path/go/pkg/mod/github.com/unidoc/unipdf/[email protected]/extractor/extractor.go:912 +0x208
github.com/unidoc/unipdf/v3/extractor.(*Extractor).ExtractPageImages(0x14000430000, 0x1400086a038)
/path/go/pkg/mod/github.com/unidoc/unipdf/[email protected]/extractor/extractor.go:223 +0x68
`


3ace · 2024-09-12T04:06:48Z

@crashingfish thanks for reporting this. Is it possible for you to attach a sample PDF document where this issue is happening?

crashingfish · 2024-09-12T08:30:18Z

@3ace Sorry, I would have attached it already if that were allowed. However, I am happy to provide any other information about the PDF that may help you reconstruct a similar file on your end.

3ace · 2024-09-12T08:36:08Z

@crashingfish based on the log you provided, what we can gather so far is that the PDF file contains an image encoded with JBIG2, but we have not been able to replicate the issue yet.

Is it possible for you to extract an image that uses the same encoding and send it to us, so that we might be able to reconstruct a similar file?

crashingfish · 2024-09-14T14:42:45Z

@3ace I will need to check what that image contains and if it is allowed to be shared. Will get back.

crashingfish · 2024-09-14T15:06:14Z

@3ace I am still checking on this. Meanwhile, would you be able to share your official email address?

3ace · 2024-09-14T15:54:58Z

@crashingfish you can reach us through [email protected]

3ace · 2024-09-26T12:21:08Z

@crashingfish Hi, we haven’t received an email from you. Could you please confirm if you’ve sent it to us?

3ace added the bug label Sep 12, 2024

unidoc deleted a comment Oct 24, 2024
