[BUG] Panic on extracting images #566

Open
crashingfish opened this issue Sep 12, 2024 · 7 comments
Labels
bug Something isn't working

Comments

@crashingfish

Description

Calling ExtractPageImages triggers a panic. It does not happen on every call, only rarely.

Expected Behavior

ExtractPageImages should not panic. It should return either an error or the page images.

Actual Behavior

Calling ExtractPageImages with an empty ImageExtractOptions panics with an index-out-of-range runtime error instead of returning.

Attachments

I cannot share the PDF or its images due to compliance restrictions. Code to reproduce the issue:

    page, _ := pdfReader.GetPage(1)
    pextract, _ := extractor.New(page)
    pimages, err := pextract.ExtractPageImages(&extractor.ImageExtractOptions{})
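Until the root cause is fixed, a caller can keep the process alive by converting the panic into an error at the call site. The following is a minimal sketch, not part of unipdf: `recoverToError` is a hypothetical helper, and the slice access in `main` is only a stand-in for the failing decode so that the example runs without a PDF.

```go
package main

import "fmt"

// recoverToError is a hypothetical helper: it runs fn and turns any panic
// raised inside it into an ordinary error.
func recoverToError(fn func() error) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered from panic: %v", r)
		}
	}()
	return fn()
}

func main() {
	// Stand-in for the failing call: index a slice past its end,
	// using the same numbers as the reported panic.
	symbols := make([]int, 1032)
	idx := 1256

	err := recoverToError(func() error {
		_ = symbols[idx] // panics with an index-out-of-range runtime error
		return nil
	})
	fmt.Println("error:", err)
}
```

In the reproduction code above, the closure would wrap the `pextract.ExtractPageImages(&extractor.ImageExtractOptions{})` call and return its `err`. This only works because the stack trace shows the whole call chain running in a single goroutine; `recover` cannot catch a panic raised in a different goroutine.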

Stack trace

`
panic: runtime error: index out of range [1256] with length 1032

goroutine 8 [running]:
github.com/unidoc/unipdf/v3/internal/jbig2/document/segments.(*TextRegion).decodeIb(0x14002156000, 0x0?, 0x4e8)
/path/go/pkg/mod/github.com/unidoc/unipdf/[email protected]/internal/jbig2/document/segments/segments.go:320 +0x734
github.com/unidoc/unipdf/v3/internal/jbig2/document/segments.(*TextRegion).decodeSymbolInstances(0x14002156000)
/path/go/pkg/mod/github.com/unidoc/unipdf/[email protected]/internal/jbig2/document/segments/segments.go:77 +0x1dc
github.com/unidoc/unipdf/v3/internal/jbig2/document/segments.(*TextRegion).GetRegionBitmap(0x14002156000)
/path/go/pkg/mod/github.com/unidoc/unipdf/[email protected]/internal/jbig2/document/segments/segments.go:283 +0x13c
github.com/unidoc/unipdf/v3/internal/jbig2/document.(*Page).createNormalPage(0x140003b29a0, 0x140003623c0)
/path/go/pkg/mod/github.com/unidoc/unipdf/[email protected]/internal/jbig2/document/document.go:79 +0x220
github.com/unidoc/unipdf/v3/internal/jbig2/document.(*Page).createPage(0x14000430280?, 0x1?)
/path/go/pkg/mod/github.com/unidoc/unipdf/[email protected]/internal/jbig2/document/document.go:70 +0x38
github.com/unidoc/unipdf/v3/internal/jbig2/document.(*Page).composePageBitmap(0x140003b29a0)
/path/go/pkg/mod/github.com/unidoc/unipdf/[email protected]/internal/jbig2/document/document.go:115 +0x4c
github.com/unidoc/unipdf/v3/internal/jbig2/document.(*Page).GetBitmap(0x140003b29a0)
/path/go/pkg/mod/github.com/unidoc/unipdf/[email protected]/internal/jbig2/document/document.go:112 +0xe8
github.com/unidoc/unipdf/v3/internal/jbig2/decoder.(*Decoder).decodePage(0x140007a10e0, 0x30e9?)
/path/go/pkg/mod/github.com/unidoc/unipdf/[email protected]/internal/jbig2/decoder/decoder.go:21 +0x1d0
github.com/unidoc/unipdf/v3/internal/jbig2/decoder.(*Decoder).DecodeNextPage(...)
/path/go/pkg/mod/github.com/unidoc/unipdf/[email protected]/internal/jbig2/decoder/decoder.go:19
github.com/unidoc/unipdf/v3/internal/jbig2.DecodeBytes({0x140003d5500, 0x30e9, 0x30e9}, {0x0?, 0x14000018c18?}, {0x14000018c10?, 0x10?, 0x103462140?})
/path/go/pkg/mod/github.com/unidoc/unipdf/[email protected]/internal/jbig2/jbig2.go:16 +0x98
github.com/unidoc/unipdf/v3/core.(*JBIG2Encoder).DecodeBytes(...)
/path/go/pkg/mod/github.com/unidoc/unipdf/[email protected]/core/core.go:1345
github.com/unidoc/unipdf/v3/core.(*JBIG2Encoder).DecodeStream(0x140000be370?, 0xf?)
/path/go/pkg/mod/github.com/unidoc/unipdf/[email protected]/core/core.go:1732 +0x50
github.com/unidoc/unipdf/v3/core.DecodeStream(0x140000b0000?)
/path/go/pkg/mod/github.com/unidoc/unipdf/[email protected]/core/core.go:91 +0x198
github.com/unidoc/unipdf/v3/model.(*XObjectImage).ToImage(0x14000244580)
/path/go/pkg/mod/github.com/unidoc/unipdf/[email protected]/model/model.go:4153 +0x130
github.com/unidoc/unipdf/v3/extractor._aegba({0x10357f018?, 0x140000be370?}, {0x10357ba68, 0x14002154c74})
/path/go/pkg/mod/github.com/unidoc/unipdf/[email protected]/extractor/extractor.go:508 +0x3c
github.com/unidoc/unipdf/v3/extractor.(*imageExtractContext).extractXObjectImage(0x1400007ee60, 0x1400040a7e0, {{0x103586360, 0x103dcd060}, {0x103586360, 0x103dcd060}, {0x1034ae220, 0x1400086a1d8}, {0x1034ae220, 0x1400086a1e0}, ...}, ...)
/path/go/pkg/mod/github.com/unidoc/unipdf/[email protected]/extractor/extractor.go:788 +0x188
github.com/unidoc/unipdf/v3/extractor.(*imageExtractContext).processOperand(0x1400007ee60, 0x14000019268?, {{0x103586360, 0x103dcd060}, {0x103586360, 0x103dcd060}, {0x1034ae220, 0x1400086a1d8}, {0x1034ae220, 0x1400086a1e0}, ...}, ...)
/path/go/pkg/mod/github.com/unidoc/unipdf/[email protected]/extractor/extractor.go:125 +0x20c
github.com/unidoc/unipdf/v3/contentstream.(*ContentStreamProcessor).Process(0x14000019670, 0x3c?)
/path/go/pkg/mod/github.com/unidoc/unipdf/[email protected]/contentstream/contentstream.go:623 +0xa84
github.com/unidoc/unipdf/v3/extractor.(*imageExtractContext).extractContentStreamImages(0x1400007ee60, {0x14000414100?, 0x1?}, 0x1034aed20?)
/path/go/pkg/mod/github.com/unidoc/unipdf/[email protected]/extractor/extractor.go:912 +0x208
github.com/unidoc/unipdf/v3/extractor.(*Extractor).ExtractPageImages(0x14000430000, 0x1400086a038)
/path/go/pkg/mod/github.com/unidoc/unipdf/[email protected]/extractor/extractor.go:223 +0x68
`

@3ace

3ace commented Sep 12, 2024

@crashingfish thanks for reporting this. Is it possible for you to attach a sample PDF document where this issue is happening?

@3ace 3ace added the bug Something isn't working label Sep 12, 2024
@crashingfish
Author

@3ace Sorry, I would have shared it already if it were allowed. However, I am happy to provide any other information about the PDF that may help you reconstruct a similar file on your end.

@3ace

3ace commented Sep 12, 2024

@crashingfish based on the log you provided, what we can gather so far is that the PDF file contains an image encoded with JBIG2, but we have not been able to replicate the issue yet.

Is it possible for you to extract the image using the same encoding and send it to us so that we might be able to reconstruct a similar file?

@crashingfish
Author

@3ace I will need to check what that image contains and whether it is allowed to be shared. I will get back to you.

@crashingfish
Author

@3ace I am still checking on this. Meanwhile, would you be able to share your official email address?

@3ace

3ace commented Sep 14, 2024

@crashingfish you can reach us through [email protected]

@3ace

3ace commented Sep 26, 2024

@crashingfish Hi, we haven’t received an email from you. Could you please confirm if you’ve sent it to us?

@unidoc unidoc deleted a comment Oct 24, 2024