Tile or Scanline loading support #35

Open
ryanstout opened this issue Mar 8, 2021 · 6 comments

@ryanstout

Forgive a noob question, I'm fairly new to Rust. Looking at the source, I don't think it is, but figured I would ask: is it possible to access raw image data without fully loading the image into RAM? (For example, loading the image in tiles or by scanline, at least for .nef, .cr2, and .arw.) I'm stacking raw files in different ways, and because of limited RAM I can only hold part of them in memory at once. Any help would be appreciated. If it's not possible, is it something that could be done with an easy modification? (I could probably fund some development if it's something that's not a huge change.) Thanks!

@pedrocr
Owner

pedrocr commented Mar 8, 2021

Rawloader doesn't have any support for it and it's not something that can be done generally. There are formats where the last pixel depends on the values of every other pixel in the image, so decoding always has to process the whole image up to that point. Other formats are simpler and it's possible to go directly to a given block, line or even specific group of 2 or 3 pixels. A simple way to spot this in the code is whether threaded decoding is done, and how. From your examples ARW would be fully possible, CR2 hard and NEF not possible at all.

Depending on what you're doing, that may not be what you want though. If you need RGB values you need to demosaic first. At a guess, it's probably a better idea to do a pre-processing step where you chunk the images into whatever blocks you want, without requiring special decoding, which will always be a corner case that's hard to support. Something like:

  1. Define N processing blocks that are defined as a specific region of the image (line-based perhaps)
  2. Decode every image in sequence, preprocess it freely with normal code and then split it and append to a series of N files the blocks of that image
  3. Process the N files individually stacking and doing whichever calculations you care about producing an output file
  4. Merge the N output files into a single final output image
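
A minimal sketch of the four steps above, under stated assumptions: each decoded image is a flat `&[u16]` buffer in row-major order, blocks are row ranges, and any `Write` target stands in for the N intermediate files. All function names here are hypothetical and not part of rawloader, and a per-pixel mean stands in for whatever stacking calculation is actually needed.

```rust
use std::io::{self, Write};
use std::ops::Range;

/// Step 1: split `height` rows into `n` contiguous row ranges.
fn block_ranges(height: usize, n: usize) -> Vec<Range<usize>> {
    assert!(n > 0, "need at least one block");
    let (base, extra) = (height / n, height % n);
    let mut start = 0;
    (0..n)
        .map(|i| {
            // The first `extra` blocks take one additional row.
            let rows = base + usize::from(i < extra);
            let range = start..start + rows;
            start += rows;
            range
        })
        .collect()
}

/// Step 2: append one decoded image's blocks to the N outputs
/// (little-endian u16 samples; `outs[i]` receives block `i`).
fn append_blocks<W: Write>(
    pixels: &[u16],
    width: usize,
    ranges: &[Range<usize>],
    outs: &mut [W],
) -> io::Result<()> {
    for (range, out) in ranges.iter().zip(outs.iter_mut()) {
        for &p in &pixels[range.start * width..range.end * width] {
            out.write_all(&p.to_le_bytes())?;
        }
    }
    Ok(())
}

/// Step 3: stack every copy of one block found in `bytes`.
/// A per-pixel mean is used here purely as a placeholder operation.
fn stack_block(bytes: &[u8], block_len: usize) -> Vec<u16> {
    if block_len == 0 {
        return Vec::new();
    }
    let mut sums = vec![0u32; block_len];
    let mut images = 0u32;
    for block in bytes.chunks_exact(block_len * 2) {
        for (sum, px) in sums.iter_mut().zip(block.chunks_exact(2)) {
            *sum += u32::from(u16::from_le_bytes([px[0], px[1]]));
        }
        images += 1;
    }
    sums.iter().map(|&s| (s / images.max(1)) as u16).collect()
}

/// Steps 3 and 4: stack each block independently, then concatenate
/// the per-block results into the final image.
fn stack_and_merge(ranges: &[Range<usize>], width: usize, files: &[Vec<u8>]) -> Vec<u16> {
    ranges
        .iter()
        .zip(files)
        .flat_map(|(range, bytes)| stack_block(bytes, range.len() * width))
        .collect()
}
```

Only one block's worth of data from every image is held in memory during step 3, which is the point of the scheme; the cost is the extra write and read of the intermediate data.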

I'm curious about what kind of stacking you're doing. Astrophotography? imagepipe takes the output of rawloader and creates RGB images. That may be useful to you too:

https://github.com/pedrocr/imagepipe

Stop by #chimper on irc.freenode.org if you want to chat about this or other raw decoding/processing stuff.

@ryanstout
Author

Thanks so much for the info. Currently I load up each image, then write out tiles to disk and then read those tiles in during the stacking (similar to what you described). We're doing exposure and focus stacking, and a few others. For us, we don't need to load scanlines or whatever out of order; it's really just about not having the whole image in RAM at once. (So we could, for example, take 3 images and process them top to bottom in sets of scanlines, at the DCT boundaries or something.) Is that something that would be possible? I'm assuming you were referring to Huffman coding when you said "the last pixel depends on the values of every other pixel", or is there another limitation? Thanks!

@pedrocr
Owner

pedrocr commented Mar 8, 2021

It's not impossible, but it would be hacky. Example for the formats you listed in increasing scale of the size of the hack:

  • ARW: it has fixed offsets so it would be simple to have a decoder function that's directly "give me the image at x+y/width+height".
  • CR2: It's a tiled format so getting decodes per-tile would be relatively straightforward, but it would require all images to be using the same tile boundaries, which may be true in practice but isn't actually guaranteed by the format.
  • NEF: it's encoded as Huffman-coded deltas for each pixel based on the previous ones, which means there are no fixed offsets or ways to start the decoding in the middle of the image even if you knew the offsets. It wouldn't be impossible to make it work by saving the state from the previous decode and being able to restart it later at that point.
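
To illustrate the "save the state and restart later" idea from the NEF bullet, here is a toy sketch. It is not the real NEF codec: the `deltas` slice stands in for differences that a real decoder would pull out of the Huffman bitstream, the predictor layout (first pixel of a row predicted from the first pixel of the row above, every other pixel from its left neighbour) is a simplification, and `DeltaDecoder` is a hypothetical name. What it shows is that the decoder only needs to keep its stream position and predictor between calls.

```rust
/// Decoder state that survives between calls, so decoding can pause after
/// any row and resume later without starting over.
struct DeltaDecoder<'a> {
    /// Stand-in for the entropy-decoded difference stream (assumption).
    deltas: &'a [i32],
    /// How many deltas have been consumed so far.
    pos: usize,
    /// Image width in pixels.
    width: usize,
    /// Predictor for the first pixel of the next row.
    row_start_pred: i32,
}

impl<'a> DeltaDecoder<'a> {
    fn new(deltas: &'a [i32], width: usize, initial: i32) -> Self {
        DeltaDecoder { deltas, pos: 0, width, row_start_pred: initial }
    }

    /// Decode up to `rows` more rows, continuing from the saved state.
    /// Returns fewer rows (possibly none) once the stream is exhausted.
    fn decode_rows(&mut self, rows: usize) -> Vec<i32> {
        let mut out = Vec::with_capacity(rows * self.width);
        for _ in 0..rows {
            if self.pos + self.width > self.deltas.len() {
                break;
            }
            let mut pred = self.row_start_pred;
            for x in 0..self.width {
                pred += self.deltas[self.pos + x];
                out.push(pred);
                if x == 0 {
                    // Remember this value as the predictor for the next row.
                    self.row_start_pred = pred;
                }
            }
            self.pos += self.width;
        }
        out
    }
}
```

Calling `decode_rows` repeatedly with a small row count yields the same pixels as one large call, but only one chunk has to be in memory at a time. Decoding still has to run from the top of the image; the state only removes the need to keep earlier output around.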

What you described that you're already doing seems the best solution, as it avoids having to hack decoders and doesn't seem like it has too big of a downside. Maybe a bit of intermediate results saved to disk but that doesn't sound too bad.

@ryanstout
Author

@pedrocr Unfortunately, the disk read/write time adds up (since the tiles get paged out of RAM). I'm not sure if it's any easier, but what we really need is just something that yields scanlines or tiles; then we would basically open X number of photos at once and walk through them together in sets of scanlines or tiles. I'm guessing not, but let me know if you would be interested in some paid work to add support for that. ryan at witharsenal.com Thanks!
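
A sketch of what "walk through them together" could look like on the consumer side, assuming each open photo can be wrapped in an iterator that yields one band of scanlines (a `Vec<u16>`) at a time. rawloader does not provide such an iterator; `stack_in_lockstep` is a hypothetical name, and the per-pixel mean is only a placeholder for the real stacking operation.

```rust
/// Pull one band from every source, stack them, hand the result to `emit`,
/// and repeat until any source runs out. Only one band per photo is alive
/// at any time.
fn stack_in_lockstep<I, F>(mut sources: Vec<I>, mut emit: F)
where
    I: Iterator<Item = Vec<u16>>,
    F: FnMut(&[u16]),
{
    let mut stacked: Vec<u16> = Vec::new();
    loop {
        // `collect` into `Option` gives `None` as soon as one source is done.
        let bands: Option<Vec<Vec<u16>>> =
            sources.iter_mut().map(|s| s.next()).collect();
        let bands = match bands {
            Some(b) if !b.is_empty() => b,
            _ => break,
        };
        // Bands are assumed to have identical geometry; the minimum length
        // is used so a mismatch cannot index out of bounds.
        let len = bands.iter().map(Vec::len).min().unwrap_or(0);
        stacked.clear();
        for i in 0..len {
            let sum: u32 = bands.iter().map(|b| u32::from(b[i])).sum();
            stacked.push((sum / bands.len() as u32) as u16);
        }
        emit(&stacked);
    }
}
```

The `emit` callback would typically write the stacked band straight to the output file, so the full result never has to sit in memory either.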

@ryanstout
Author

@pedrocr Just wanted to follow up. Really we just need to be able to decode the image from start to finish in some sort of chunks, so we don't load the whole image into RAM at once. (Our code is doing a stacking operation across multiple photos, but only looking at the same small area of each photo at once.) I would think it could keep the decoding state around and do it all as a single pass. The actual tile shape doesn't really matter for our use case (just scanlines, for example, would work). Would it be possible to abstract things so that instead of writing directly to a location in memory once a photosite's value is decoded, it could write to something that manages a tile buffer, passes it to our code to process once that tile is done, and then starts decoding into the top left of the tile buffer again (for example)? (Sorry, I realize this might not be of use to anyone but me :-) Happy to pay for dev time at a good rate if it's something you would be interested in working on. Thanks!
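
One way the abstraction described above might be shaped, as a sketch only: the decoder writes each photosite through a trait instead of into a full-size buffer, and an implementation collects a band of rows, hands it to a callback, and reuses the same memory. `PixelSink` and `BandSink` are hypothetical names that do not exist in rawloader, and the sketch assumes the decoder emits pixels in raster order (top to bottom, left to right).

```rust
/// What a decoder would write into instead of a full-image buffer.
trait PixelSink {
    fn put(&mut self, x: usize, y: usize, value: u16);
    /// Called once after the last pixel so a partial band is not lost.
    fn finish(&mut self);
}

/// Buffers `band_rows` scanlines, then passes them to `on_band` along with
/// the image row the band starts at, and reuses the buffer.
struct BandSink<F> {
    width: usize,
    band_rows: usize,
    band_start: usize, // image row held in the first buffer row
    filled: usize,     // pixels written into the current band
    buf: Vec<u16>,
    on_band: F,
}

impl<F: FnMut(usize, &[u16])> BandSink<F> {
    fn new(width: usize, band_rows: usize, on_band: F) -> Self {
        assert!(width > 0 && band_rows > 0);
        BandSink {
            width,
            band_rows,
            band_start: 0,
            filled: 0,
            buf: vec![0; width * band_rows],
            on_band,
        }
    }

    fn flush(&mut self) {
        if self.filled > 0 {
            // Round up so a final, shorter band is still delivered.
            let rows = (self.filled + self.width - 1) / self.width;
            (self.on_band)(self.band_start, &self.buf[..rows * self.width]);
            self.band_start += self.band_rows;
            self.filled = 0;
        }
    }
}

impl<F: FnMut(usize, &[u16])> PixelSink for BandSink<F> {
    fn put(&mut self, x: usize, y: usize, value: u16) {
        // Raster order is assumed, so `y` always falls in the current band.
        let row = y - self.band_start;
        self.buf[row * self.width + x] = value;
        self.filled += 1;
        if self.filled == self.buf.len() {
            self.flush();
        }
    }

    fn finish(&mut self) {
        self.flush();
    }
}
```

Memory use is bounded by `width * band_rows` samples per open photo. Formats that decode out of raster order (for example tiled or threaded decoders) would need a different sink, which is part of why a generic version takes some rework.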

@pedrocr
Owner

pedrocr commented Apr 30, 2021

@ryanstout This isn't a very mainstream feature but it's not unreasonable and can fit in the API without bothering any other users. It even seems easy to have an API with a generic implementation for every format that can be decoded threaded, which is actually most of them, so it doesn't seem like too much of a maintenance burden, although it requires some rework.

I'm not really set up to do consulting work, but the project is missing at least a CR3 decoder (see #23) as well as a decoder for the newest Fuji encodings. Assuming those are things that are important to you as well, having that kind of contribution back would be an incentive to work on things that are specific to your use case.
