ValueError: bytes object is too large when using .sync_cube to save a trend stack #134
Comments
It seems you are trying to send a large non-Dask object to the calculation function. While we can mix lazy Dask and non-Dask arrays, it is ineffective to use a large non-Dask array in this way and is therefore prohibited.
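The message above refers to passing a large in-memory (non-Dask) array into the computation. A minimal sketch of the usual remedy, assuming NumPy and Dask (the array name and sizes are illustrative, not from this thread):

```python
import numpy as np
import dask.array as da

# A large in-memory NumPy array: passing it directly into a Dask
# computation embeds the whole buffer into the task graph.
topo = np.zeros((4096, 4096), dtype=np.float32)

# Wrapping it as a lazy, chunked Dask array keeps the graph small;
# 2048x2048 matches the default chunk size mentioned later in the thread.
topo_lazy = da.from_array(topo, chunks=(2048, 2048))

print(topo_lazy.chunks)  # ((2048, 2048), (2048, 2048))
```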
@AlexeyPechnikov I am using this part from your Golden Valley example for an SBAS analysis:
I don't do anything extra, just reindex the topo and incidence grids because they are missing one row and one column. If I slice the first pair along rows and columns to some very small subset (a 100x100 array), it works. Is there any way this can be tackled?
Are your grids lazy Dask arrays?
When I verify whether they are lazy Dask grids (dask.array.Array objects, from my understanding), I get False for both of them. From your answer I understand that they must be Dask arrays.
The problem is that these are previously saved interferogram files, from which I load the correlation and unwrapped grids with open_stack and open_cube, as in your Colab notebooks. Is that the issue? When I inspect the files, they seem fine, chunked with the new grid chunks that I set at the beginning (also the chunks used when the files were first processed).
The check should be
where
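The inline code in the reply above was lost in extraction; a minimal sketch of such a laziness check, assuming xarray with a Dask backend (the helper name `is_lazy` is hypothetical):

```python
import numpy as np
import xarray as xr
import dask.array as da

def is_lazy(arr):
    # A lazy DataArray wraps a dask.array.Array instead of a NumPy array
    return isinstance(arr.data, da.Array)

eager = xr.DataArray(np.ones((4, 4)))           # plain NumPy-backed
lazy = xr.DataArray(da.ones((4, 4), chunks=2))  # Dask-backed

print(is_lazy(eager), is_lazy(lazy))  # False True
```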
Right now, I see your chunks are extremely small (512x512) instead of the default ones (2048x2048). It seems you made some incorrect modifications during processing and broke it.
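One way to confirm the chunking of a saved grid is to reopen it lazily with explicit chunks. This sketch uses a tiny synthetic NetCDF file (the file name, variable name, and sizes are illustrative):

```python
import os
import tempfile
import numpy as np
import xarray as xr

# Write a small dataset, then reopen it lazily with explicit chunks;
# without the `chunks=` argument, open_dataset returns eager NumPy arrays.
ds = xr.Dataset({"phase": (("y", "x"), np.zeros((8, 8), dtype=np.float32))})
path = os.path.join(tempfile.mkdtemp(), "cube.nc")
ds.to_netcdf(path)

lazy = xr.open_dataset(path, chunks={"y": 4, "x": 4})
print(lazy["phase"].chunks)  # ((4, 4), (4, 4))
```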
Hello @AlexeyPechnikov! I'm back with an update on this issue. The problem persists even after processing the data with the original 2048 chunks. I am really stuck; I have tried clean installations and so on. My guess is that the chunks are distributed across a small number of graph layers in the unwrap_sbas1.phase cube, as you can see in the attached pictures.
A data cube is stored as a single large NetCDF file, while a data stack consists of separate files. This difference accounts for the variation in graph layers. In your case, you can try filtering out noisy pixels to simplify the regression calculation.
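A common way to filter out noisy pixels is to mask low-coherence samples before the regression. A sketch in plain xarray (the threshold value and variable names are illustrative, not PyGMTSAR API):

```python
import numpy as np
import xarray as xr

phase = xr.DataArray(np.ones((4, 4)), dims=("y", "x"))
corr = xr.DataArray(np.full((4, 4), 0.5), dims=("y", "x"))
corr[0, 0] = 0.05  # one noisy, low-coherence pixel

CORR_LIMIT = 0.15  # hypothetical coherence threshold
# Pixels below the threshold become NaN and drop out of a later fit
phase_clean = phase.where(corr >= CORR_LIMIT)

print(int(phase_clean.isnull().sum()))  # 1
```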
You are right, using the regression formula from the
Potentially, we can better remove phase ramps and stratified components using high-degree regression variables (remember the Taylor expansion of any function). Non-linear function approximation may require low- or high-degree polynomials; some papers even use 4th-degree polynomials, although I find that excessive for everyday usage. At the same time, more regression variables, especially high-degree ones, can cause overfitting, and performance considerations also discourage using too many variables. There is a trade-off between the benefits of higher-degree variables and a shorter variable list. Regarding incidence-angle fitting, it is tricky and often ignorable, but it can be done if you really need it.
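As an illustration of the trade-off, a first-degree 2-D polynomial already removes a planar ramp; higher-degree terms would simply add columns to the design matrix, at the risk of overfitting. A self-contained sketch with synthetic data (all names and values are illustrative):

```python
import numpy as np

# Synthetic phase: a planar ramp plus small noise.
rng = np.random.default_rng(1)
y, x = np.mgrid[0:32, 0:32]
phase = 0.5 + 0.01 * x - 0.02 * y + rng.normal(scale=1e-3, size=(32, 32))

# First-degree design matrix [1, x, y]; appending x**2, x*y, y**2, ...
# raises the polynomial degree (and the overfitting risk).
A = np.column_stack([np.ones(x.size), x.ravel(), y.ravel()])
coef, *_ = np.linalg.lstsq(A, phase.ravel(), rcond=None)
detrended = phase - (A @ coef).reshape(phase.shape)

print(np.abs(detrended).max() < 0.01)  # True: only noise remains
```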
A separate consideration about your raster: tiles processed independently can have inconsistencies between them. The best approach is to use a single tile, tuning the grid resolution for sufficient detail while maintaining processable sizes. Although PyGMTSAR can handle large grids, most practical cases don't require it. Phase accuracy improves with lower resolution (multilooking) processing, and SNAPHU unwrapping performs better on lower resolution grids. Typically, SBAS analysis resolution is 60 meters due to strong physical reasons. For high-resolution results, use SBAS for the initial solution and apply phase corrections for precise PSI analysis with Sentinel-1 resolution, combining the strengths of both SBAS and PSI approaches.
Hello again, Alexey! When trying to use PSI analysis for the same area, this time with interferograms computed as follows, from the Otmanbozdagh example:
I get the following error when I apply the 1D unwrapping, even when I apply it to a small slice in the temporal and spatial domains:
Hmm, such reshaping is applied for geocoding only. The error may have been raised earlier in the processing.
The only thing I did differently is that sbas.psfunction() is not filtered based on the correlation stack, but the phase component is filtered (each component is filtered on some threshold). Is this the cause of such behavior? The other thing that comes to mind that could affect the processing is the wavelength of the Gaussian filter. In the initial multilook interferogram processing I use a wavelength of 200 (I saw your indication that the value should be around that), and in the singlelook processing I use 60 (a smaller Gaussian filter window for PS analysis). Could this last part have any implications for such an error?
Such an error usually arises when an incomplete DEM is used. If you are sure the problem is related to 1D unwrapping, try using more pairs.
Hello!
After I compute the trend regression and try to save it with the .sync_cube function, I receive a strange error in the Dask client.
From my basic understanding and some Google searches, it is something related to file dimensions. But this behavior persists even when I try to save just the first pair from the stack. The stack has shape 53x11138x7257, about 300 MB per pair and roughly 16 GB in total, so I don't think the issue is directly related to file sizes.
Is there a way to tackle this?
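For context, the "ValueError: bytes object is too large" message typically originates in msgpack, which the Dask distributed scheduler uses for messaging and which cannot encode a single byte string of roughly 4 GiB or more; keeping each serialized payload far below that limit avoids the error. A sketch using the stack dimensions from this issue (the chunk sizes are illustrative):

```python
import numpy as np
import dask.array as da

# Lazy stand-in for the 53x11138x7257 float32 stack from this issue.
# With one chunk per pair each task payload is ~323 MB, but any graph
# node that aggregates many pairs into one buffer can exceed msgpack's
# ~4 GiB per-object limit and raise "bytes object is too large".
stack = da.zeros((53, 11138, 7257), dtype=np.float32, chunks=(1, 11138, 7257))

# Rechunking spatially keeps every serialized payload small.
stack = stack.rechunk((1, 2048, 2048))

chunk_bytes = 1 * 2048 * 2048 * 4  # float32 is 4 bytes
print(chunk_bytes / 2**20)  # 16.0 MiB per chunk
```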
System and software version:
Server Specifications: 125 GB RAM, CPU: Intel(R) Xeon(R) Silver 4309Y CPU @ 2.80GHz
OS: Ubuntu 22.04.2 LTS (GNU/Linux 6.5.0-28-generic x86_64)
PyGMTSAR version: 2024.4.17.post2