
Missing Observations #176

Open · mccabete opened this issue Jan 16, 2025 · 10 comments
Labels: bug, global feds

Comments

@mccabete (Contributor)

@zebbecker and @eorland noticed that some of the fires this year seemed to be growing only every 24 hours, rather than with every 12-hour timestep. Zeb made the figure below.

[Figure: fire perimeter time series showing growth only every 24 hours]

@mccabete (Contributor, Author)

I plotted the "new_pixels" values extracted from the API itself. It looks like the observations from the Jan 8 PM and Jan 9 PM overpasses were not included in the algorithm at all.

[Figure: new_pixels per timestep from the API, with the Jan 8 PM and Jan 9 PM observations missing]
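
For reference, a minimal sketch of this kind of query. The base URL, collection name, and fire ID are placeholders, and the "new_pixels"/"t" property names are assumptions from the discussion; the real FEDS API values may differ:

```python
# Pull a fire's perimeter history from an OGC API Features endpoint and
# plot new_pixels per timestep. Endpoint and IDs are placeholders.
import requests
import matplotlib.pyplot as plt

API = "https://feds-api.example.com"  # placeholder endpoint
url = f"{API}/collections/perimeters/items"
resp = requests.get(url, params={"filter": "fireid=12345", "limit": 1000})
resp.raise_for_status()
features = resp.json()["features"]

# One perimeter per 12-hour timestep; gaps show up as missing t values.
times = [f["properties"]["t"] for f in features]
new_pixels = [f["properties"]["new_pixels"] for f in features]

plt.plot(times, new_pixels, marker="o")
plt.xlabel("timestep")
plt.ylabel("new_pixels")
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()
```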

@zebbecker (Collaborator)

I checked it out, and the perimeters I get from the API are the same as those I get from both the combined_largefires and the allfires files read directly from S3. I had been thinking that maybe something was wrong with how we save off the largefires file from allfires during postprocessing, but this does not support that idea. It also doesn't support the idea that data is being mangled somewhere in the API upload pipeline.
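
A rough sketch of that comparison, with placeholder URLs/paths and an assumed fireID/t column naming (the actual S3 layout may differ; reading s3:// URLs with geopandas needs fsspec/s3fs):

```python
# Load the same fire's perimeters from the API and from the largefires
# file on S3, then check the geometries match.
import geopandas as gpd

api_gdf = gpd.read_file("https://feds-api.example.com/...")  # API GeoJSON (placeholder)
s3_gdf = gpd.read_file("s3://bucket/FEDSoutput/combined_largefires.fgb")  # placeholder

fid = 12345  # placeholder fire ID
a = api_gdf[api_gdf["fireID"] == fid].sort_values("t").reset_index(drop=True)
b = s3_gdf[s3_gdf["fireID"] == fid].sort_values("t").reset_index(drop=True)

# geom_equals_exact tolerates tiny coordinate noise from round-tripping
print(a.geometry.geom_equals_exact(b.geometry, tolerance=1e-8).all())
```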

@zebbecker (Collaborator) commented Feb 2, 2025

Confirmed that observations for 1/8 PM and 1/9 PM are missing from the allpixels file on S3 as well.

Here, pixels from the allpixels file directly from S3 are plotted in blue over NOAA20 detections in red.

[Figure: allpixels from S3 (blue) plotted over raw NOAA20 detections (red)]
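
A sketch of how one might build that overlay, with placeholder paths and assumed column names (FIRMS CSVs carry latitude/longitude columns; the allpixels Lon/Lat names are guesses, and reading s3:// URLs with pandas requires s3fs):

```python
# Overlay allpixels from S3 (blue) on raw NOAA20 detections (red).
import pandas as pd
import matplotlib.pyplot as plt

allpixels = pd.read_csv("s3://bucket/FEDSoutput/allpixels.csv")  # placeholder
firms = pd.read_csv("noaa20_viirs_2025-01-09.csv")               # placeholder

fig, ax = plt.subplots(figsize=(10, 6))
ax.scatter(firms["longitude"], firms["latitude"], s=4, c="red", label="NOAA20 raw")
ax.scatter(allpixels["Lon"], allpixels["Lat"], s=4, c="blue", label="allpixels")
ax.set_xlabel("longitude")
ax.set_ylabel("latitude")
ax.legend()
plt.show()
```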

@zebbecker (Collaborator)

The red figure on the bottom here shows raw NOAA20 detections for 1/9 PM. The blue dots are what made it into allpixels. As you can see, all observations west of a certain point are missing.

On 1/8 PM, allpixels has no observations at all, over all of CONUS.

This makes me wonder if we somehow processed those days before we had downloaded the full datasets of NOAA20 observations. If so, it makes sense that there would be gaps in the record: even though we now have the full NOAA20 observations, we wouldn't automatically go back and reprocess them as part of the typical NRT runs, which are only forward-looking.

[Figure: raw NOAA20 detections for 1/9 PM (red) and the subset that made it into allpixels (blue); everything west of a cutoff is missing]

@mccabete (Contributor, Author) commented Feb 3, 2025

Yes, I think you've found it. We never fully fixed this issue: we essentially need to run the algorithm multiple times a day, regenerating files until maybe 12 hours later, to account for time differences and upload differences across the US.

@mccabete (Contributor, Author) commented Feb 3, 2025

This is especially annoying because it seems to be sensor-dependent, and we aren't certain how variable the upload timing is.

@eorland (Contributor) commented Feb 3, 2025

@zebbecker @mccabete, thanks for diving into this. Really a frustrating issue. Let me see if I'm interpreting this correctly:

  1. In an idealized world, we would know when VIIRS data for a given region is 'complete' for a given day, and then time FEDS runs accordingly. This is currently unrealistic, as we may never have a good sense of the timing.
  2. Failing that, run FEDS for a given period only once we're about 12-24 hrs past the timestep of interest (e.g., the Jan 1, 2025 AM timestep gets run up until Jan 2, 2025 AM).

I'm trying to remember how we're currently doing the NRT runs, but could we swap the run instructions to make tst 24 hrs before the present timestamp?
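
A minimal sketch of that lag, assuming FEDS timesteps are written as (year, month, day, 'AM'/'PM') tuples:

```python
# Compute a tracking timestep (tst) 24 hours behind the present,
# snapped to the AM/PM half-day grid. The tuple format is an assumption.
from datetime import datetime, timedelta, timezone

def lagged_tst(lag_hours: int = 24):
    t = datetime.now(timezone.utc) - timedelta(hours=lag_hours)
    return (t.year, t.month, t.day, "AM" if t.hour < 12 else "PM")

print(lagged_tst())  # e.g. (2025, 2, 2, 'PM')
```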

@mccabete (Contributor, Author) commented Feb 4, 2025

@eorland huh, that's an interesting thought! I think it's a good idea, with one caveat: right now the algorithm is very aggressive about not regenerating files it doesn't think it needs to. We'd need to tweak that so the algorithm is OK with regenerating a file that is close in time to the present (within 24 hours, like you suggest). I'm not sure if this will clog up our runs, though.
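
One way that tweak could look; the file-discovery and timestamp plumbing here is hypothetical:

```python
# Skip regeneration for old outputs, but always allow anything within
# the last 24 hours to be rebuilt, since late-arriving FIRMS data could
# still change it.
from datetime import datetime, timedelta, timezone

REGEN_WINDOW = timedelta(hours=24)

def should_regenerate(output_exists: bool, t_timestep: datetime) -> bool:
    if not output_exists:
        return True
    return datetime.now(timezone.utc) - t_timestep < REGEN_WINDOW
```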

@zebbecker (Collaborator)

Copying Yang's input here to have it all in one place for future reference:

"So based on discussions in the github issue page, it seems those tracking errors were mainly due to the operational FEDS run before the complete recording of VIIRS active fires in the FIRMS files. For most of the cases, the current time lag setting (can you remind me the current UTC/local time scheduled to run the FEDS code?) ensures the availability of VIIRS data. But sometimes the FIRMS data may be delayed and FEDS code may use incorrect/incomplete data. While we are unable to correct this for the latest time step, we may fix it by performing a 'remedy' run for the previous time step (assuming the data for t-12hrs have been complete at the current time). To make it more efficient, we may also do an initial check to see if the fire pixels for t-12hr from FIRMS are different from the preprocessed local data. If there's no change, the 'remedy' run may not be necessary. If there's change, we can extract the VIIRS data for t-12hr, re-run the code, and overwrite the output (from the previous 'normal' run) for that time step. The backward time span for this kind of 'remedy' run may be extended to more than 1 time step, considering the possibility of longer FIRMS data delay.

While this is indeed an upstream data issue, we probably can minimize the negative effect in this way.

BTW, does the current operational run only use NOAA 20 data?"
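
A sketch of that check-then-rerun flow, with hypothetical helper names (fetch_firms_pixels, load_preprocessed_pixels, rerun_timestep) standing in for the real ingest and run entry points:

```python
# Compare the FIRMS pixels now available for t-12h against the pixels
# FEDS preprocessed when it first ran that timestep; only re-run (and
# overwrite) if they differ.
import pandas as pd

def remedy_run_needed(t_prev) -> bool:
    fresh: pd.DataFrame = fetch_firms_pixels(t_prev)         # pull FIRMS now
    cached: pd.DataFrame = load_preprocessed_pixels(t_prev)  # local FEDS copy
    # Row count is the cheap first pass; a stricter check could compare
    # hashes of the (lat, lon, time) columns instead.
    return len(fresh) != len(cached)

def maybe_remedy(t_prev) -> None:
    if remedy_run_needed(t_prev):
        rerun_timestep(t_prev, overwrite=True)
```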

@zebbecker (Collaborator)

To answer specific questions:

  • Yes, the current NRT runs use only NOAA20 data. We turned off SNPP after a string of issues and outages over the summer. We need to write some logic for handling observations from multiple satellites. For starters, we should be resilient to issues with a single satellite and able to keep processing NRT runs with whatever we have (see the sketch after this list). Currently, in contrast, if data from any one of the satellites isn't available, NRT runs will just crash. As a side note, we also need to include NOAA21 in the ingest process (see Include NOAA-21 into Ingest workflow #54), and as we've discussed it would be good to think about whether there is anything clever we can do to extract additional information from the succession of 3 overpasses one after the other, at slightly different view angles.

  • NRT run times depend on the region. CONUS currently runs at 13:25 and 23:25 UTC. (10:25 and 18:25 EST). This already provides a pretty large buffer for delays.
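
A sketch of that per-satellite resilience, assuming a hypothetical fetch_detections(sat, t) ingest helper:

```python
# Try each sensor independently and proceed with whatever arrived,
# instead of letting one missing feed crash the run.
import logging
import pandas as pd

SATELLITES = ["NOAA20", "SNPP", "NOAA21"]

def load_available_detections(t) -> pd.DataFrame:
    frames = []
    for sat in SATELLITES:
        try:
            frames.append(fetch_detections(sat, t))  # hypothetical fetcher
        except Exception:
            logging.warning("No %s data for %s; continuing without it", sat, t)
    if not frames:
        raise RuntimeError(f"No detections from any satellite for {t}")
    return pd.concat(frames, ignore_index=True)
```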

In general:

I like the idea of essentially re-running (forcibly) fire tracking from time t-N up to t when we process t. This would provide additional resiliency to unexpected delays in data downlinking that are outside of our control. If we assume that the full record might not be available when we run for the first time, and are therefore prepared to update our fire tracking with subsequent observations, we can also move our initial NRT runs closer to actual overpass times, reducing latency by several hours in the best case. As long as we incorporate Yang's suggestion of checking whether a "remedy run" is actually necessary before doing one, I suspect that we won't significantly decrease performance or clog up runs with this strategy, as we won't need to recompute most timesteps at all, only the ones where something has gone wrong.
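
Putting that together with the hypothetical helpers sketched above, the forward run could look roughly like:

```python
# When processing timestep t, first re-check the previous N timesteps
# and re-run any whose FIRMS inputs have changed, then do the normal
# forward run. t is assumed to be a UTC datetime on the 12-hour grid.
from datetime import timedelta

REMEDY_STEPS = 2  # look back two 12-hour timesteps

def process_with_remedy(t):
    for k in range(REMEDY_STEPS, 0, -1):
        maybe_remedy(t - timedelta(hours=12 * k))  # no-op if inputs unchanged
    rerun_timestep(t, overwrite=False)             # the normal forward run
```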

zebbecker added the bug and global feds labels and removed the next-release label on Feb 27, 2025.