need to improve the get_file_list mechanism #338

randytpierce · 2024-02-28T17:24:19Z

The code uses a data set in Couchbase that is retrieved by this query:

SELECT url,
    mtime
FROM `vxdata`._default.METAR
WHERE subset = 'METAR'
    AND type = 'DF'
    AND fileType = 'grib2'
    AND originType = 'model'
    AND model = 'HRRR_OPS'
    AND url IS NOT MISSING
    AND mtime IS NOT MISSING
ORDER BY url;

This query is turning out to be very inefficient. It showed up as a big difference in the Capella evaluation tests. There is essentially one document for each file that gets processed, and that is just too many documents. It should probably be an array of files for each kind of ingest, or something similar.
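To make the "array of files" idea concrete, here is a minimal sketch in plain Python of how the per-file documents could be folded into one document per kind of ingest. It is only an illustration under assumptions: the function name `build_file_manifests`, the `files` field, and the document ID scheme are hypothetical and not part of the current VXingest code. The other field names come from the query above.

```python
from collections import defaultdict


def build_file_manifests(file_docs):
    """Fold per-file 'DF' documents into one manifest document per ingest kind.

    Hypothetical helper: each input dict is assumed to carry the fields used
    in the query above (subset, fileType, originType, model, url, mtime).
    """
    grouped = defaultdict(list)
    for doc in file_docs:
        key = (doc["subset"], doc["fileType"], doc["originType"], doc["model"])
        grouped[key].append({"url": doc["url"], "mtime": doc["mtime"]})

    manifests = {}
    for (subset, file_type, origin_type, model), files in grouped.items():
        # Hypothetical ID scheme; any stable, predictable key would do.
        doc_id = f"DF:{subset}:{file_type}:{origin_type}:{model}"
        manifests[doc_id] = {
            "subset": subset,
            "type": "DF",
            "fileType": file_type,
            "originType": origin_type,
            "model": model,
            # Keep the same ordering the query produces (ORDER BY url).
            "files": sorted(files, key=lambda f: f["url"]),
        }
    return manifests


if __name__ == "__main__":
    sample = [
        {"subset": "METAR", "type": "DF", "fileType": "grib2",
         "originType": "model", "model": "HRRR_OPS", "url": "/data/b", "mtime": 2},
        {"subset": "METAR", "type": "DF", "fileType": "grib2",
         "originType": "model", "model": "HRRR_OPS", "url": "/data/a", "mtime": 1},
    ]
    for doc_id, manifest in build_file_manifests(sample).items():
        print(doc_id, [f["url"] for f in manifest["files"]])
```

With a layout like this, get_file_list could in principle fetch a single document by its key instead of scanning one document per file, at the cost of rewriting that document whenever a file is processed. Whether a single array stays a manageable size for a long-running ingest would need to be checked.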


github-actions · 2024-05-30T02:38:25Z

This issue is stale because it has been open 90 days with no activity.

github-actions · 2024-09-04T02:45:03Z

This issue is stale because it has been open 90 days with no activity.

randytpierce added the bug, couchbase, and VXingest labels Feb 28, 2024

randytpierce self-assigned this Feb 28, 2024

github-actions bot added the Stale label May 30, 2024

mollybsmith-noaa removed the Stale label Jun 5, 2024

github-actions bot added the Stale label Sep 4, 2024
