Performance And Memory Usage #19

radix0000 · 2024-09-19T11:29:44Z

For the new updates version of the pipeline, the performance improvements (which have been substantial) required some trade-offs in memory usage. Various critical data is loaded from Elasticsearch into memory on startup (see caching.py) and saved back at the end; this gives roughly an order-of-magnitude speed improvement. In future, for larger datasets such as UK PSC, some optimisation will be needed to keep the size of this in-memory data within acceptable limits. There is significant scope for this: for instance, various stored strings take only a limited set of values and could be represented as integers.
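As a rough illustration of the strings-as-integers idea, here is a minimal sketch. It is not taken from caching.py: the `StringCodec` class and the field values are hypothetical, and it assumes the field in question has fewer than 65,536 distinct values so that each code fits in an unsigned 16-bit slot.

```python
from array import array


class StringCodec:
    """Maps each distinct string to a small integer code and back.

    Hypothetical helper for illustration only; not part of the pipeline.
    """

    def __init__(self):
        self._code_of = {}   # string -> integer code
        self._value_of = []  # integer code -> string

    def encode(self, value: str) -> int:
        code = self._code_of.get(value)
        if code is None:
            # First time this string is seen: assign the next free code.
            code = len(self._value_of)
            self._code_of[value] = code
            self._value_of.append(value)
        return code

    def decode(self, code: int) -> str:
        return self._value_of[code]


# Placeholder values standing in for a field that only takes a few values.
raw_values = ["person", "entity", "relationship", "entity", "person"] * 1000

codec = StringCodec()

# "H" stores each code as an unsigned short instead of a reference
# to a full Python string object.
encoded = array("H", (codec.encode(v) for v in raw_values))

# The original strings can be recovered when saving back at the end.
decoded = [codec.decode(c) for c in encoded]
```

Each distinct string is stored once in the codec, and the cached records hold only compact codes. How much this saves in practice would depend on how the cached data is structured, so it would need measuring against the real cache.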

radix0000 mentioned this issue Sep 19, 2024

General Performance/Efficiency Improvements #14

Closed

radix0000 self-assigned this Sep 19, 2024
