Release RawStopTime earlier #141

Merged 3 commits on Sep 27, 2023

antoine-de
Collaborator

linked to etalab/transport-validator#172

Memory consumption is too high because parsing happens in two phases: first into a RawGtfs, then into a Gtfs. During the conversion both structures are in memory, which roughly doubles the peak memory needed. On the FR IDF dataset (~13 000 000 stop times), the RawGtfs takes 2.3 G of memory and the Gtfs 2.1 G, and the peak memory needed is ~3.9 G.

This PR adds a reverse loop over the stop times and shrinks the vector to fit as elements are consumed (iterating in reverse avoids allocating a new vector). Even though it's a naive implementation (we shouldn't have to call shrink_to_fit at every element), the performance impact seems negligible, and on the IDF dataset the /usr/bin/time measurement goes from 3.9 G to 3.4 G.
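
For readers who want to see the idea in isolation, here is a minimal sketch of the technique, not the code from this PR. The `RawStopTime` and `StopTime` structs and the `convert` function are simplified, hypothetical stand-ins for the crate's real types; only `Vec::pop`, `Vec::shrink_to_fit` and the `HashMap` entry API are real standard-library calls.

```rust
use std::collections::HashMap;

// Hypothetical, simplified stand-in for the raw (first-phase) record.
#[derive(Debug)]
struct RawStopTime {
    trip_id: String,
    stop_sequence: u32,
}

// Hypothetical, simplified stand-in for the converted (second-phase) record.
#[derive(Debug, PartialEq)]
struct StopTime {
    stop_sequence: u32,
}

/// Converts raw stop times into per-trip stop times while giving the raw
/// vector's memory back as the conversion progresses.
fn convert(mut raw: Vec<RawStopTime>) -> HashMap<String, Vec<StopTime>> {
    let mut by_trip: HashMap<String, Vec<StopTime>> = HashMap::new();

    // pop() removes from the end, so nothing is shifted and no second
    // vector is allocated to hold the remaining raw elements.
    while let Some(st) = raw.pop() {
        by_trip
            .entry(st.trip_id)
            .or_default()
            .push(StopTime { stop_sequence: st.stop_sequence });

        // Naive: request that unused capacity be released after every
        // element. Shrinking every N elements would be an obvious refinement.
        raw.shrink_to_fit();
    }

    // Elements were consumed back to front, so restore the stop order.
    for stops in by_trip.values_mut() {
        stops.sort_by_key(|s| s.stop_sequence);
    }
    by_trip
}
```

Whether the released capacity actually lowers the process's peak memory depends on the allocator, which is why the figures above come from measuring with /usr/bin/time rather than from reasoning about the code alone.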

Tristramg merged commit 2a58597 into rust-transit:main on Sep 27, 2023
2 checks passed