Performance issue: Points from geobuf polygons use more array capacity than needed, wasting memory. #122
Comments
TysonAndre added a commit to TysonAndre/geobuf that referenced this issue on Jan 17, 2022
Objects are modified in place; arrays are replaced with arrays that have exactly the capacity needed. This is useful when the polygons will be kept for a long time.

By default, arrays are reserved with extra capacity that won't be used. (An empty array currently starts with a capacity of 16 elements, which is inefficient for decoded points of length 2.) slice() allocates a new array, seemingly with shrunken capacity according to process.memoryUsage.

There is also an option to deduplicate identical points, which may be useful for collections of polygons that share points, as well as for calling compress multiple times with different objects. It is only safe for read-only uses, so it is disabled by default.

For example, in node-geo-tz issue 131, I saw these changes in memory usage and decoding time on Linux. This is useful for long-running processes that repeatedly use the objects.

1. No override: 1.280 GB (1.8 seconds)
2. Defaults for cache (no numericArrayCache): 0.708 GB (3.4 seconds)
3. Adding the second Map (numericArrayCache): 0.435 GB (6.7 seconds)

Closes mapbox#122
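The approach described in the commit message can be sketched roughly as follows. This is not the code from the commit: compressSketch and its recursion are hypothetical, and only the ideas (modify objects in place, replace arrays with slice() copies, optionally share identical points through a Map such as numericArrayCache) come from the text above.

```javascript
// Hypothetical sketch, not the actual geobuf implementation.
// Objects are modified in place; arrays are replaced by slice() copies.
// If numericArrayCache (a Map) is passed, identical numeric arrays are
// shared, which is only safe when the result is treated as read-only.
function compressSketch(value, numericArrayCache) {
  if (Array.isArray(value)) {
    const isPoint = value.length > 0 && value.every((x) => typeof x === 'number');
    if (isPoint) {
      if (!numericArrayCache) return value.slice();
      // Assumption: a joined string is good enough as a key here.
      // Note that it treats 0 and -0 as the same coordinate.
      const key = value.join(',');
      let shared = numericArrayCache.get(key);
      if (shared === undefined) {
        shared = value.slice();
        numericArrayCache.set(key, shared);
      }
      return shared;
    }
    // Nested array (ring, line, list of features): compress the children,
    // then replace the container itself with a copy.
    for (let i = 0; i < value.length; i++) {
      value[i] = compressSketch(value[i], numericArrayCache);
    }
    return value.slice();
  }
  if (value !== null && typeof value === 'object') {
    for (const key of Object.keys(value)) {
      value[key] = compressSketch(value[key], numericArrayCache);
    }
  }
  return value;
}

const polygon = { type: 'Polygon', coordinates: [[[0, 0], [1, 0], [1, 1], [0, 0]]] };
const result = compressSketch(polygon, new Map());
console.log(result === polygon); // true: the object itself is reused
console.log(result.coordinates[0][0] === result.coordinates[0][3]); // true: shared point
```

Calling it without the second argument skips deduplication, so every point stays an independent array and remains safe to mutate. That matches the trade-off in the numbers above: sharing saves more memory but costs decoding time.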
Arrays in Node.js need to be able to add elements quickly without resizing frequently, so they have both a size and a capacity.

For example, in the geo-tz module (which provides time zone data for the entire world), geobuf creates 'Polygon' objects with readLinePart, and the point arrays are created with size 2 and excess capacity (16) that is never freed.

Replacing coords.push(p) with coords.push(p.slice()) in node_modules/geobuf/decode.js reduced the memory used to load the entire quad tree from 1,282,134,016 to 528,089,088 bytes for me (1.28 GB to 0.53 GB) in 64-bit Node.js; the latter does not have excess capacity.

From babel/issues/6233

Note that new Array(size) would be worse for runtime performance, because the JS engine needs additional representations for arrays with mixes of types and the optimizer cannot generate more efficient code for them. That should be avoided.

Related to evansiroky/node-geo-tz#131
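To try the effect outside geobuf, here is a minimal sketch. decodePoint, decodeLine and heapUsedDelta are hypothetical helpers written for illustration, not geobuf functions; the only calls taken from the issue are push, slice() and process.memoryUsage. The numbers it prints depend on the Node.js version and on garbage collection timing, so treat them as indicative only.

```javascript
// Hypothetical stand-in for geobuf's point decoding (not the library's code):
// each point is built by pushing onto an empty array, which, per the issue,
// leaves spare capacity behind.
function decodePoint(values, offset, dim) {
  const p = [];
  for (let d = 0; d < dim; d++) p.push(values[offset + d]);
  return p;
}

// Decode a flat list of numbers into points. With trim=true each point is
// copied with slice() before being stored, mirroring coords.push(p.slice()).
function decodeLine(values, dim, trim) {
  const coords = [];
  for (let i = 0; i < values.length; i += dim) {
    const p = decodePoint(values, i, dim);
    coords.push(trim ? p.slice() : p);
  }
  return coords;
}

// Rough heap delta around a build step. Run node with --expose-gc for less
// noisy numbers; without it the figures are only a loose indication.
function heapUsedDelta(build) {
  if (typeof globalThis.gc === 'function') globalThis.gc();
  const before = process.memoryUsage().heapUsed;
  const result = build(); // kept alive by the caller so it stays on the heap
  if (typeof globalThis.gc === 'function') globalThis.gc();
  const after = process.memoryUsage().heapUsed;
  return { result, bytes: after - before };
}

const flat = Array.from({ length: 400000 }, (_, i) => i * 0.5);
const plain = heapUsedDelta(() => decodeLine(flat, 2, false));
const trimmed = heapUsedDelta(() => decodeLine(flat, 2, true));
console.log('push(p):        ', plain.bytes, 'bytes');
console.log('push(p.slice()):', trimmed.bytes, 'bytes');
```

Both variants decode to the same coordinates; only the backing storage of each point array should differ.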