Max array length constant doesn't seem to match actual limits in browsers #1178

Closed
domoritz opened this issue Dec 11, 2023 · 7 comments
Labels
Bug Something isn't working.

Comments

@domoritz
Contributor

domoritz commented Dec 11, 2023

Description

Encountered an error when I tried to allocate a typed array using the size from https://github.com/stdlib-js/constants-array-max-typed-array-length.

new Uint8Array(9007199254740991)

Doesn't work in any browser (Firefox, Safari, Chrome) or Node.
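
For reference, here's a minimal reproduction (assuming the standalone @stdlib/constants-array-max-typed-array-length npm package; the exact error message differs between engines):

var MAX_TYPED_ARRAY_LENGTH = require( '@stdlib/constants-array-max-typed-array-length' ); // 9007199254740991

try {
    // Attempt to allocate a typed array at the spec-permitted maximum length:
    var x = new Uint8Array( MAX_TYPED_ARRAY_LENGTH );
} catch ( err ) {
    // V8 reports, e.g., "RangeError: Invalid typed array length"; other engines differ.
    console.error( err.name + ': ' + err.message );
}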

Environments

Node.js, Firefox, Chrome, Safari

domoritz added the Bug label Dec 11, 2023
@domoritz
Contributor Author

Allocating a large array with stdlib's typedarray also fails because the underlying ArrayBuffer cannot be allocated.

typedarray(2 ** 31, "uint8")

In Chrome, for example, this throws RangeError: Array buffer allocation failed.

@kgryte
Member

kgryte commented Dec 11, 2023

@domoritz Thanks for filing this issue. The 2**53-1 value (i.e., Number.MAX_SAFE_INTEGER) comes from the ECMAScript specification, which permits typed array lengths up to that bound.

Now, whether browser engines actually support arrays of up to $2^{53}-1$ elements is another matter. Allocating a Uint8Array with $2^{53}-1$ elements would require over 8,000 TB of memory, which is beyond what browsers can handle. 😅
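
Back-of-the-envelope arithmetic behind that figure:

var bytes = 2 ** 53 - 1;            // 9007199254740991 elements, one byte each
console.log( bytes / 1e12 );        // ~9007.2 (decimal TB)
console.log( bytes / ( 2 ** 40 ) ); // ~8192 (binary TiB)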

And, FWIW, I don't know if we can reliably ascertain a universal max typed array length, as this is likely to be engine/platform-specific.

typedarray(2 ** 31, "uint8")

When I run this from the stdlib REPL (backed by Node.js/V8), this works for me. That you are getting an allocation failure seems to lend credence to the idea that this is going to be implementation-dependent.

In [32]: var x = typedarray(2**31, "uint8")
Out[32]: Uint8Array(2147483648) [
  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
  0, 0, 0, 0,
  ... 2147483548 more items
]

To help me better understand your use case: in what situation are you trying to allocate large arrays, up to $2^{53}-1$ elements?

@kgryte
Member

kgryte commented Dec 11, 2023

For dynamically resolving the maximum practical typed array length, we could add a utility to stdlib. If you think this would be helpful, LMK.
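
A rough sketch of what such a utility might do (purely illustrative; this does not exist in stdlib, and probing by allocation is slow and reflects available memory, not a fixed engine limit):

// Binary search for the largest Uint8Array the current engine will allocate.
function maxAllocatableTypedArrayLength() {
    var lo = 0;
    var hi = 9007199254740991; // 2**53 - 1, the spec upper bound
    var mid;
    while ( lo < hi ) {
        mid = lo + Math.ceil( ( hi - lo ) / 2 );
        try {
            new Uint8Array( mid );
            lo = mid; // allocation succeeded; try something bigger
        } catch ( err ) {
            hi = mid - 1; // allocation failed; try something smaller
        }
    }
    return lo;
}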

@domoritz
Contributor Author

domoritz commented Dec 12, 2023

I came across this while adding support for LargeUTF8 to Arrow in apache/arrow#35780. In these string vectors, we have one buffer for offsets and one for data. The offset buffer is a BigUint64Array, so offsets can exceed the 32-bit range we usually have in utf8 vectors. I totally agree that no one is really going to have $2^{53}-1$ elements (which, btw, is the same as Number.MAX_SAFE_INTEGER), but more than $2^{32}$ is totally reasonable. So I started implementing a chunked typed array that mostly implements the APIs of native typed arrays, and it seems to work reasonably well.

However, I was wondering whether someone had done something similar, and that's how I came across https://github.com/stdlib-js/constants-array-max-typed-array-length. So I'm not a user but was curious what the purpose of this package was. I emailed @Planeshifter about my chunked array questions, btw.
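
For illustration, the chunked approach looks roughly like this (a simplified sketch, not the actual implementation):

// A minimal chunked Uint8Array that spreads a logical length across several
// smaller buffers, so the logical length can exceed any single-buffer limit.
function ChunkedUint8Array( length, chunkSize ) {
    this.length = length;
    this._chunkSize = chunkSize || 2 ** 27; // 128 MiB chunks, arbitrary default
    this._chunks = [];
    var remaining = length;
    while ( remaining > 0 ) {
        var n = Math.min( remaining, this._chunkSize );
        this._chunks.push( new Uint8Array( n ) );
        remaining -= n;
    }
}

ChunkedUint8Array.prototype.get = function ( i ) {
    return this._chunks[ Math.floor( i / this._chunkSize ) ][ i % this._chunkSize ];
};

ChunkedUint8Array.prototype.set = function ( i, value ) {
    this._chunks[ Math.floor( i / this._chunkSize ) ][ i % this._chunkSize ] = value;
};

// e.g., var arr = new ChunkedUint8Array( 1000, 256 ); arr.set( 700, 7 ); arr.get( 700 ); // => 7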

I guess I don't really understand the purpose of that package when it's not actually a limit any engine supports and is way too large for any reasonable use case. It feels like the constant would never be useful in practice since real limits are hit much earlier.

@kgryte
Member

kgryte commented Dec 12, 2023

@domoritz The package (and associated value) is used for testing (a) whether something is a valid typed-array-like object and (b) whether a value is a valid typed array index. We do the same for checking for array-like objects, but there the limit is the max Int32.

While the typed array length limit will never be hit in practice, it is, at least according to the spec, allowed, and so when we check whether something is a more general collection, we are obligated to honor the spec. E.g., implementing a sparse array having length $2^{53}-1$ should be permitted, and, so long as it is typed-array-like (indexable and with a limited set of properties), our generalized APIs should work in (most) instances.
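
As a simplified illustration (not stdlib's actual implementation), the constant bounds checks along these lines:

var MAX_TYPED_ARRAY_LENGTH = 9007199254740991; // 2**53 - 1

// Returns true if a value looks like a typed array: an object with a
// nonnegative integer length no larger than the spec-permitted maximum.
function isTypedArrayLike( value ) {
    return (
        value !== null &&
        typeof value === 'object' &&
        typeof value.length === 'number' &&
        value.length >= 0 &&
        value.length === Math.floor( value.length ) &&
        value.length <= MAX_TYPED_ARRAY_LENGTH &&
        typeof value.BYTES_PER_ELEMENT === 'number'
    );
}

// Returns true if a value is a valid typed array index.
function isTypedArrayIndex( value ) {
    return (
        typeof value === 'number' &&
        value === Math.floor( value ) &&
        value >= 0 &&
        value < MAX_TYPED_ARRAY_LENGTH
    );
}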

@domoritz
Contributor Author

I see. For testing, that makes sense. Do you have plans for implementing an array type that can be larger than the current implementation-defined limits?

I'll close the issue since there isn't really a problem here.

@kgryte
Member

kgryte commented Dec 12, 2023

@domoritz No immediate plans, but also not opposed. I've toyed a bit with extending stdlib ndarrays in a manner similar to how Dask extends the concept of a NumPy ndarray.
