Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

remove unnecessary copies to reduce memory usage #11

Merged
merged 2 commits into from
Oct 14, 2024
Merged

Conversation

tmm1
Copy link

@tmm1 tmm1 commented Oct 14, 2024

tiktoken-node goes from 194mb -> 140mb as shown by heap -s <nodejs_pid> on macOS

tmm1 added 2 commits October 13, 2024 16:59
saves 36mb

before:
```
All zones: 2040325 nodes (194130656 bytes)

   COUNT      BYTES       AVG   CLASS_NAME                                        TYPE    BINARY
   =====      =====       ===   ==========                                        ====    ======
  922616   50154096      54.4   malloc in tiktoken::encoding::Encoding::new::h7d328f119207d7b8  C++     tiktoken-node.darwin-arm64.node
       7   23906096  3415156.5   malloc in hashbrown::raw::RawTable$LT$T$C$A$GT$::reserve_rehash::h393ae23f6aa25d82  C++     tiktoken-node.darwin-arm64.node
       4   23887872  5971968.0   malloc in hashbrown::raw::RawTable$LT$T$C$A$GT$::reserve_rehash::h95609d75e0ad1b85  C++     tiktoken-node.darwin-arm64.node
   17152   13309952     776.0   malloc in _$LT$regex_automata..meta..regex..Regex$u20$as$u20$core..clone..Clone$GT$::clone::hde024d24d6411e76  C++     tiktoken-node.darwin-arm64.node
```

after:
```
All zones: 1533995 nodes (158935920 bytes)

   COUNT      BYTES       AVG   CLASS_NAME                                        TYPE    BINARY
   =====      =====       ===   ==========                                        ====    ======
       7   23906096  3415156.5   malloc in hashbrown::raw::RawTable$LT$T$C$A$GT$::reserve_rehash::hf0fae7112435354d  C++     tiktoken-node.darwin-arm64.node
       4   23887872  5971968.0   malloc in hashbrown::raw::RawTable$LT$T$C$A$GT$::reserve_rehash::h50df8b64a2756414  C++     tiktoken-node.darwin-arm64.node
  461590   18764784      40.7   malloc in tiktoken::encoding::Encoding::new::hb3fc89062a07aa49  C++     tiktoken-node.darwin-arm64.node
```
we use a dedicated threadpool that's currenty 4 threads, so there's far too many slots

might also make sense to switch to real tls instead of the weird slots mechanism, since we know we're using real threads dedicated only to tokenization

saves about 18mb

before:
All zones: 1533995 nodes (158935920 bytes)

after:
All zones: 1501170 nodes (140258176 bytes)
@tmm1 tmm1 merged commit 7386c29 into main Oct 14, 2024
1 check passed
tmm1 added a commit that referenced this pull request Oct 17, 2024
partial revert of #11

was seeing weird perf regression in benchmarks

rust borrow checker makes it really hard for two things to own the same
data, so take the simple way out by letting CoreBPE own mergeable_ranks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant