tput improvements #13

Merged
merged 12 commits into from
Aug 20, 2024

Conversation

a10y
Contributor

@a10y a10y commented Aug 16, 2024

Improvements in throughput and allocations

  • Removed the old find_longest_symbol code path and rewrote compress_count on top of compress, giving a 5x speedup on the train benchmark
  • A couple of allocation tricks: replacing vec![Symbol::EMPTY; N] with vec![0u64; N] followed by a transmute, which saves N calls to Symbol::clone, and replacing the 2D vector with a 1D one (which lets us use the vec! specialization for creating a vector of all zeros)
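A minimal sketch of the zeroed-allocation trick described above, not the code from this PR. The `Symbol` type here is a hypothetical stand-in (a `#[repr(transparent)]` wrapper over `u64`), and `zeroed_symbols` and `flat_index` are illustrative names. The sketch rebuilds the vector with `Vec::from_raw_parts` rather than transmuting the `Vec` itself, since the layout of `Vec` is not guaranteed.

```rust
use std::mem::ManuallyDrop;

/// Hypothetical stand-in for the crate's symbol type: 8 bytes, same layout as u64.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(transparent)]
struct Symbol(u64);

impl Symbol {
    /// Assumption for this sketch: the empty symbol is the all-zero bit pattern.
    const EMPTY: Symbol = Symbol(0);
}

/// Build `n` empty symbols from a zeroed u64 buffer instead of cloning
/// `Symbol::EMPTY` n times.
fn zeroed_symbols(n: usize) -> Vec<Symbol> {
    // A vector of integer zeros can be requested from the allocator already zeroed.
    let mut zeros = ManuallyDrop::new(vec![0u64; n]);
    let (ptr, len, cap) = (zeros.as_mut_ptr(), zeros.len(), zeros.capacity());
    // SAFETY: Symbol is repr(transparent) over u64, so size and alignment match,
    // all-zero bits are a valid Symbol, and the buffer came from a Vec<u64>
    // whose ownership we took over via ManuallyDrop.
    unsafe { Vec::from_raw_parts(ptr.cast::<Symbol>(), len, cap) }
}

/// Index into a 1D buffer that stands in for a `rows x width` 2D table.
fn flat_index(row: usize, col: usize, width: usize) -> usize {
    row * width + col
}
```

With a single flat buffer, a `rows x width` table becomes `zeroed_symbols(rows * width)` and lookups go through `flat_index`, so there is one allocation instead of one per row.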

Here's the output of running both the throughput_fast and throughput_slow examples:

```
aduffy@DuffyProBook /V/C/fsst (aduffy/improve-throughput)> cargo run --release --example throughput_fast
    Finished `release` profile [optimized] target(s) in 0.03s
     Running `target/release/examples/throughput_fast`
building a simple symbol table
building new text array of 1073741824 bytes
beginning compression benchmark...
test completed
compression ratio: 3.7142857172507413
wall time = 1.323992917s
tput: 810987589.2938784 bytes/sec
aduffy@DuffyProBook /V/C/fsst (aduffy/improve-throughput)> cargo run --release --example throughput_slow
   Compiling fsst-rs v0.1.0 (/Volumes/Code/fsst)
    Finished `release` profile [optimized] target(s) in 0.15s
     Running `target/release/examples/throughput_slow`
building a simple symbol table
building new text array of 1073741824 bytes
beginning compression benchmark...
test completed
compression ratio: 0.5
wall time = 3.814265125s
tput: 281506866.6733019 bytes/sec
```

It seems that when the input produces a lot of escape codes, we're considerably slower (~3x by the numbers above) than when we get
a lot of code table hits. Need to dig into this.
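To see where a compression ratio of 0.5 can come from, here is a toy sketch, not the crate's compressor. It assumes single-byte symbols only and an escape marker of `0xFF`; `toy_compress`, `ratio`, and `ESCAPE` are hypothetical names. Every byte that misses the table costs two output bytes, so an input made entirely of misses doubles in size.

```rust
/// Escape marker used by this toy encoder (an assumption for the sketch).
const ESCAPE: u8 = 0xFF;

/// Toy encoder with single-byte symbols only: `table[b]` holds the code
/// assigned to byte `b`, or `None` if `b` has no code.
fn toy_compress(input: &[u8], table: &[Option<u8>; 256]) -> Vec<u8> {
    let mut out = Vec::with_capacity(input.len() * 2);
    for &b in input {
        match table[b as usize] {
            // Hit: one input byte becomes one code byte.
            Some(code) => out.push(code),
            // Miss: emit the escape marker followed by the literal byte.
            None => {
                out.push(ESCAPE);
                out.push(b);
            }
        }
    }
    out
}

/// Compression ratio as uncompressed size over compressed size.
fn ratio(input_len: usize, output_len: usize) -> f64 {
    input_len as f64 / output_len as f64
}
```

With an empty table, `toy_compress(b"abcd", &[None; 256])` yields 8 bytes, for a ratio of 0.5. The miss branch also does twice the writes per input byte, which is one plausible contributor to the slowdown, though that has not been measured here.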
@a10y

This comment was marked as outdated.

@@ -436,11 +459,11 @@ impl Compressor {

  // SAFETY: `end` will point just after the end of the `plaintext` slice.
  let in_end = unsafe { in_ptr.byte_add(plaintext.len()) };
- let in_end_sub8 = unsafe { in_end.byte_sub(8) };
+ let in_end_sub8 = in_end as usize - 8;
Contributor Author

miri caught that the old in_end_sub8 was technically a dangling ptr.

i never dereferenced it, but to make miri happy i now work with it as an address (usize) instead of as a pointer
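A small sketch of the address-based bound, under assumptions: `sum_words` is a hypothetical helper, not a function from this crate. It reads full 8-byte words from a slice and compares addresses as `usize`, so no out-of-bounds pointer is ever created when the input is shorter than 8 bytes. The sketch uses `saturating_sub` so that the subtraction cannot underflow for very small addresses, such as the pointer of an empty slice.

```rust
/// Hypothetical helper: sums every full 8-byte little-endian word in `data`,
/// ignoring any trailing bytes that do not fill a word.
fn sum_words(data: &[u8]) -> u64 {
    let mut ptr = data.as_ptr();
    // SAFETY: a one-past-the-end pointer into the same slice is allowed.
    let end = unsafe { ptr.add(data.len()) };
    // Keep the "last valid load position" as a plain address. Forming
    // `end.sub(8)` as a pointer would be out of bounds when data.len() < 8.
    let end_sub8 = (end as usize).saturating_sub(8);

    let mut total = 0u64;
    while (ptr as usize) <= end_sub8 {
        // SAFETY: ptr + 8 <= end, so 8 readable bytes remain.
        let word = unsafe { ptr.cast::<u64>().read_unaligned() };
        total = total.wrapping_add(u64::from_le(word));
        // SAFETY: the result is at most one past the end of the slice.
        ptr = unsafe { ptr.add(8) };
    }
    total
}
```

For a 3-byte input the loop body never runs, because the start address is already greater than `end_sub8`.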

@a10y a10y changed the title throughput is dependent on hits/misses tput improvements Aug 20, 2024
- Self {
-     bytes: [value, 0, 0, 0, 0, 0, 0, 0],
- }
+ Self { num: value as u64 }
Contributor Author

this seems to compile to the same thing in everything but debug mode (godbolt link), and it's probably just clearer this way anyway
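A sketch of the two constructors side by side, assuming the symbol type is a union of `[u8; 8]` and `u64`, as the field names `bytes` and `num` in the diff suggest. The method names are hypothetical. The two forms produce the same bits only on little-endian targets, where the first byte of the array is the least significant byte of the integer.

```rust
/// Assumed shape of the symbol type: 8 bytes viewed either as a byte array or as a u64.
#[derive(Clone, Copy)]
union Symbol {
    bytes: [u8; 8],
    num: u64,
}

impl Symbol {
    /// Old form: write the byte into slot 0 and zero the rest.
    fn from_u8_bytes(value: u8) -> Self {
        Self {
            bytes: [value, 0, 0, 0, 0, 0, 0, 0],
        }
    }

    /// New form: widen the byte to a u64 directly.
    fn from_u8_num(value: u8) -> Self {
        Self { num: value as u64 }
    }

    /// Read the symbol back as an integer.
    fn as_u64(self) -> u64 {
        // SAFETY: both fields are plain 8-byte data, so any bit pattern is a valid u64.
        unsafe { self.num }
    }
}
```

On a big-endian target `from_u8_bytes` would place the value in the most significant byte, so the integer form also states the intended value more directly.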

@a10y a10y marked this pull request as ready for review August 20, 2024 16:42
@a10y a10y merged commit 2d8db1a into develop Aug 20, 2024
2 checks passed
@a10y a10y deleted the aduffy/improve-throughput branch August 20, 2024 17:45
@github-actions github-actions bot mentioned this pull request Aug 20, 2024
a10y pushed a commit that referenced this pull request Aug 20, 2024
## 🤖 New release
* `fsst-rs`: 0.1.0 -> 0.2.0

<details><summary><i><b>Changelog</b></i></summary><p>

<blockquote>

## [0.2.0](v0.1.0...v0.2.0) - 2024-08-20

### Other
- tput improvements ([#13](#13))
</blockquote>


</p></details>

---
This PR was generated with
[release-plz](https://github.com/MarcoIeni/release-plz/).

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
@a10y a10y mentioned this pull request Aug 20, 2024