Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reserve before write_fmt for owned buffers #137762

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

thaliaarchi
Copy link
Contributor

@thaliaarchi thaliaarchi commented Feb 28, 2025

fmt::Arguments::estimated_capacity() is currently only used by fmt::format() to reserve the initial string capacity. Also use it in fmt::Write::write_fmt for String and OsString; and io::Write::write_fmt for Vec<u8>, VecDeque<u8>, Cursor<&mut Vec<u8>>, and Cursor<Vec<u8>>.

This may be worth checking perf.

@rustbot
Copy link
Collaborator

rustbot commented Feb 28, 2025

r? @tgross35

rustbot has assigned @tgross35.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Feb 28, 2025
@thaliaarchi
Copy link
Contributor Author

r? @workingjubilee

@rustbot rustbot assigned workingjubilee and unassigned tgross35 Feb 28, 2025
@workingjubilee
Copy link
Member

@bors try @rust-timer queue

@rust-timer

This comment has been minimized.

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Feb 28, 2025
bors added a commit to rust-lang-ci/rust that referenced this pull request Feb 28, 2025
…e-fmt, r=<try>

Reserve before `write_fmt`

`fmt::Arguments::estimated_capacity()` is currently only used for `fmt::format()` to reserve the initial string capacity. Also use it for `impl Write` for `Vec<u8>`, `VecDeque<u8>`, `Cursor<&mut Vec<u8>>`, and `Cursor<Vec<u8>>`.

This may be worth checking perf.
@bors
Copy link
Contributor

bors commented Feb 28, 2025

⌛ Trying commit 0e82180 with merge 9cdfd5d...

@bors
Copy link
Contributor

bors commented Feb 28, 2025

☀️ Try build successful - checks-actions
Build commit: 9cdfd5d (9cdfd5df41f0c63cf60717083c177a4af5071687)

@rust-timer

This comment has been minimized.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (9cdfd5d): comparison URL.

Overall result: no relevant changes - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

Results (primary 10.1%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
10.1% [10.1%, 10.1%] 1
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 10.1% [10.1%, 10.1%] 1

Cycles

This benchmark run did not return any relevant results for this metric.

Binary size

Results (primary 0.0%, secondary 0.1%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
0.1% [0.0%, 0.2%] 10
Regressions ❌
(secondary)
0.1% [0.1%, 0.2%] 37
Improvements ✅
(primary)
-0.1% [-0.2%, -0.0%] 7
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 0.0% [-0.2%, 0.2%] 17

Bootstrap: 770.19s -> 771.304s (0.14%)
Artifact size: 362.00 MiB -> 361.96 MiB (-0.01%)

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Feb 28, 2025
@thaliaarchi
Copy link
Contributor Author

thaliaarchi commented Feb 28, 2025

No regressions, so are we good?

Also, it probably shouldn't be rollup=never.

@thaliaarchi thaliaarchi force-pushed the io-optional-methods/write-fmt branch from 0e82180 to d8b6fe8 Compare February 28, 2025 09:14
@thaliaarchi
Copy link
Contributor Author

I had only considered the io::Write::write_fmt side of things, but it would also be useful to extend this to fmt::Write::write_fmt for String and OsString. I've pushed a commit for these.

@thaliaarchi
Copy link
Contributor Author

I think the fmt::Write side is more likely to have an impact. Let’s see.

@bors try @rust-timer queue

@bors
Copy link
Contributor

bors commented Feb 28, 2025

@thaliaarchi: 🔑 Insufficient privileges: not in try users

@rust-timer

This comment has been minimized.

@Noratrieb
Copy link
Member

rustdoc uses a lot of formatting, so this might have a positive impact there
@bors try @rust-timer queue

@rust-timer

This comment has been minimized.

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Feb 28, 2025
@bors
Copy link
Contributor

bors commented Feb 28, 2025

⌛ Trying commit d8b6fe8 with merge ba1e464...

bors added a commit to rust-lang-ci/rust that referenced this pull request Feb 28, 2025
…e-fmt, r=<try>

Reserve before `write_fmt`

`fmt::Arguments::estimated_capacity()` is currently only used by `fmt::format()` to reserve the initial string capacity. Also use it for `impl Write` for `Vec<u8>`, `VecDeque<u8>`, `Cursor<&mut Vec<u8>>`, and `Cursor<Vec<u8>>`.

This may be worth checking perf.
@bors
Copy link
Contributor

bors commented Feb 28, 2025

☀️ Try build successful - checks-actions
Build commit: ba1e464 (ba1e46459f988e4c6b39e6b34e656327b9e9fcd9)

@thaliaarchi thaliaarchi force-pushed the io-optional-methods/write-fmt branch from 9c2a9ae to 93f4a09 Compare March 2, 2025 03:16
@workingjubilee
Copy link
Member

I'm not entirely sure what you changed if anything in your last rebase, @thaliaarchi?

@thaliaarchi
Copy link
Contributor Author

thaliaarchi commented Mar 4, 2025

Since I rebased onto master at the same time, the changes got muddled; sorry. You can use git range-diff before...after to nicely compare rebases. The change was to use as_statically_known_str in io::Write::write_fmt like fmt::Write::write_fmt already does:

-    #[inline]
-    fn write_fmt(&mut self, fmt: fmt::Arguments<'_>) -> io::Result<()> {
-        self.reserve(fmt.estimated_capacity());
-        io::default_write_fmt(self, fmt)
+    fn write_fmt(&mut self, args: fmt::Arguments<'_>) -> io::Result<()> {
+        if let Some(s) = args.as_statically_known_str() {
+            self.write_all(s.as_bytes())
+        } else {
+            self.reserve(args.estimated_capacity());
+            io::default_write_fmt(self, args)
+        }
     }

I'm marking this PR as a draft for now, pending my investigation into a better heuristic for fmt::Arguments::estimated_capacity. My WIP hook for gathering stats is in my format-args-stats branch.

Since fmt::Arguments::estimated_capacity was barely used before, this PR essentially stress tests it for the first time.

@thaliaarchi thaliaarchi marked this pull request as draft March 4, 2025 01:02
jieyouxu added a commit to jieyouxu/rust that referenced this pull request Mar 5, 2025
…rsors, r=joboet

Override default `Write` methods for cursor-like types

Override the default `io::Write` methods for cursor-like types to provide more efficient versions.

Writes to resizable containers already write everything, so implement `write_all` and `write_all_vectored` in terms of those. For fixed-sized containers, cut out unnecessary error checking and looping for those same methods.

| `impl Write for T`              | `vectored` | `all` | `all_vectored` | `fmt`   |
| ------------------------------- | ---------- | ----- | -------------- | ------- |
| `&mut [u8]`                     | Y          | Y     | new            |         |
| `Vec<u8>`                       | Y          | Y     | new            | rust-lang#137762 |
| `VecDeque<u8>`                  | Y          | Y     | new            | rust-lang#137762 |
| `std::io::Cursor<&mut [u8]>`    | Y          | new   | new            |         |
| `std::io::Cursor<&mut Vec<u8>>` | Y          | new   | new            | rust-lang#137762 |
| `std::io::Cursor<Vec<u8>>`      | Y          | new   | new            | rust-lang#137762 |
| `std::io::Cursor<Box<[u8]>>`    | Y          | new   | new            |         |
| `std::io::Cursor<[u8; N]>`      | Y          | new   | new            |         |
| `core::io::BorrowedCursor<'_>`  | new        | new   | new            |         |

Tracked in rust-lang#136756.

# Open questions

Is it guaranteed by `Write::write_all` that the maximal write is performed when not everything can be written? Its documentation describes the behavior of the default implementation, which writes until a 0-length write is encountered, thus implying that a maximal write is expected. In contrast, `Read::read_exact` declares that the contents of the buffer are unspecified for short reads. If it were allowed, these cursor-like types could bail on the write altogether if it has insufficient capacity.
jieyouxu added a commit to jieyouxu/rust that referenced this pull request Mar 5, 2025
…rsors, r=joboet

Override default `Write` methods for cursor-like types

Override the default `io::Write` methods for cursor-like types to provide more efficient versions.

Writes to resizable containers already write everything, so implement `write_all` and `write_all_vectored` in terms of those. For fixed-sized containers, cut out unnecessary error checking and looping for those same methods.

| `impl Write for T`              | `vectored` | `all` | `all_vectored` | `fmt`   |
| ------------------------------- | ---------- | ----- | -------------- | ------- |
| `&mut [u8]`                     | Y          | Y     | new            |         |
| `Vec<u8>`                       | Y          | Y     | new            | rust-lang#137762 |
| `VecDeque<u8>`                  | Y          | Y     | new            | rust-lang#137762 |
| `std::io::Cursor<&mut [u8]>`    | Y          | new   | new            |         |
| `std::io::Cursor<&mut Vec<u8>>` | Y          | new   | new            | rust-lang#137762 |
| `std::io::Cursor<Vec<u8>>`      | Y          | new   | new            | rust-lang#137762 |
| `std::io::Cursor<Box<[u8]>>`    | Y          | new   | new            |         |
| `std::io::Cursor<[u8; N]>`      | Y          | new   | new            |         |
| `core::io::BorrowedCursor<'_>`  | new        | new   | new            |         |

Tracked in rust-lang#136756.

# Open questions

Is it guaranteed by `Write::write_all` that the maximal write is performed when not everything can be written? Its documentation describes the behavior of the default implementation, which writes until a 0-length write is encountered, thus implying that a maximal write is expected. In contrast, `Read::read_exact` declares that the contents of the buffer are unspecified for short reads. If it were allowed, these cursor-like types could bail on the write altogether if it has insufficient capacity.
workingjubilee added a commit to workingjubilee/rustc that referenced this pull request Mar 7, 2025
…rsors, r=joboet

Override default `Write` methods for cursor-like types

Override the default `io::Write` methods for cursor-like types to provide more efficient versions.

Writes to resizable containers already write everything, so implement `write_all` and `write_all_vectored` in terms of those. For fixed-sized containers, cut out unnecessary error checking and looping for those same methods.

| `impl Write for T`              | `vectored` | `all` | `all_vectored` | `fmt`   |
| ------------------------------- | ---------- | ----- | -------------- | ------- |
| `&mut [u8]`                     | Y          | Y     | new            |         |
| `Vec<u8>`                       | Y          | Y     | new            | rust-lang#137762 |
| `VecDeque<u8>`                  | Y          | Y     | new            | rust-lang#137762 |
| `std::io::Cursor<&mut [u8]>`    | Y          | new   | new            |         |
| `std::io::Cursor<&mut Vec<u8>>` | Y          | new   | new            | rust-lang#137762 |
| `std::io::Cursor<Vec<u8>>`      | Y          | new   | new            | rust-lang#137762 |
| `std::io::Cursor<Box<[u8]>>`    | Y          | new   | new            |         |
| `std::io::Cursor<[u8; N]>`      | Y          | new   | new            |         |
| `core::io::BorrowedCursor<'_>`  | new        | new   | new            |         |

Tracked in rust-lang#136756.

# Open questions

Is it guaranteed by `Write::write_all` that the maximal write is performed when not everything can be written? Its documentation describes the behavior of the default implementation, which writes until a 0-length write is encountered, thus implying that a maximal write is expected. In contrast, `Read::read_exact` declares that the contents of the buffer are unspecified for short reads. If it were allowed, these cursor-like types could bail on the write altogether if it has insufficient capacity.
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Mar 7, 2025
…rsors, r=joboet

Override default `Write` methods for cursor-like types

Override the default `io::Write` methods for cursor-like types to provide more efficient versions.

Writes to resizable containers already write everything, so implement `write_all` and `write_all_vectored` in terms of those. For fixed-sized containers, cut out unnecessary error checking and looping for those same methods.

| `impl Write for T`              | `vectored` | `all` | `all_vectored` | `fmt`   |
| ------------------------------- | ---------- | ----- | -------------- | ------- |
| `&mut [u8]`                     | Y          | Y     | new            |         |
| `Vec<u8>`                       | Y          | Y     | new            | rust-lang#137762 |
| `VecDeque<u8>`                  | Y          | Y     | new            | rust-lang#137762 |
| `std::io::Cursor<&mut [u8]>`    | Y          | new   | new            |         |
| `std::io::Cursor<&mut Vec<u8>>` | Y          | new   | new            | rust-lang#137762 |
| `std::io::Cursor<Vec<u8>>`      | Y          | new   | new            | rust-lang#137762 |
| `std::io::Cursor<Box<[u8]>>`    | Y          | new   | new            |         |
| `std::io::Cursor<[u8; N]>`      | Y          | new   | new            |         |
| `core::io::BorrowedCursor<'_>`  | new        | new   | new            |         |

Tracked in rust-lang#136756.

# Open questions

Is it guaranteed by `Write::write_all` that the maximal write is performed when not everything can be written? Its documentation describes the behavior of the default implementation, which writes until a 0-length write is encountered, thus implying that a maximal write is expected. In contrast, `Read::read_exact` declares that the contents of the buffer are unspecified for short reads. If it were allowed, these cursor-like types could bail on the write altogether if it has insufficient capacity.
rust-timer added a commit to rust-lang-ci/rust that referenced this pull request Mar 7, 2025
Rollup merge of rust-lang#137107 - thaliaarchi:io-optional-methods/cursors, r=joboet

Override default `Write` methods for cursor-like types

Override the default `io::Write` methods for cursor-like types to provide more efficient versions.

Writes to resizable containers already write everything, so implement `write_all` and `write_all_vectored` in terms of those. For fixed-sized containers, cut out unnecessary error checking and looping for those same methods.

| `impl Write for T`              | `vectored` | `all` | `all_vectored` | `fmt`   |
| ------------------------------- | ---------- | ----- | -------------- | ------- |
| `&mut [u8]`                     | Y          | Y     | new            |         |
| `Vec<u8>`                       | Y          | Y     | new            | rust-lang#137762 |
| `VecDeque<u8>`                  | Y          | Y     | new            | rust-lang#137762 |
| `std::io::Cursor<&mut [u8]>`    | Y          | new   | new            |         |
| `std::io::Cursor<&mut Vec<u8>>` | Y          | new   | new            | rust-lang#137762 |
| `std::io::Cursor<Vec<u8>>`      | Y          | new   | new            | rust-lang#137762 |
| `std::io::Cursor<Box<[u8]>>`    | Y          | new   | new            |         |
| `std::io::Cursor<[u8; N]>`      | Y          | new   | new            |         |
| `core::io::BorrowedCursor<'_>`  | new        | new   | new            |         |

Tracked in rust-lang#136756.

# Open questions

Is it guaranteed by `Write::write_all` that the maximal write is performed when not everything can be written? Its documentation describes the behavior of the default implementation, which writes until a 0-length write is encountered, thus implying that a maximal write is expected. In contrast, `Read::read_exact` declares that the contents of the buffer are unspecified for short reads. If it were allowed, these cursor-like types could bail on the write altogether if it has insufficient capacity.
@bors
Copy link
Contributor

bors commented Mar 7, 2025

☔ The latest upstream changes (presumably #138151) made this pull request unmergeable. Please resolve the merge conflicts.

@thaliaarchi thaliaarchi force-pushed the io-optional-methods/write-fmt branch from 93f4a09 to b86d755 Compare March 7, 2025 18:48
github-actions bot pushed a commit to model-checking/verify-rust-std that referenced this pull request Mar 14, 2025
…rsors, r=joboet

Override default `Write` methods for cursor-like types

Override the default `io::Write` methods for cursor-like types to provide more efficient versions.

Writes to resizable containers already write everything, so implement `write_all` and `write_all_vectored` in terms of those. For fixed-sized containers, cut out unnecessary error checking and looping for those same methods.

| `impl Write for T`              | `vectored` | `all` | `all_vectored` | `fmt`   |
| ------------------------------- | ---------- | ----- | -------------- | ------- |
| `&mut [u8]`                     | Y          | Y     | new            |         |
| `Vec<u8>`                       | Y          | Y     | new            | rust-lang#137762 |
| `VecDeque<u8>`                  | Y          | Y     | new            | rust-lang#137762 |
| `std::io::Cursor<&mut [u8]>`    | Y          | new   | new            |         |
| `std::io::Cursor<&mut Vec<u8>>` | Y          | new   | new            | rust-lang#137762 |
| `std::io::Cursor<Vec<u8>>`      | Y          | new   | new            | rust-lang#137762 |
| `std::io::Cursor<Box<[u8]>>`    | Y          | new   | new            |         |
| `std::io::Cursor<[u8; N]>`      | Y          | new   | new            |         |
| `core::io::BorrowedCursor<'_>`  | new        | new   | new            |         |

Tracked in rust-lang#136756.

# Open questions

Is it guaranteed by `Write::write_all` that the maximal write is performed when not everything can be written? Its documentation describes the behavior of the default implementation, which writes until a 0-length write is encountered, thus implying that a maximal write is expected. In contrast, `Read::read_exact` declares that the contents of the buffer are unspecified for short reads. If it were allowed, these cursor-like types could bail on the write altogether if it has insufficient capacity.
@thaliaarchi thaliaarchi force-pushed the io-optional-methods/write-fmt branch from b86d755 to a1c03f3 Compare March 21, 2025 19:41
@thaliaarchi
Copy link
Contributor Author

Now that #138650 has merged, so the non-reserve part of this PR is in master, I want to see more clearly where we're at. Can we do another perf run?

@Noratrieb
Copy link
Member

@bors try @rust-timer queue

@rust-timer

This comment has been minimized.

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Mar 21, 2025
bors added a commit to rust-lang-ci/rust that referenced this pull request Mar 21, 2025
…e-fmt, r=<try>

Reserve before `write_fmt` for owned buffers

`fmt::Arguments::estimated_capacity()` is currently only used by `fmt::format()` to reserve the initial string capacity. Also use it in `fmt::Write::write_fmt` for `String` and `OsString`; and `io::Write::write_fmt` for `Vec<u8>`, `VecDeque<u8>`, `Cursor<&mut Vec<u8>>`, and `Cursor<Vec<u8>>`.

This may be worth checking perf.
@bors
Copy link
Contributor

bors commented Mar 21, 2025

⌛ Trying commit a1c03f3 with merge 464aedc...

@bors
Copy link
Contributor

bors commented Mar 21, 2025

☀️ Try build successful - checks-actions
Build commit: 464aedc (464aedc3ca7d5137fe4f7c3c4b75302931e56aaf)

@rust-timer

This comment has been minimized.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (464aedc): comparison URL.

Overall result: ❌ regressions - please read the text below

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.

mean range count
Regressions ❌
(primary)
0.6% [0.2%, 1.1%] 8
Regressions ❌
(secondary)
0.8% [0.8%, 0.8%] 1
Improvements ✅
(primary)
-0.4% [-0.4%, -0.4%] 1
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 0.5% [-0.4%, 1.1%] 9

Max RSS (memory usage)

Results (primary -3.2%, secondary -3.2%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-3.2% [-3.8%, -2.5%] 4
Improvements ✅
(secondary)
-3.2% [-3.5%, -2.9%] 2
All ❌✅ (primary) -3.2% [-3.8%, -2.5%] 4

Cycles

Results (primary 1.1%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
1.1% [1.1%, 1.1%] 1
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 1.1% [1.1%, 1.1%] 1

Binary size

Results (primary 0.1%, secondary 0.3%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
0.4% [0.1%, 1.3%] 9
Regressions ❌
(secondary)
0.3% [0.0%, 0.4%] 38
Improvements ✅
(primary)
-0.1% [-0.1%, -0.0%] 14
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 0.1% [-0.1%, 1.3%] 23

Bootstrap: 775.351s -> 776.139s (0.10%)
Artifact size: 365.55 MiB -> 365.53 MiB (-0.01%)

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Mar 21, 2025
@thaliaarchi
Copy link
Contributor Author

The max RSS results look promising. Since this aims to reduce memory usage, maybe that overrules the other results? I'm not very experienced at interpreting perf results.

@thaliaarchi thaliaarchi force-pushed the io-optional-methods/write-fmt branch from a1c03f3 to 295e573 Compare March 22, 2025 00:16
@thaliaarchi
Copy link
Contributor Author

I dropped my commit which assumed that format_args! pieces cannot overflow isize::MAX, since I found a counterexample in #138811.

@thaliaarchi thaliaarchi marked this pull request as ready for review March 22, 2025 00:18
Reserve before formatting with `fmt::Arguments::estimated_capacity()` in
`fmt::Write::write_fmt` and `io::Write::write_fmt` implementations for
owned buffer types.

Adding `#[inline]` to `write_fmt` shows minor perf regressions, so leave
it off like the default impl.
@thaliaarchi thaliaarchi force-pushed the io-optional-methods/write-fmt branch from 295e573 to b0a8187 Compare March 22, 2025 08:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
perf-regression Performance regression. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants