Remove unreachable #8

hkBst · 2025-05-14T10:27:47Z

This improves the API of this crate to not use unreachable any more and is the continuation of rust-lang/rust#138163.

It also eliminates internal unreachable by inlining Mode methods into the *_common functions, and eliminates the resulting duplication by using traits instead. The trait-based version is much more verbose than a macro-based variant, because traits are very explicit, but hopefully it is also a bit clearer.
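
To illustrate the idea in general terms, here is a minimal sketch of replacing a runtime mode switch that needs `unreachable!()` with a trait that has one implementation per literal kind. Every name in it (`check_unit_old`, `CheckUnit`, `Str`, `ByteStr`, `check_all`) is hypothetical and made up for this illustration; it is not the crate's actual code or API.

```rust
/// Old style (sketch): a runtime `Mode` picks the behaviour, and a mode the
/// function cannot handle ends in `unreachable!()`.
#[derive(Clone, Copy)]
enum Mode {
    Str,
    ByteStr,
    CStr,
}

fn check_unit_old(c: char, mode: Mode) -> Result<char, &'static str> {
    match mode {
        Mode::Str => Ok(c),
        Mode::ByteStr => {
            if c.is_ascii() {
                Ok(c)
            } else {
                Err("non-ASCII char in byte string")
            }
        }
        // Callers must simply know not to pass this mode.
        Mode::CStr => unreachable!("this sketch has no result type for C strings"),
    }
}

/// Trait style (sketch): each literal kind is a type with its own unit type,
/// so an unsupported combination cannot even be written down.
trait CheckUnit {
    type Unit;
    fn check(c: char) -> Result<Self::Unit, &'static str>;
}

struct Str;
struct ByteStr;

impl CheckUnit for Str {
    type Unit = char;
    fn check(c: char) -> Result<char, &'static str> {
        Ok(c)
    }
}

impl CheckUnit for ByteStr {
    type Unit = u8;
    fn check(c: char) -> Result<u8, &'static str> {
        if c.is_ascii() {
            Ok(c as u8)
        } else {
            Err("non-ASCII char in byte string")
        }
    }
}

/// Shared driver, written once and instantiated per literal kind.
fn check_all<K: CheckUnit>(src: &str) -> Vec<Result<K::Unit, &'static str>> {
    src.chars().map(K::check).collect()
}
```

With this shape, `check_all::<ByteStr>("aé")` yields `u8` units directly, whereas the `Mode` version has to return `char` for every mode and rely on callers never passing `Mode::CStr`.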

I've tried to use separate commits to explain the story, but have probably only succeeded at the beginning.

There is a companion PR to use this new API here: rust-lang/rust#140999

r? @nnethercote

GuillaumeGomez · 2025-05-14T15:38:01Z

Gonna take a look later on.

nnethercote · 2025-05-21T02:56:22Z

Thanks for splitting this into multiple pieces. I think it's good that @GuillaumeGomez will look at this; it will be good to get another pair of eyes on it after rust-lang/rust#138163.

GuillaumeGomez · 2025-05-21T15:44:44Z

Changes look good to me. Now comes the not so fun question: can you add benchmarks please?

hkBst · 2025-05-21T15:57:40Z

> Changes look good to me. Now comes the not so fun question: can you add benchmarks please?

Are you thinking about numbers (before the crate was split off, the results for the macro variant of this change looked like this) or about benchmark code for this crate? I'm happy to come up with some benchmark code.

GuillaumeGomez · 2025-05-21T16:00:26Z

Mostly code, I can check locally when done. Considering it'll impact performance, better check ahead of time. We just need to check the entry functions.

hkBst · 2025-05-27T10:41:32Z

Benchmarks PR: #9

GuillaumeGomez · 2025-05-27T19:13:22Z

src/lib.rs

+/// Takes the contents of a literal (without quotes)
+/// and produces a sequence of errors,
+/// which are returned by invoking `error_callback`.
+pub fn unescape_for_errors(


What is the purpose of this function? It does the conversion but doesn't actually make use of it. It only provides information if one error occurred. Where is it meant to be used?

Its only purpose is to be used in the companion PR using the new API here: https://github.com/rust-lang/rust/pull/140999/files#diff-36d0ff95049fa1b66bdd47ec2c03e1588268303571a9561d1ba664ca29034dacR1019-R1049.

It seemed like a good compromise to remove a stubborn use of unescape_{unicode,mixed} and signals intent well. I suppose it could alternatively live where it is used instead of here, except for its use of unescape_single...

Oh I see, it only checks if there is an error. It doesn't need the unescaped content. Then let me make a suggestion for its documentation.

Right, it is like the other unescape_* functions, but it only gives you the error results and not the Oks. That's why I named it unescape_for_errors. Was the name not a good indication of this behavior?

Not really. Always open to interpretation. I needed your comment to understand why this function was working this way. Documentation is here to clarify that.

GuillaumeGomez · 2025-05-27T19:30:42Z

Also need to update the benchmarks.

GuillaumeGomez · 2025-05-28T13:48:15Z

src/lib.rs

+/// Takes the contents of a literal (without quotes)
+/// and produces a sequence of errors,
+/// which are returned by invoking `error_callback`.


Suggested change

-/// Takes the contents of a literal (without quotes)
-/// and produces a sequence of errors,
-/// which are returned by invoking `error_callback`.
+/// Takes the contents of a literal (without quotes) and calls `error_callback` if any error is encountered
+/// while unescaping it. Please note that the unescaped content is not provided, this function is only meant
+/// to be used to confirm whether or not the literal content is (in)valid.

I've renamed this to check_for_errors and improved its docs. Also took the chance to polish up some of the other doc comments. Let me know what you think.
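
The "errors only" behaviour discussed in this thread can be sketched as a thin wrapper around a full unescaper: it runs the same scan but forwards only the `Err` results. This is a toy illustration under stated assumptions. `toy_unescape`, `toy_check_for_errors`, `collect_errors`, and `ToyError` are hypothetical names, and the toy handles only `\n` and `\\`; the crate's real function and error type are different.

```rust
use std::ops::Range;

/// Toy error type for this sketch; not the crate's `EscapeError`.
#[derive(Debug, PartialEq)]
enum ToyError {
    LoneSlash,
    UnknownEscape,
}

/// Toy unescaper: understands only `\n` and `\\`, and reports every unit
/// (or error) together with its byte range in `src`.
fn toy_unescape(src: &str, callback: &mut impl FnMut(Range<usize>, Result<char, ToyError>)) {
    let mut chars = src.char_indices();
    while let Some((start, c)) = chars.next() {
        if c != '\\' {
            callback(start..start + c.len_utf8(), Ok(c));
            continue;
        }
        match chars.next() {
            Some((_, 'n')) => callback(start..start + 2, Ok('\n')),
            Some((_, '\\')) => callback(start..start + 2, Ok('\\')),
            Some((i, other)) => {
                callback(start..i + other.len_utf8(), Err(ToyError::UnknownEscape))
            }
            None => callback(start..start + 1, Err(ToyError::LoneSlash)),
        }
    }
}

/// Error-only wrapper: same scan, but successfully unescaped units are dropped.
fn toy_check_for_errors(src: &str, error_callback: &mut impl FnMut(Range<usize>, ToyError)) {
    toy_unescape(src, &mut |range, res| {
        if let Err(e) = res {
            error_callback(range, e);
        }
    });
}

/// Convenience helper that gathers all reported errors into a vector.
fn collect_errors(src: &str) -> Vec<(Range<usize>, ToyError)> {
    let mut errors = Vec::new();
    toy_check_for_errors(src, &mut |range, e| errors.push((range, e)));
    errors
}
```

A caller that only wants to validate a literal can use the wrapper and never has to allocate or inspect the unescaped content.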

GuillaumeGomez · 2025-05-29T14:30:59Z

API changes look good to me. Let me check benches now and then I think we're ready to go. :)

GuillaumeGomez · 2025-05-29T21:29:43Z

Here are the bench results:

| test name | main branch | new changes | diff |
|-|-|-|-|
| bench_check_raw_byte_str | 25,941.68 ns/iter (+/- 57.51) | 48,097.89 ns/iter (+/- 104.59) | +85% |
| bench_check_raw_c_str_ascii | 24,387.77 ns/iter (+/- 83.40) | 39,709.25 ns/iter (+/- 116.80) | +62% |
| bench_check_raw_c_str_unicode | 40,485.60 ns/iter (+/- 181.36) | 50,116.99 ns/iter (+/- 115.94) | +23.7% |
| bench_check_raw_str_ascii | 25,679.40 ns/iter (+/- 68.78) | 39,727.86 ns/iter (+/- 137.46) | +54.7% |
| bench_check_raw_str_unicode | 42,031.95 ns/iter (+/- 199.86) | 47,429.04 ns/iter (+/- 121.53) | +12.8% |
| bench_skip_ascii_whitespace | 6,321.42 ns/iter (+/- 70.54) | 6,322.61 ns/iter (+/- 20.25) | 0% |
| bench_unescape_byte_str_ascii | 64,825.30 ns/iter (+/- 68.90) | 57,407.75 ns/iter (+/- 272.16) | -11.4% |
| bench_unescape_byte_str_hex | 95,602.13 ns/iter (+/- 214.39) | 85,577.19 ns/iter (+/- 188.78) | -10.4% |
| bench_unescape_byte_str_trivial | 34,165.35 ns/iter (+/- 114.53) | 52,296.49 ns/iter (+/- 260.71) | +53% |
| bench_unescape_c_str_ascii | 54,627.06 ns/iter (+/- 76.32) | 48,477.48 ns/iter (+/- 214.53) | -10.7% |
| bench_unescape_c_str_hex_ascii | 94,049.71 ns/iter (+/- 192.47) | 86,601.92 ns/iter (+/- 226.26) | -7.9% |
| bench_unescape_c_str_hex_byte | 94,030.54 ns/iter (+/- 170.87) | 86,285.94 ns/iter (+/- 129.85) | -7.9% |
| bench_unescape_c_str_trivial | 44,069.69 ns/iter (+/- 63.67) | 37,109.07 ns/iter (+/- 89.56) | -15.8% |
| bench_unescape_c_str_unicode | 183,698.16 ns/iter (+/- 236.18) | 165,312.95 ns/iter (+/- 167.56) | -10% |
| bench_unescape_str_ascii | 64,803.76 ns/iter (+/- 88.18) | 66,926.66 ns/iter (+/- 228.30) | +3.3% |
| bench_unescape_str_hex | 95,642.76 ns/iter (+/- 145.34) | 102,020.68 ns/iter (+/- 140.15) | +6.7% |
| bench_unescape_str_trivial | 34,071.43 ns/iter (+/- 54.39) | 46,578.22 ns/iter (+/- 91.29) | +36.7% |
| bench_unescape_str_unicode | 185,521.24 ns/iter (+/- 228.33) | 180,438.70 ns/iter (+/- 428.62) | -2.7% |

Overall, the check* functions are much slower. The unescape* results are mixed: some small gains, but also some big regressions. More work is required to improve everything, we cannot merge as is.

hkBst · 2025-05-30T07:43:33Z

Interesting! Is there an easy way to create such a nice table?

GuillaumeGomez · 2025-05-30T07:54:39Z

Sadly no. I ran the benches many times on both main and your branch, kept the run with the lowest +/- margin for each bench, and finally computed the diff for all of them. So I recommend you do the same for main and then you can check with your branch.
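
The procedure described above (keep the least noisy run per bench, then compute the relative difference) can be sketched as follows. `Sample`, `least_noisy`, and `percent_diff` are hypothetical names chosen for this illustration, and the diff formula is the plain ratio of the two ns/iter values.

```rust
/// One benchmark measurement as printed by `cargo bench`:
/// the ns/iter value and its "+/-" margin.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Sample {
    ns_per_iter: f64,
    margin: f64,
}

/// Picks the run with the smallest margin, i.e. the least noisy one.
/// Returns `None` when there are no runs.
fn least_noisy(runs: &[Sample]) -> Option<Sample> {
    runs.iter().copied().min_by(|a, b| {
        a.margin
            .partial_cmp(&b.margin)
            .expect("margins must not be NaN")
    })
}

/// Relative change of `new` versus `current`, in percent.
/// Positive means slower, negative means faster.
fn percent_diff(current: Sample, new: Sample) -> f64 {
    new.ns_per_iter * 100.0 / current.ns_per_iter - 100.0
}
```

Feeding in the `bench_check_raw_byte_str` row from the table (25,941.68 ns/iter on main, 48,097.89 ns/iter on the branch) gives roughly +85%, matching the reported diff.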

hkBst · 2025-05-30T13:02:18Z

Taking the worst offender (bench_check_raw_byte_str), if I manually inline bench_check_raw (and so get rid of &mut dyn FnMut), then a lot (but not all) of the slowdown disappears. Unfortunately just annotating with #[inline(always)] does nothing.

On the other hand, if I make the main branch use the newer more generic bench_check_raw (which includes adding + ?Sized bounds to unescape_unicode and *_common), then it becomes just as slow as the new code.

Maybe I should rewrite the benchmarks as macros, to minimize such issues...

diff --git a/benches/benches.rs b/benches/benches.rs
index a028dfd..1100832 100644
--- a/benches/benches.rs
+++ b/benches/benches.rs
@@ -3,7 +3,9 @@
 extern crate test;
 
 use rustc_literal_escaper::*;
+use std::fmt::Debug;
 use std::iter::repeat_n;
+use std::ops::Range;
 
 const LEN: usize = 10_000;
 
@@ -37,6 +39,24 @@ fn bench_skip_ascii_whitespace(b: &mut test::Bencher) {
 // Check raw
 //
 
+#[allow(clippy::type_complexity)]
+fn new_bench_check_raw<UNIT: Into<char> + PartialEq + Debug + Copy>(
+    b: &mut test::Bencher,
+    c: UNIT,
+    check_raw: fn(&str, &mut dyn FnMut(Range<usize>, Result<UNIT, EscapeError>)),
+) {
+    let input: String = test::black_box(repeat_n(c.into(), LEN).collect());
+    assert_eq!(input.len(), LEN * c.into().len_utf8());
+
+    b.iter(|| {
+        let mut output = vec![];
+
+        check_raw(&input, &mut |range, res| output.push((range, res)));
+        assert_eq!(output.len(), LEN);
+        assert_eq!(output[0], (0..c.into().len_utf8(), Ok(c)));
+    });
+}
+
 fn bench_check_raw(b: &mut test::Bencher, c: char, mode: Mode) {
     let input: String = test::black_box(repeat_n(c, LEN).collect());
     assert_eq!(input.len(), LEN * c.len_utf8());
@@ -64,7 +84,20 @@ fn bench_check_raw_str_unicode(b: &mut test::Bencher) {
 
 #[bench]
 fn bench_check_raw_byte_str(b: &mut test::Bencher) {
-    bench_check_raw(b, 'a', Mode::RawByteStr);
+    //    bench_check_raw(b, 'a', Mode::RawByteStr);
+
+    new_bench_check_raw(b, 'a', |s, cb| unescape_unicode(s, Mode::RawByteStr, cb));
+
+    // let input: String = test::black_box(repeat_n('a', LEN).collect());
+    // assert_eq!(input.len(), LEN * 'a'.len_utf8());
+
+    // b.iter(|| {
+    //     let mut output = vec![];
+
+    //     check_raw_byte_str(&input, &mut |range, res| output.push((range, res)));
+    //     assert_eq!(output.len(), LEN);
+    //     assert_eq!(output[0], (0..1, Ok(b'a')));
+    // });
 }
 
 // raw C str
diff --git a/src/lib.rs b/src/lib.rs
index d315ed2..c381032 100644
--- a/src/lib.rs
+++ b/src/lib.rs
@@ -87,7 +87,7 @@ impl EscapeError {
 /// the callback will be called exactly once.
 pub fn unescape_unicode<F>(src: &str, mode: Mode, callback: &mut F)
 where
-    F: FnMut(Range<usize>, Result<char, EscapeError>),
+    F: FnMut(Range<usize>, Result<char, EscapeError>) + ?Sized,
 {
     match mode {
         Char | Byte => {
@@ -357,7 +357,7 @@ fn unescape_char_or_byte(chars: &mut Chars<'_>, mode: Mode) -> Result<char, Esca
 /// sequence of escaped characters or errors.
 fn unescape_non_raw_common<F, T: From<char> + From<u8>>(src: &str, mode: Mode, callback: &mut F)
 where
-    F: FnMut(Range<usize>, Result<T, EscapeError>),
+    F: FnMut(Range<usize>, Result<T, EscapeError>) + ?Sized,
 {
     let mut chars = src.chars();
     let allow_unicode_chars = mode.allow_unicode_chars(); // get this outside the loop
@@ -424,7 +424,7 @@ where
 /// only produce errors on bare CR.
 fn check_raw_common<F>(src: &str, mode: Mode, callback: &mut F)
 where
-    F: FnMut(Range<usize>, Result<char, EscapeError>),
+    F: FnMut(Range<usize>, Result<char, EscapeError>) + ?Sized,
 {
     let mut chars = src.chars();
     let allow_unicode_chars = mode.allow_unicode_chars(); // get this outside the loop
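
As a minimal sketch of what the `+ ?Sized` bound in the diff above allows: a function that is generic over `F: FnMut(..) + ?Sized` and takes `&mut F` accepts both a concrete closure (static dispatch) and a `&mut dyn FnMut(..)` trait object (dynamic dispatch). The names `for_each_char`, `count_static`, and `ranges_dynamic` are hypothetical stand-ins, not functions from the crate, and the sketch makes no claim about the size of the performance difference.

```rust
use std::ops::Range;

/// Hypothetical callback-driven scanner; stands in for the crate's
/// `*_common` functions only in the shape of its signature.
fn for_each_char<F>(src: &str, callback: &mut F)
where
    // `?Sized` lets `F` be the unsized type `dyn FnMut(..)` as well.
    F: FnMut(Range<usize>, char) + ?Sized,
{
    for (start, c) in src.char_indices() {
        callback(start..start + c.len_utf8(), c);
    }
}

/// Static dispatch: `F` is the closure's own type, so the call can be inlined.
fn count_static(src: &str) -> usize {
    let mut count = 0;
    for_each_char(src, &mut |_range, _c| count += 1);
    count
}

/// Dynamic dispatch: `F` is `dyn FnMut(..)`, so every call goes through a vtable.
fn ranges_dynamic(src: &str) -> Vec<Range<usize>> {
    let mut ranges = Vec::new();
    {
        let mut record = |range: Range<usize>, _c: char| ranges.push(range);
        let erased: &mut dyn FnMut(Range<usize>, char) = &mut record;
        for_each_char(src, erased);
    }
    ranges
}
```

Without the `?Sized` bound, the second call would not compile, because a type parameter is implicitly `Sized` and `dyn FnMut(..)` is not.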

hkBst · 2025-05-31T14:40:27Z

With reduced benchmark overhead commit:

Original (main):
test bench_check_raw_byte_str ... bench: 25,216.16 ns/iter (+/- 372.96)
test bench_check_raw_c_str_ascii ... bench: 26,598.25 ns/iter (+/- 113.87)
test bench_check_raw_c_str_unicode ... bench: 40,294.48 ns/iter (+/- 80.19)
test bench_check_raw_str_ascii ... bench: 25,423.57 ns/iter (+/- 485.16)
test bench_check_raw_str_unicode ... bench: 42,309.82 ns/iter (+/- 1,200.37)

Remove Unreachable:
test bench_check_raw_byte_str_ascii ... bench: 26,611.50 ns/iter (+/- 191.69) (4% slower)
test bench_check_raw_c_str_ascii ... bench: 26,445.98 ns/iter (+/- 294.15) (same)
test bench_check_raw_c_str_unicode ... bench: 35,513.69 ns/iter (+/- 211.00) (12% faster)
test bench_check_raw_str_ascii ... bench: 28,275.46 ns/iter (+/- 78.26) (10% slower)
test bench_check_raw_str_unicode ... bench: 34,054.50 ns/iter (+/- 133.46) (14% faster)

It is annoying that the ascii path seems to actually be slower, as that is probably the hot path for a lot of code. Will investigate further.

Original (main):
test bench_skip_ascii_whitespace ... bench: 8,394.66 ns/iter (+/- 27.27)
test bench_unescape_byte_str_ascii ... bench: 66,282.21 ns/iter (+/- 894.67)
test bench_unescape_byte_str_hex ... bench: 95,132.18 ns/iter (+/- 1,351.86)
test bench_unescape_byte_str_trivial ... bench: 35,111.39 ns/iter (+/- 782.53)
test bench_unescape_c_str_ascii ... bench: 55,580.99 ns/iter (+/- 713.06)
test bench_unescape_c_str_hex_ascii ... bench: 95,237.37 ns/iter (+/- 1,174.20)
test bench_unescape_c_str_hex_byte ... bench: 95,190.68 ns/iter (+/- 1,317.40)
test bench_unescape_c_str_trivial ... bench: 46,248.13 ns/iter (+/- 390.16)
test bench_unescape_c_str_unicode ... bench: 207,910.14 ns/iter (+/- 2,214.58)
test bench_unescape_str_ascii ... bench: 66,227.90 ns/iter (+/- 870.02)
test bench_unescape_str_hex ... bench: 95,679.59 ns/iter (+/- 1,811.50)
test bench_unescape_str_trivial ... bench: 35,406.93 ns/iter (+/- 754.70)
test bench_unescape_str_unicode ... bench: 207,481.29 ns/iter (+/- 4,115.48)

Remove Unreachable:
test bench_skip_ascii_whitespace ... bench: 8,404.43 ns/iter (+/- 20.95) (same)
test bench_unescape_byte_str_ascii ... bench: 45,606.37 ns/iter (+/- 109.00) (30% faster)
test bench_unescape_byte_str_hex ... bench: 69,964.92 ns/iter (+/- 143.59) (27% faster)
test bench_unescape_byte_str_trivial ... bench: 31,843.06 ns/iter (+/- 169.52) (10% faster)
test bench_unescape_c_str_ascii ... bench: 50,902.69 ns/iter (+/- 968.88) (10% faster)
test bench_unescape_c_str_hex_ascii ... bench: 87,675.07 ns/iter (+/- 785.27) (10% faster)
test bench_unescape_c_str_hex_byte ... bench: 87,589.62 ns/iter (+/- 564.91) (10% faster)
test bench_unescape_c_str_trivial ... bench: 35,969.03 ns/iter (+/- 1,300.87) (20% faster)
test bench_unescape_c_str_unicode ... bench: 178,815.79 ns/iter (+/- 1,821.08) (10% faster)
test bench_unescape_str_ascii ... bench: 52,682.61 ns/iter (+/- 773.71) (20% faster)
test bench_unescape_str_hex ... bench: 90,624.67 ns/iter (+/- 1,252.18) (5% faster)
test bench_unescape_str_trivial ... bench: 31,834.04 ns/iter (+/- 580.67) (10% faster)
test bench_unescape_str_unicode ... bench: 182,553.44 ns/iter (+/- 3,555.20) (10% faster)

All of the unescape functions are significantly faster.

hkBst · 2025-06-01T14:11:42Z

The performance regression of test bench_check_raw_str_ascii is explored here: rust-lang/rust#141855

hkBst · 2025-06-06T09:02:59Z

I've switched back to a while-loop based check_raw to minimize perf difference with previous version.

@GuillaumeGomez can you please check perf again?
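
For readers unfamiliar with the shape of such a function, here is a minimal sketch of a while-loop-based raw check. It assumes only the behaviour noted in the earlier diff context (raw literals "only produce errors on bare CR"); `check_raw_sketch`, `collect_raw`, and `BareCr` are hypothetical names, and the crate's actual implementation may differ in its details.

```rust
use std::ops::Range;

/// Stand-in error for this sketch; the crate has its own error type.
#[derive(Debug, PartialEq)]
struct BareCr;

/// Walks `src` with an explicit `while let` loop and reports each char,
/// or an error for a bare carriage return, with its byte range.
fn check_raw_sketch<F>(src: &str, callback: &mut F)
where
    F: FnMut(Range<usize>, Result<char, BareCr>),
{
    let mut chars = src.chars();
    while let Some(c) = chars.next() {
        // The remaining input tells us where the char we just read ended.
        let end = src.len() - chars.as_str().len();
        let start = end - c.len_utf8();
        let res = if c == '\r' { Err(BareCr) } else { Ok(c) };
        callback(start..end, res);
    }
}

/// Convenience helper that gathers all callback invocations into a vector.
fn collect_raw(src: &str) -> Vec<(Range<usize>, Result<char, BareCr>)> {
    let mut out = Vec::new();
    check_raw_sketch(src, &mut |range, res| out.push((range, res)));
    out
}
```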

GuillaumeGomez · 2025-06-10T12:10:30Z

Please open a PR with the benchmark changes so I can compare the before/after. Currently I can't compare them since code changed in benchmarks.

GuillaumeGomez · 2025-06-10T12:16:22Z

Forgot to specify: no need to merge it, I just need something to compare against.

hkBst · 2025-06-10T14:53:27Z

Hmm, I guess there are some minor changes to the benchmarks, though nothing that should impact performance. But I'm happy to land benchmark changes first. It will give me a chance to include the mixed ascii/non-ascii cases.

GuillaumeGomez · 2025-06-10T14:54:32Z

Thanks! Let's make the benches as close as possible between main and this branch to make my life easier. =D

hkBst · 2025-06-11T15:17:20Z

New benches for main in #13 and also pushed new benches here. Should be very similar now.

GuillaumeGomez · 2025-06-12T13:19:25Z

Perfect, thanks! Generating new numbers then. :)

GuillaumeGomez · 2025-06-12T14:54:07Z

Here are the new results:

| bench name | current | new | diff |
|-|-|-|-|
| bench_check_raw_byte_str_ascii | 25366.15 ns/iter (+/- 29.41) | 21362.59 ns/iter (+/- 41.02) | -15.8% |
| bench_check_raw_c_str_ascii | 23746.38 ns/iter (+/- 27.4) | 20248.37 ns/iter (+/- 32.01) | -14.7% |
| bench_check_raw_c_str_non_ascii | 40081.5 ns/iter (+/- 33.36) | 37294.18 ns/iter (+/- 72.35) | -7.0% |
| bench_check_raw_c_str_unicode | 126919.1 ns/iter (+/- 217.34) | 96482.6 ns/iter (+/- 170.2) | -24.0% |
| bench_check_raw_str_ascii | 25425.88 ns/iter (+/- 32.15) | 19805.59 ns/iter (+/- 27.85) | -22.1% |
| bench_check_raw_str_non_ascii | 41483.08 ns/iter (+/- 62.37) | 35902.36 ns/iter (+/- 58.75) | -13.5% |
| bench_check_raw_str_unicode | 134671.66 ns/iter (+/- 134.93) | 109647.88 ns/iter (+/- 132.83) | -18.6% |
| bench_skip_ascii_whitespace | 6327.46 ns/iter (+/- 20.09) | 6305.62 ns/iter (+/- 5.21) | -0.3% |
| bench_unescape_byte_str_ascii | 49183.21 ns/iter (+/- 83.22) | 43504.72 ns/iter (+/- 124.99) | -11.5% |
| bench_unescape_byte_str_ascii_escape | 79799.67 ns/iter (+/- 102.46) | 49577.82 ns/iter (+/- 154.06) | -37.9% |
| bench_unescape_byte_str_hex_escape | 111019.04 ns/iter (+/- 167.05) | 65135.49 ns/iter (+/- 107.09) | -41.3% |
| bench_unescape_byte_str_mixed_escape | 289452.57 ns/iter (+/- 313.3) | 218387.45 ns/iter (+/- 540.27) | -24.6% |
| bench_unescape_c_str_ascii | 64260.92 ns/iter (+/- 175.93) | 54526.39 ns/iter (+/- 67.38) | -15.1% |
| bench_unescape_c_str_ascii_escape | 74902.73 ns/iter (+/- 93.76) | 66932.31 ns/iter (+/- 104.93) | -10.6% |
| bench_unescape_c_str_hex_escape_ascii | 114123.31 ns/iter (+/- 201.84) | 105388.06 ns/iter (+/- 198.93) | -7.7% |
| bench_unescape_c_str_hex_escape_byte | 114360.32 ns/iter (+/- 152.44) | 105959.69 ns/iter (+/- 95.08) | -7.3% |
| bench_unescape_c_str_mixed_escape | 712783.1 ns/iter (+/- 1377.64) | 657509.65 ns/iter (+/- 1073.35) | -7.8% |
| bench_unescape_c_str_non_ascii | 85176.94 ns/iter (+/- 117.64) | 76737.23 ns/iter (+/- 76.32) | -9.9% |
| bench_unescape_c_str_unicode | 297242.9 ns/iter (+/- 571.13) | 262374.38 ns/iter (+/- 349.01) | -11.7% |
| bench_unescape_c_str_unicode_escape | 204612.39 ns/iter (+/- 222.81) | 189687.64 ns/iter (+/- 202.25) | -7.3% |
| bench_unescape_str_ascii | 49467.35 ns/iter (+/- 75.59) | 45187.94 ns/iter (+/- 69.13) | -8.7% |
| bench_unescape_str_ascii_escape | 80234.78 ns/iter (+/- 109.96) | 67375.11 ns/iter (+/- 85.96) | -16.0% |
| bench_unescape_str_hex_escape | 110639.72 ns/iter (+/- 68.65) | 103836.11 ns/iter (+/- 94.97) | -6.1% |
| bench_unescape_str_mixed_escape | 784409.9 ns/iter (+/- 5626.63) | 735422.35 ns/iter (+/- 3384.8) | -6.2% |
| bench_unescape_str_non_ascii | 70945.71 ns/iter (+/- 99.58) | 63450.95 ns/iter (+/- 71.08) | -10.6% |
| bench_unescape_str_unicode | 238028.35 ns/iter (+/- 416.81) | 181429.58 ns/iter (+/- 247.16) | -23.8% |
| bench_unescape_str_unicode_escape | 387049.9 ns/iter (+/- 498.05) | 349056.78 ns/iter (+/- 480.37) | -9.8% |

Only gains, all good. Just for info, I wrote this python script to generate these numbers:

Python script

import subprocess


class Bench:
    def __init__(self, line):
        self.name = line.split(' ')[1]
        self.speed = float(line.split('... bench:')[1].split('/iter')[0].strip().split(' ')[0].replace(',', ''))
        self.margin = float(line.split('(+/-')[1].split(')')[0].strip().replace(',', ''))


def run_bench():
    res = subprocess.run(["cargo", "+nightly", "bench"], check=True, stdout=subprocess.PIPE)
    return res.stdout.decode('utf-8')


def get_benches():
    print("> Retrieving benches...")
    out = run_bench()
    print("> Done")

    lines = out.split("\n")
    i = 0
    found_bench = False
    while i < len(lines):
        line = lines[i].strip()
        i += 1
        if line.startswith("test result:"):
            found_bench = True
            break
    if not found_bench:
        # Raising a plain string is a TypeError in Python 3; raise a real exception.
        raise RuntimeError("No benchmarks found in:\n{}".format(out))
    benches = {}
    while i < len(lines):
        line = lines[i].strip()
        i += 1
        if line.startswith("test result:"):
            break
        if not line.startswith("test bench_"):
            continue
        bench = Bench(line)
        benches[bench.name] = bench
    return benches


def get_good_results():
    benches = get_benches()

    for i in range(0, 30):
        new_benches = get_benches()
        biggest_margin = 0
        updates = 0
        for key in benches.keys():
            if new_benches[key].margin < benches[key].margin:
                benches[key] = new_benches[key]
                updates += 1
            if benches[key].margin > biggest_margin:
                biggest_margin = benches[key].margin
        if biggest_margin > 50.:
            print(">> Biggest margin (iteration {}, updates: {}): {}".format(
                i, updates, biggest_margin))
        else:
            break
    return benches


def get_str_for_bench(bench):
    return "{} ns/iter (+/- {})".format(bench.speed, bench.margin)


def show_results(benches):
    print("==== BENCHES RESULTS =====")
    for value in benches.values():
        print("{}: {}".format(value.name, get_str_for_bench(value)))


def show_comparison(current, new_ones):
    print("=== COMPARISON ===")
    print("| bench name | current | new | diff |")
    print("|-|-|-|-|")
    for key in current.keys():
        print("| {} | {} | {} | {:.1f}% |".format(
            key,
            get_str_for_bench(current[key]),
            get_str_for_bench(new_ones[key]),
            (new_ones[key].speed * 100. / current[key].speed) - 100.,
        ))


def main():
    subprocess.run(["git", "checkout", "hkBst/benches"], check=True)
    current = get_good_results()
    show_results(current)
    subprocess.run(["git", "checkout", "hkBst/remove_unreachable"], check=True)
    new_ones = get_good_results()
    show_results(new_ones)
    show_comparison(current, new_ones)


if __name__ == "__main__":
    main()

GuillaumeGomez · 2025-06-12T14:56:20Z

src/lib.rs

+/// Enum of the different kinds of literal
+#[derive(Debug, Clone, Copy, PartialEq)]
+pub enum Mode {
+    Char,


Please add a small example for each variant. For this one it would look like:

Suggested change

-    Char,
+    /// `'c'`
+    Char,

This enum is not new code and doing this is not as simple as it seems. My first attempt is:

pub enum Mode {
    /// `'a'`
    Char,
    /// `b'a'`
    Byte,
    /// `"hello"`
    Str,
    /// `r"hello"`
    RawStr,
    /// `b"hello \xff"`
    ByteStr,
    /// `br"hello \xff"`
    RawByteStr,
    /// `c"hello \xff"`
    CStr,
    /// `cr"hello \xff"`
    RawCStr,
}

but it seems to raise more questions than it solves, and I'm not really happy with it.

What I had in mind was how each one could be created, like what's the difference between a Char and a Byte. Which new questions does it raise?

Well, if I have six different strings of "hello" that only differ in the letters that come before the opening double quotes, then that seems repetitive. But if I add in escape sequences, then it seems weird if the difference between raw and non-raw is not explained. Before you know it, you've copied several sections of https://doc.rust-lang.org/reference/tokens.html#byte-string-literals.

alright I've pushed something

Perfect, thanks!
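
The raw versus non-raw question raised in this thread can be made concrete with a small sketch that lets the compiler process a few literals and reports the resulting byte lengths. The function names are hypothetical, and C string literals are left out on purpose because they need a newer edition and toolchain than the other forms.

```rust
/// Byte lengths of four literals after the compiler has processed them.
fn literal_byte_lengths() -> [usize; 4] {
    [
        "a\n".len(),     // Str: `\n` is an escape, one byte
        r"a\n".len(),    // RawStr: backslash and `n` are kept as two bytes
        b"a\xff".len(),  // ByteStr: `\xff` is an escape, one byte
        br"a\xff".len(), // RawByteStr: `\`, `x`, `f`, `f` are kept as four bytes
    ]
}

/// A char literal and a byte literal spelling the same letter.
fn char_and_byte() -> (char, u8) {
    ('a', b'a')
}
```

Here `literal_byte_lengths()` returns `[2, 3, 2, 5]`: the escape sequences collapse to a single byte in the non-raw forms and stay verbatim in the raw ones.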

GuillaumeGomez · 2025-06-12T14:57:02Z

Just one nit and then it's ready to go, thanks for going along with me. :)

hkBst · 2025-06-12T15:43:21Z

> Only gains, all good. Just for info, I wrote this python script to generate these numbers:

Yay, beautiful gains across the board! Python script looks handy. Should maybe think about adding it to cargo bench.

GuillaumeGomez · 2025-06-13T11:01:40Z

Looks all good to me, nice work!

Can you merge and make a new release please @Urgau ?

hkBst · 2025-06-13T11:04:30Z

@GuillaumeGomez I enjoyed working with you on this, so thanks for your cooperation!

Urgau · 2025-06-13T11:56:07Z

It would be good to have an accompanying rust-lang/rust PR, to make sure the API still works for rustc, as well as to do a rustc-perf run (just in case).

@nnethercote, I think this PR changed a bit since it was last assigned to you, would you like to take a new look?

GuillaumeGomez · 2025-06-13T11:57:54Z

> It would be good to have an accompanying rust-lang/rust PR, to make sure the API still works for rustc, as well as to do a rustc-perf run (just in case).

We can't make a rustc-perf until this version has been released though. Or do you have another way in mind?

hkBst · 2025-06-13T12:11:43Z

> It would be good to have an accompanying rust-lang/rust PR, to make sure the API still works for rustc, as well as to do a rustc-perf run (just in case).

There is: rust-lang/rust#140999. Rebased it on master earlier today, ran tests locally.

hkBst · 2025-06-13T12:16:12Z

> We can't make a rustc-perf until this version has been released though. Or do you have another way in mind?

I believe this is actually possible if you use a git dep.

Urgau · 2025-06-13T12:24:34Z

Yes, I believe we can use a git dependency (tidy will complain, but we can ignore it).

GuillaumeGomez · 2025-06-13T12:37:10Z

Ah nice, that makes things a lot simpler. :)

update to literal-escaper 0.0.4 for better API without `unreachable` and faster string parsing

This is the replacement for just the part of #138163 dealing with the changed API of unescape functionality, since that got moved into its own crate.

This is a draft, because it uses an unpublished version of literal-escaper (rust-lang/literal-escaper#8).

To test, clone literal-escaper into a folder next to rustc, and test rustc normally.

r? `@nnethercote`

hkBst · 2025-06-13T13:00:38Z

> Yes, I believe we can use a git dependency (tidy will complain, but we can ignore it).

Replacing the path dep with a git dep seems very simple, so I have now done this.

GuillaumeGomez · 2025-06-14T09:53:10Z

Seems like the results of the perf check are good: rust-lang/rust#140999 (comment)

Urgau · 2025-06-14T10:16:06Z

Yeah, perf seems good.

Just waiting to know if @nnethercote wants to take another look. Will merge Monday evening otherwise.

nnethercote · 2025-06-14T12:08:46Z

I'm happy if Guillaume is happy.

Urgau · 2025-06-14T12:59:46Z

Then let's merge it.

Urgau · 2025-06-14T13:04:24Z

Released as 0.0.4

hkBst mentioned this pull request May 14, 2025

update to literal-escaper 0.0.4 for better API without unreachable and faster string parsing rust-lang/rust#140999

Open

rust-cloud-vms bot force-pushed the remove_unreachable branch from f816e0b to 00b6cfd May 14, 2025 11:23

nnethercote assigned nnethercote and GuillaumeGomez and unassigned nnethercote May 21, 2025

GuillaumeGomez reviewed May 27, 2025

rust-cloud-vms bot force-pushed the remove_unreachable branch from 00b6cfd to 45a5bf4 May 28, 2025 09:27

GuillaumeGomez reviewed May 28, 2025

rust-cloud-vms bot force-pushed the remove_unreachable branch from f2df5ae to 40fc95a June 1, 2025 10:05

rust-cloud-vms bot force-pushed the remove_unreachable branch 2 times, most recently from bb7bbba to 7bdde14 June 6, 2025 09:00

hkBst added 4 commits June 11, 2025 15:13

replace check_raw_common with trait

115ae12

replace unescape_{char,byte} and check_non_raw_common with trait and remove unused Mode methods

3801218

do not use Mode::* and move stuff around for better organisation

67eadd0

rename unescape_for_errors -> check_for_errors, and improve docs

702b0dc

rust-cloud-vms bot force-pushed the remove_unreachable branch from 7bdde14 to 702b0dc June 11, 2025 15:14

GuillaumeGomez reviewed Jun 12, 2025

example literals for Mode

c9ae54e

GuillaumeGomez approved these changes Jun 13, 2025

Urgau merged commit f3f0220 into rust-lang:main Jun 14, 2025
2 checks passed
