string serialization escaping optimisations #1273

Merged: 2 commits, Jul 18, 2025

@conradludgate commented Jul 18, 2025

When serializing strings, the hot loop checks one byte at a time to decide whether it needs escaping. By restructuring the code and using a small amount of unsafe, we can make this loop somewhat faster.
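For readers unfamiliar with this kind of loop, here is a minimal sketch of a byte-at-a-time, table-driven JSON string escaper. It is an illustration only, not the code changed in this PR: the names `escape_for`, `ESCAPE`, and `write_escaped` are hypothetical, and the sketch uses only safe code rather than the unsafe optimisation described above.

```rust
// Illustrative sketch only; NOT the code from this PR or from serde_json.
// All names here (escape_for, ESCAPE, write_escaped) are hypothetical.

/// Returns the JSON escape letter for `byte`, or 0 if no escape is needed.
/// `b'u'` marks control characters that need a `\u00XX` escape.
const fn escape_for(byte: u8) -> u8 {
    match byte {
        b'"' => b'"',
        b'\\' => b'\\',
        0x08 => b'b',
        0x0C => b'f',
        b'\n' => b'n',
        b'\r' => b'r',
        b'\t' => b't',
        0x00..=0x1F => b'u',
        _ => 0,
    }
}

/// Lookup table built at compile time: one entry per possible byte value.
const ESCAPE: [u8; 256] = {
    let mut table = [0u8; 256];
    let mut i = 0;
    while i < 256 {
        table[i] = escape_for(i as u8);
        i += 1;
    }
    table
};

/// Appends `value` to `out` as a quoted, escaped JSON string.
fn write_escaped(out: &mut Vec<u8>, value: &str) {
    const HEX: &[u8; 16] = b"0123456789abcdef";
    let bytes = value.as_bytes();
    let mut start = 0; // start of the pending run of bytes that need no escape

    out.push(b'"');
    // The hot loop: one table lookup per input byte.
    for (i, &byte) in bytes.iter().enumerate() {
        let escape = ESCAPE[byte as usize];
        if escape == 0 {
            continue; // common case: nothing to escape
        }
        // Flush the unescaped run, then write the escape sequence.
        out.extend_from_slice(&bytes[start..i]);
        out.extend_from_slice(&[b'\\', escape]);
        if escape == b'u' {
            let hi = HEX[(byte >> 4) as usize];
            let lo = HEX[(byte & 0xF) as usize];
            out.extend_from_slice(&[b'0', b'0', hi, lo]);
        }
        start = i + 1;
    }
    out.extend_from_slice(&bytes[start..]);
    out.push(b'"');
}
```

Because most bytes in typical documents need no escaping, the cost of the `escape == 0` check dominates, which is why small changes to this loop can show up in string-heavy benchmarks.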

The benchmark below shows about a 2-5% speedup on string-heavy documents.

Benchmark was run with target x86_64-unknown-linux-gnu on an Intel Xeon Platinum 8375C CPU @ 2.90GHz, as well as on an Apple M4 Max.

use std::hint::black_box;

use criterion::{criterion_group, criterion_main, Criterion};
use serde_json::Value;

pub fn k8s(c: &mut Criterion) {
    // https://raw.githubusercontent.com/kubernetes/kubernetes/v1.33.3/api/openapi-spec/swagger.json
    let k8s = std::fs::read_to_string("benches/k8s-openapi.json").unwrap();
    let value: Value = serde_json::from_str(&k8s).unwrap();
    drop(k8s);

    let mut v = Vec::new();
    serde_json::to_writer_pretty(&mut v, &value).unwrap();

    c.bench_function("pretty", |b| {
        b.iter(|| {
            v.clear();
            serde_json::to_writer_pretty(&mut v, &value).unwrap();
            black_box(&v[..]);
        })
    });

    c.bench_function("compact", |b| {
        b.iter(|| {
            v.clear();
            serde_json::to_writer(&mut v, &value).unwrap();
            black_box(&v[..]);
        })
    });
}

criterion_group!(benches, k8s);
criterion_main!(benches);

@dtolnay left a comment

Thanks!

Would there be any benefit to coordinating the discriminants of CharEscape with the values that are being converted to/from it in this code, even if an `as` cast cannot be used?

#[repr(u8)]
pub enum CharEscape {
    Quote = b'"',
    ReverseSolidus = b'\\',
    Solidus = b'/',
    Backspace = b'b',
    FormFeed = b'f',
    LineFeed = b'n',
    CarriageReturn = b'r',
    Tab = b't',
    AsciiControl(u8) = b'u',
}
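As background on why `as` cannot be used here: numeric casts are not available for an enum that has a variant carrying a field, such as `AsciiControl(u8)`. With `#[repr(u8)]`, the discriminant can still be read through a pointer cast, as the sketch below shows. The `letter` method is a hypothetical helper added for illustration; it is not part of this PR or of serde_json.

```rust
// Hypothetical sketch mirroring the enum suggested above; `letter` is an
// illustrative helper, not an existing serde_json API.
#[repr(u8)]
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum CharEscape {
    Quote = b'"',
    ReverseSolidus = b'\\',
    Solidus = b'/',
    Backspace = b'b',
    FormFeed = b'f',
    LineFeed = b'n',
    CarriageReturn = b'r',
    Tab = b't',
    AsciiControl(u8) = b'u',
}

impl CharEscape {
    /// Returns the byte written after the backslash, i.e. the discriminant.
    fn letter(&self) -> u8 {
        // SAFETY: with `#[repr(u8)]` the enum's layout starts with its `u8`
        // discriminant, so reading the first byte through a pointer cast
        // yields the discriminant value.
        unsafe { *(self as *const Self).cast::<u8>() }
    }
}
```

Under this layout, `CharEscape::Quote.letter()` yields `b'"'` and `CharEscape::AsciiControl(0x1f).letter()` yields `b'u'`, so no separate match from variant to escape letter is needed.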

@conradludgate

I had that in a draft of this PR, but unfortunately I couldn't get any benefit from it. It may still make sense from a semantic point of view.

@dtolnay merged commit 623d9b4 into serde-rs:master on Jul 18, 2025
16 checks passed