Replies: 2 comments 9 replies
-
Thank you for your question! Yes, this is an optimisation I'd like to have implemented after PER support has been merged. My current thought was to keep the same [...].

As an aside, I'd love to see you benchmark the current version of rasn against NetSNMP, because in my own benchmarking, rasn is only a couple of microseconds slower [...].
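For anyone wanting to reproduce this kind of comparison, here is a minimal std-only micro-benchmark sketch. Note this is hypothetical: `decode_via_heap` and `decode_fixed` are toy stand-ins for the allocating and non-allocating decode paths, not rasn's actual API, and a real comparison should use a proper harness such as criterion.

```rust
use std::time::Instant;

// Toy stand-in for a decode path that allocates an intermediate
// big-integer buffer before converting down to a fixed-width type.
fn decode_via_heap(bytes: &[u8]) -> i64 {
    let copy: Vec<u8> = bytes.to_vec(); // simulate the BigInt allocation
    copy.iter().fold(0i64, |acc, &b| (acc << 8) | i64::from(b))
}

// Toy stand-in for a direct fixed-width decode with no allocation.
fn decode_fixed(bytes: &[u8]) -> i64 {
    bytes.iter().fold(0i64, |acc, &b| (acc << 8) | i64::from(b))
}

fn main() {
    let input = [0x00u8, 0x01, 0x2C]; // 300 as big-endian bytes
    let iters = 1_000_000;

    let t = Instant::now();
    let mut sum = 0i64;
    for _ in 0..iters {
        sum = sum.wrapping_add(decode_via_heap(std::hint::black_box(&input)));
    }
    println!("heap path:  {:?} (sum {})", t.elapsed(), sum);

    let t = Instant::now();
    let mut sum = 0i64;
    for _ in 0..iters {
        sum = sum.wrapping_add(decode_fixed(std::hint::black_box(&input)));
    }
    println!("fixed path: {:?} (sum {})", t.elapsed(), sum);
}
```

`std::hint::black_box` keeps the optimiser from constant-folding the loop away; absolute timings will vary by machine, so only the relative difference is meaningful.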
-
me again :) been a while, and I finally might have some things coming up again soon-ish related to SNMP, so I figured I'd bump this topic a bit. I believe PER support has been added, right? Do you think the library is at an appropriate point where I could look at attempting to implement this? 👀 (And thanks for all the work you and others have put in, this library is a great resource to have around!)
-
it's me again :D I didn't necessarily want to make this an issue yet, to see first what some possible ideas could be for improving my situation below.

I've been toying around with a high-level SNMP (specifically v2c) crate that has been an off-and-on side project at work for quite a while, to see if I can move some of our infrastructure over from Python. While working on that, I've been trying to get a sense of which areas I need to keep an eye on performance-wise to keep parity with the NetSNMP Python library that we use. One thing I noticed when doing perf analysis is that a lot of time is spent decoding and allocating the big integer type. That's understandable as a default, since integers can theoretically be of any size in ASN.1; in SNMP, however, they're pretty much entirely fixed-width integer types, which means that always decoding into a big integer type that always allocates (which is the case with `BigInt` right now) incurs a lot of overhead.

I did some preliminary work last week and added some stuff to the current benchmark, then implemented specific `decode_<ty>` methods for the various integer types on `Decoder`, and saw an approximate ~23% performance improvement (before vs. after) on the benchmark for the various encodings when forwarding the types to their `decode_<ty>` function instead of always going through `BigInt` and then converting. (I'm on a Framework laptop with an `11th Gen Intel i7-1165G7 (8) @ 4.700GHz` CPU and 32 GiB RAM, running Arch.)

I think this confirms my suspicion that figuring out some way to not always heap-allocate is going to be necessary to reach close to performance parity with NetSNMP, because the tables we need to walk are sometimes massive, and all of the intermediate allocations really begin to add up even though network latency is the biggest issue by far.

But is adding those `decode_<ty>` trait methods the right direction, do you think? Or is there potentially some other way we can expose this behavior without accidentally creating a performance footgun?
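For what it's worth, one shape this could take is a trait whose fixed-width integer methods get default implementations that forward through the general big-integer path, so existing backends keep working while any backend can override them with an allocation-free fast path. This is a hypothetical sketch, not rasn's actual `Decoder` trait: the trait name, method names, and types here are made up for illustration.

```rust
// Hypothetical sketch (not rasn's real API): fixed-width decode methods
// default to the general allocating path, and backends can override them.

/// Toy stand-in for a heap-allocated arbitrary-precision integer.
struct BigIntLike(Vec<u8>); // big-endian two's-complement bytes

trait IntDecoder {
    /// The general path: always allocates.
    fn decode_big_int(&mut self) -> BigIntLike;

    /// Default: go through the big integer, then convert down.
    fn decode_i64(&mut self) -> i64 {
        let big = self.decode_big_int();
        sign_extend(&big.0)
    }
}

/// Decode big-endian two's-complement bytes into i64 without allocating.
fn sign_extend(bytes: &[u8]) -> i64 {
    let mut value: i64 = if bytes.first().map_or(false, |b| b & 0x80 != 0) {
        -1
    } else {
        0
    };
    for &b in bytes {
        value = (value << 8) | i64::from(b);
    }
    value
}

/// A backend that only implements the required method, using the defaults.
struct SlowDecoder<'a> {
    contents: &'a [u8],
}

impl IntDecoder for SlowDecoder<'_> {
    fn decode_big_int(&mut self) -> BigIntLike {
        BigIntLike(self.contents.to_vec())
    }
}

/// A backend that overrides the fast path and skips the allocation.
struct FixedWidthDecoder<'a> {
    contents: &'a [u8],
}

impl IntDecoder for FixedWidthDecoder<'_> {
    fn decode_big_int(&mut self) -> BigIntLike {
        BigIntLike(self.contents.to_vec())
    }

    fn decode_i64(&mut self) -> i64 {
        sign_extend(self.contents)
    }
}

fn main() {
    // INTEGER contents for 300 are 0x01 0x2C; for -1 they are 0xFF.
    assert_eq!(SlowDecoder { contents: &[0x01, 0x2C] }.decode_i64(), 300);
    assert_eq!(FixedWidthDecoder { contents: &[0x01, 0x2C] }.decode_i64(), 300);
    assert_eq!(FixedWidthDecoder { contents: &[0xFF] }.decode_i64(), -1);
    println!("ok");
}
```

The upside of default methods is that they are backwards compatible and callers can't observe which path they got, which limits the footgun surface: a backend that forgets to override is merely slower, not wrong.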