Provide an efficient way to parse Integers (and Floats)?

Previous discussion: https://bugs.ruby-lang.org/issues/20394

### Context

When trying to write pure Ruby gems that are competitive in terms of performance with C extensions, a very common bottleneck is parsing of text-based protocols and formats, such as the [Redis RESP protocol](https://github.com/antirez/RESP3/blob/master/spec.md), or even [the PDF format](https://github.com/gettalong/hexapdf/blob/master/lib/hexapdf/tokenizer.rb#L287-L288) (FYI @gettalong).

As a result, currently [the most efficient way to parse integers in a string in Ruby is to reimplement `atoi` using `String#getbyte`](https://github.com/redis-rb/redis-client/commit/41b3abe94243d2598211d448c4e457a3585ff9d5), which is a bit ridiculous.

Otherwise, if you create a substring with `String#slice` or `StringScanner#scan` and then call `to_i` or `Integer`, instantiating the substring and copying the bytes really tanks performance.
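To make the workaround concrete, here is a minimal sketch of the `getbyte`-based approach. The helper name `parse_uint` and its return shape are hypothetical, chosen for illustration only; this is not the code from the linked commit, and it handles only unsigned decimal digits.

```ruby
# Hypothetical helper: parse an unsigned decimal integer starting at byte
# offset `pos`, reading bytes directly so no substring is allocated.
# Returns [value, next_pos], or nil if there is no digit at `pos`.
def parse_uint(str, pos = 0)
  value = 0
  start = pos
  # String#getbyte returns nil past the end of the string, which ends the loop.
  while (byte = str.getbyte(pos)) && byte >= 48 && byte <= 57 # "0".."9"
    value = value * 10 + (byte - 48)
    pos += 1
  end
  pos == start ? nil : [value, pos]
end

value, next_pos = parse_uint("$1234\r\n", 1)
# value is 1234, next_pos is 5 (the offset of "\r")
```

The loop avoids the temporary `String`, but it runs in Ruby rather than native code, and every caller has to reimplement it.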

### Proposal

Given that `StringScanner` is a default gem, is often involved in string parsing, and already acts as a "pointer into a String", I think it's well positioned to offer an efficient way to parse an Integer without instantiating a useless temporary string.

Basically an optimized way to do `scanner.scan(/\d+/).to_i`.

The API could be any of:

  - `scanner.scan(/\d+/, :to_i)`
  - `scanner.scan(/\d+/, Integer)`
  - `scanner.scan_integer(/\d+/)`

Logically the two supported types would be `Integer` and `Float`, but perhaps others would be helpful for other protocols?
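For discussion, the intended semantics could be sketched in plain Ruby as below. The method name `scan_integer_sketch` is hypothetical and is written as a standalone function so it does not clash with anything in `strscan`. Note that this reference version still allocates the temporary string; the point of the proposal is a native version with the same observable behavior and no allocation.

```ruby
require "strscan"

# Hypothetical reference semantics: behaves like `scanner.scan(pattern)`
# followed by an integer conversion, but returns nil (and leaves the scan
# position unchanged) when the pattern does not match.
def scan_integer_sketch(scanner, pattern = /\d+/)
  matched = scanner.scan(pattern)
  matched && Integer(matched, 10)
end

scanner = StringScanner.new("42 apples")
first  = scan_integer_sketch(scanner) # Integer 42, scanner.pos is now 2
second = scan_integer_sketch(scanner) # nil, scanner.pos is still 2
```

One difference from `scanner.scan(/\d+/).to_i` is worth deciding explicitly: `nil.to_i` returns `0` on a failed match, whereas this sketch assumes that returning `nil` is preferable so callers can tell "no integer here" apart from a parsed `0`.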

@kou as maintainer of `strscan`, do you have any opinion? I'm happy to put in the work on this, but I'd need to know if the feature is desired, and which API would be deemed acceptable.

Also cc @tenderlove @mame from previous discussions.
