New Framing Should use a length based approach #1

aikar · 2011-03-16T13:33:58Z

Currently Socket.IO spec has listed as FRAMEMARKER,DATA,FRAMEMARKER

Ideally, a message frame should include the length of the data payload so that the parser will know exactly how long a message is and does not have to 'scan' for the ending marker.

so:
MARKER,LENGTHOFDATA,DATA,MARKER

so: 0xFF0xFD3,foo0xFF0xFD

Can identify that fhe data set will be a length of 3 before the end marker.

So if you receive 0xFF0xFD3,f
as a buffer you know you need 2 more characters before you should even expect to see the end marker.

This lets the parser simply check the length and not look for the end marker until it knows it has all of the data and then it knows exactly where the endmarker is.

This is of course better for performance, and very trivial to implement.

tj · 2011-03-16T15:34:30Z

+1

rauchg · 2011-03-16T16:46:30Z

This is how we are right now, I just wanted to keep it more consistent with the websocket protocol.

aikar · 2011-03-16T17:16:23Z

well I meant anywhere your doing custom framing (like HTTP connections instead of WebSocket) implemented.

On the websocket side, its of course forced and yeah we dont need double framing (someone should recommend updating the spec to be length based since weve already broke capabilities to do raw connections anyways :()

but as I read the frame identifiers in the spec, anywhere this framing is implemented outside of websocket should use length.

madari · 2011-03-16T21:24:58Z

@aikar: I didn't quite understand what you meant by your last comment.
Just out of curiosity, you suggested MARKER LENGTH DATA MARKER. Why would
you want to keep the MARKERs?

aikar · 2011-03-16T23:29:09Z

markers still help provide error testing and recovery.

Without them, you have no way to detect if the frame matches up,

With them, your not 'scanning' for the end marker but you know exactly where it SHOULD be, and doing a simple buffer[index] check to ensure it matches the marker symbol ensures the data is valid.

If you check for where a marker should be, and find it isnt, you can add error recovery to then fallback to a scan of the next start marker and continue the parse, and throw a warning/alert that data was malformed, and continue its business.

Where as w/o them, theres no way to detect a corrupted/malformed message and itll destroy the entire parser.

madari · 2011-03-17T07:03:56Z

@aikar: That's a good point from the streams point of view. Although it doesn't guarantee that the message isn't malformed/corrupted (that's what TCP does) but it does guarantee that the stream of messages is still flowing as it should be. And the overhead is negligible too. So I agree with your original suggestions: FRAMEMARKER LENGTH DATA FRAMEMARKER.

madari · 2011-03-17T08:23:20Z

Also the LENGTH should be in bytes not in utf-8 codepoints (as it has been previously). If you decide not to add LENGTH to the messages then the spec should state how to escape FRAMEMARKERs.

madari · 2011-03-17T08:29:10Z

Another variation would be FRAMEMARKER LENGTH DATA where the FRAMEMARKER would be a protocol identifier (e.g. 0xFFFD as proposed) and explictly
indicate the beginning of a frame. That would enable recovery (by always looking for the next FRAMEMARKER) and the trailing FRAMEMARKER could be dropped.

tj · 2011-03-17T16:24:31Z

@madari +1 to that suggestion, makes sense, no reason you could not recover on the frame instead of a trailing marker, probably even easier to implement this way actually

3rd-Eden · 2011-05-17T14:27:26Z

For the sake of simplified parsing across different platforms, +1 for @madari 's suggestion.

highlight @guille

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

New Framing Should use a length based approach #1

New Framing Should use a length based approach #1

aikar commented Mar 16, 2011

tj commented Mar 16, 2011

rauchg commented Mar 16, 2011

aikar commented Mar 16, 2011

madari commented Mar 16, 2011

aikar commented Mar 16, 2011

madari commented Mar 17, 2011

madari commented Mar 17, 2011

madari commented Mar 17, 2011

tj commented Mar 17, 2011

3rd-Eden commented May 17, 2011

New Framing Should use a length based approach #1

New Framing Should use a length based approach #1

Comments

aikar commented Mar 16, 2011

tj commented Mar 16, 2011

rauchg commented Mar 16, 2011

aikar commented Mar 16, 2011

madari commented Mar 16, 2011

aikar commented Mar 16, 2011

madari commented Mar 17, 2011

madari commented Mar 17, 2011

madari commented Mar 17, 2011

tj commented Mar 17, 2011

3rd-Eden commented May 17, 2011