-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
New Framing Should use a length based approach #1
Comments
+1 |
This is how we are right now, I just wanted to keep it more consistent with the websocket protocol. |
well I meant anywhere your doing custom framing (like HTTP connections instead of WebSocket) implemented. On the websocket side, its of course forced and yeah we dont need double framing (someone should recommend updating the spec to be length based since weve already broke capabilities to do raw connections anyways :() but as I read the frame identifiers in the spec, anywhere this framing is implemented outside of websocket should use length. |
@aikar: I didn't quite understand what you meant by your last comment. |
markers still help provide error testing and recovery. Without them, you have no way to detect if the frame matches up, With them, your not 'scanning' for the end marker but you know exactly where it SHOULD be, and doing a simple buffer[index] check to ensure it matches the marker symbol ensures the data is valid. If you check for where a marker should be, and find it isnt, you can add error recovery to then fallback to a scan of the next start marker and continue the parse, and throw a warning/alert that data was malformed, and continue its business. Where as w/o them, theres no way to detect a corrupted/malformed message and itll destroy the entire parser. |
@aikar: That's a good point from the streams point of view. Although it doesn't guarantee that the message isn't malformed/corrupted (that's what TCP does) but it does guarantee that the stream of messages is still flowing as it should be. And the overhead is negligible too. So I agree with your original suggestions: FRAMEMARKER LENGTH DATA FRAMEMARKER. |
Also the LENGTH should be in bytes not in utf-8 codepoints (as it has been previously). If you decide not to add LENGTH to the messages then the spec should state how to escape FRAMEMARKERs. |
Another variation would be FRAMEMARKER LENGTH DATA where the FRAMEMARKER would be a protocol identifier (e.g. 0xFFFD as proposed) and explictly |
@madari +1 to that suggestion, makes sense, no reason you could not recover on the frame instead of a trailing marker, probably even easier to implement this way actually |
Currently Socket.IO spec has listed as FRAMEMARKER,DATA,FRAMEMARKER
Ideally, a message frame should include the length of the data payload so that the parser will know exactly how long a message is and does not have to 'scan' for the ending marker.
so:
MARKER,LENGTHOFDATA,DATA,MARKER
so: 0xFF0xFD3,foo0xFF0xFD
Can identify that fhe data set will be a length of 3 before the end marker.
So if you receive 0xFF0xFD3,f
as a buffer you know you need 2 more characters before you should even expect to see the end marker.
This lets the parser simply check the length and not look for the end marker until it knows it has all of the data and then it knows exactly where the endmarker is.
This is of course better for performance, and very trivial to implement.
The text was updated successfully, but these errors were encountered: