Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect use of value.length in insertStringNT and writeStringNT when value is non-ascii strings or encoding is UTF-16 #50

Open
anonghuser opened this issue Oct 16, 2023 · 0 comments

Comments

@anonghuser
Copy link

anonghuser commented Oct 16, 2023

The writeStringNT code when not given an offset argument is correct as it uses the writeOffset for placing the null terminator, but with an offset it is incorrect. The insertStringNT is always incorrect.
You will have to switch the use of value.length for Buffer.byteLength() in those two places.

console.log(new SB().insertString('я\0', 0).toString('hex')); // d18f00

console.log(new SB().writeStringNT('я').toString('hex'));     // d18f00
console.log(new SB().writeStringNT('я', 0).toString('hex'));  // d100
console.log(new SB().insertStringNT('я', 0).toString('hex')); // d1008f

Your implementation could be simplified by just appending '\0' to the string and passing it to writeString/insertString. There will be no functional difference in all encodings except UTF-16, where the null terminator will become two-byte long:

console.log(new SB().writeString('abcd\0', 'utf16le').toString('hex'));      // 61006200630064000000

console.log(new SB().writeStringNT('abcd', 'utf16le').toString('hex'));      // 610062006300640000
console.log(new SB().writeStringNT('abcd', 0, 'utf16le').toString('hex'));   // 6100620000006400
console.log(new SB().insertStringNT('abcd', 0, 'utf16le').toString('hex'));  // 610062000063006400

If you go with this approach, you will also have to special-case the readStringNT code for UTF-16 (and all its aliases, to avoid hardcoding them you might check the length of an encoded '\0') to look for the even-aligned two-byte null, but that is incorrect currently anyway: #23

I guess there may be an argument for the Buffer.byteLength() approach (while perhaps still changing the terminator to two-byte in utf16) to avoid temporary string values with the concatenation approach hurting performance, but I can't imagine how long the strings would need to be for this to become noticeable... Probably a non-issue.

@anonghuser anonghuser changed the title Incorrect use of value.length in insertStringNT and writeStringNT when value is non-ascii strings Incorrect use of value.length in insertStringNT and writeStringNT when value is non-ascii strings or encoding is UTF-16 Oct 16, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant