Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add RFC for buffer varint encoding #12

Open
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

bmcq-0
Copy link

@bmcq-0 bmcq-0 commented Dec 1, 2023

buffer.readleb128(b: buffer, offset: number): number
buffer.readuleb128(b: buffer, offset: number): number

buffer.readleb128(b: buffer, offset: number, value: number): ()

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small nitpick, this should be buffer.writeleb128

@WheretIB
Copy link

WheretIB commented Dec 8, 2023

What's confusing about these functions is that you have no idea how many bytes were written or how many were read.
So if you have multiple varints to read, you don't know at which offsets each one of them is placed.

@bmcq-0
Copy link
Author

bmcq-0 commented Dec 8, 2023

What's confusing about these functions is that you have no idea how many bytes were written or how many were read. So if you have multiple varints to read, you don't know at which offsets each one of them is placed.

I think simply having the count of written bytes returned works. The read functions could return 2 values (value, count) while the write functions return only the count.

The following are implementations of ULEB-128 reading/writing in Luau:

```lua
local function writeuleb128(stream, value)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please update the implementation examples to match what the proposed functions will do with a library implementation.
Right now they don't compile, incorrectly use the buffer library functions and don't return the proposed results.
It would be nice to also convert the argument value to integer number using bit32 library.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bmcq-0 You can update your read implementation with this, adjust accordingly.

	local function readByte()
		local byte = buffer_readu8(stream, position)
		position = position + 1
		return byte
	end

	local function readVarInt()
		local result = 0

		for i = 0, 4 do
			local value = readByte()
			result = bit32.bor(result, bit32.lshift(bit32.band(value, 0x7F), i * 7))
			if not bit32.btest(value, 0x80) then
				break
			end
		end

		return result
	end

@vegorov-rbx
Copy link
Contributor

From the support and not that many problems being reported, we are likely to move forward with this proposal (unless new issues are raised).
Having said that, we need to work to make the specification clean and precise.

@Dekkonot
Copy link
Contributor

It's unclear to me, from reading the RFC, what the arguments to each function do. I understand intuitively because I'm a clever guy, but it feels like a bad idea to merge an RFC where intuitive understanding is a requirement.

@bmcq-0
Copy link
Author

bmcq-0 commented Jan 30, 2024

It's unclear to me, from reading the RFC, what the arguments to each function do.

The arguments are the exact same as other read/write functions in the buffer library. Do you mean the return values?

@Dekkonot
Copy link
Contributor

It's unclear to me, from reading the RFC, what the arguments to each function do.

The arguments are the exact same as other read/write functions in the buffer library. Do you mean the return values?

There's a certain level of irony to me calling myself clever in the same message I made that level of mistake, but that is what I meant yes.

To be more specific since that last comment was poorly written: this RFC assumes prior knowledge of LEB128, which is probably a bad thing. A quick rundown of what it is and why you would use it over any other form of variable-length format would be good to include.

@sircfenner
Copy link

sircfenner commented Feb 9, 2024

The number of bytes used needs to be known before calling the write* functions because you need to know if the varint will fit within the buffer. Every user would have to implement this non-trivial calculation by themselves, which undermines the three main justifications in the proposal (speed, convenience, and not having to know the specifics of LEB128).

By comparison, with all other existing buffer.write* functions it is trivial to calculate the amount of space required to write the relevant value.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

6 participants