Add draft of what layout description might look like #38

Merged
merged 4 commits into from
Dec 22, 2023

haltman-at
Contributor

OK, I finally sat down and banged this out. It attempts to address #31. I tried to cover everything that seemed reasonable. I did this primarily in prose rather than listing out object structures and field names and such; hope that's OK. There's a lot of uncertainty on a number of points, but I hope that despite that this is concrete enough to work with -- not to use, I mean, but to work on without having to resolve any more fuzzy, indefinite questions, so that the remaining questions are confined to discrete points.

(I guess there's perhaps one exception to this, which is the question of primitive types. Oh well.)

I think this does an OK job at all three goals? There's more I would maybe have liked to include to be more general, but it didn't seem practical. I don't know, I'll let you all figure out whether some version of this should be included! Primarily I hope it at least shows that the problem is (or at least may be) solvable and we don't need to exclude this as entirely unworkable. :)

@cameel
Contributor

cameel commented Aug 23, 2023

Pinging @frangio, who was interested in this topic.

@frangio

frangio commented Aug 23, 2023

Thanks. My interest was in making sure that the layout information of types (e.g. structs) was made available even if they are not used in state variables. In the draft in this PR I didn't see any mention of what types from the code would be included in the debug information for a contract. @haltman-at any thoughts on this?

docs/source/layout.md Outdated
but in some cases, it could conceptually be possible. The problem is that using bits instead of bytes is overall less convenient but
doesn't gain much generality. But, it does gain us one important case (regarding how strings are stored in storage in Solidity),
so we need it at least there. It seems inconsistent to use it only there and not more generally, though. So likely we should more
often be using bits instead of bytes? Something for later.
Contributor

The string format does not really require this though. You can always look at the last bit just as a part of the length field. I.e. the length is specified as either 2N or 2N+1 and odd numbers indicate one format and even ones the other.
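
A minimal TypeScript sketch of this reading, where the lowest bit is simply part of the length field and no bit-level addressing is needed. The names `StringSlotInfo` and `decodeStringSlot` are hypothetical and not part of the draft; the encoding follows Solidity's documented storage layout for `bytes` and `string`.

```typescript
// Hypothetical helper (not from the draft): classify the main storage slot of
// a Solidity `string`/`bytes` value by reading the length field as 2N or 2N+1.
type StringSlotInfo =
  | { format: "short"; length: bigint } // data is stored inline in the same slot
  | { format: "long"; length: bigint }; // data is stored elsewhere, slot holds only the length

function decodeStringSlot(slotValue: bigint): StringSlotInfo {
  if ((slotValue & 1n) === 1n) {
    // Odd: the whole word is 2N + 1, where N is the length in bytes.
    return { format: "long", length: (slotValue - 1n) / 2n };
  }
  // Even: the lowest-order byte is 2N; the higher-order bytes hold the data.
  return { format: "short", length: (slotValue & 0xffn) / 2n };
}
```

With this reading, the parity of the length field selects the format, so the layout description can stay byte-oriented.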

Comment on lines 75 to 77
Assuming any particular endianness in storage seems bad (in Solidity e.g. it's different for arrays vs bytestrings), so each type should have a storage endianness
specified -- which does not need to agree with the endianness of its component types! It covers only the outermost layer.
For something like an integer this is meaningless per se, but it is necessary to make sense of the "start" of that integer.
Contributor

Can you clarify this? How do you define endianness for arrays?


That leaves padding. We can specify this as follows:

`{ paddedBytes: number, paddingType: "zero" | "sign" | "right" }`
Contributor

Does bytes include paddedBytes or not?

From the note below about "bytewidth of the unpadded type" I assume it does not, but perhaps that should be said explicitly.
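
To make the question concrete, here is a hedged TypeScript sketch of one possible interpretation. It assumes that `paddedBytes` is the total width after padding, that the unpadded width is tracked separately, that `"zero"` and `"sign"` pad on the left, and that `"right"` pads on the right. None of these assumptions is confirmed by the draft, and `stripPadding` is a hypothetical helper.

```typescript
// The padding object as written in the draft.
type Padding = { paddedBytes: number; paddingType: "zero" | "sign" | "right" };

// Hypothetical helper: recover the unpadded bytes of a value.
// Assumption: `paddedBytes` is the TOTAL width including padding, and
// `unpaddedBytes` (the type's own byte width) does NOT include padding.
function stripPadding(
  padded: Uint8Array,
  unpaddedBytes: number,
  padding: Padding
): Uint8Array {
  if (padded.length !== padding.paddedBytes) {
    throw new Error("input length does not match paddedBytes");
  }
  if (unpaddedBytes > padding.paddedBytes) {
    throw new Error("unpadded width exceeds padded width");
  }
  if (padding.paddingType === "right") {
    // Assumed: value first, padding after it.
    return padded.slice(0, unpaddedBytes);
  }
  // Assumed: "zero" and "sign" both put the padding before the value.
  return padded.slice(padded.length - unpaddedBytes);
}
```

If `paddedBytes` instead counted only the padding, the length check would become `unpaddedBytes + paddedBytes`, which is why the document should state the convention explicitly.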

Comment on lines 272 to 275
### Enumerations

Maybe these are treated like primitive types? Maybe they're treated like tagged unions whose unioned types are all the unit type? In that case we'd need to be able
to represent the unit type.
Contributor

This might need specifying the size in bytes. In older Solidity versions enums took a variable number of bytes, depending on the number of members. Now they're limited to 256 members so 1 byte (ethereum/solidity#10247). Other languages could be doing it differently.
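
As a rough illustration of why an explicit size field helps, the following TypeScript sketch computes a byte width from the member count. It assumes the width is the smallest whole number of bytes that can hold the largest member index; that rule is an assumption for illustration, and `enumByteWidth` is a hypothetical name.

```typescript
// Hypothetical helper: smallest number of bytes able to hold the largest
// member index (memberCount - 1). This rule is an assumption, not something
// the draft or any particular compiler is confirmed to use.
function enumByteWidth(memberCount: number): number {
  if (!Number.isInteger(memberCount) || memberCount < 1) {
    throw new RangeError("memberCount must be a positive integer");
  }
  let maxValue = memberCount - 1;
  let bytes = 1;
  while (maxValue > 0xff) {
    maxValue = Math.floor(maxValue / 256);
    bytes += 1;
  }
  return bytes;
}
```

Under this rule an enum with up to 256 members fits in one byte, and a language that allows more members would need a wider representation, so the width cannot be inferred from the type kind alone.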

docs/source/layout.md Outdated
Comment on lines 81 to 87
A storage slot can be specified as one of the following objects:

`{ slotType: "raw", offset: bigint }`

`{ slotType: "offset", path: Slot, offset: bigint }`

`{ slotType: "hashedoffset", path: Slot, offset: bigint }`
Contributor

Do we need a distinction between relative and absolute locations? I.e. when describing the nested layout or something like a struct you might want to interpret locations as relative but then you might still want to have some things interpreted as absolute (specifically the hashed locations).
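
For reference, the three slot objects quoted above can be written as a recursive TypeScript type with a small resolver. This is only a sketch under stated assumptions: `"offset"` is taken to mean "the resolved path plus the offset", `"hashedoffset"` is taken to mean "the hash of the resolved path plus the offset", arithmetic wraps at 256 bits, and the hash function is passed in as a parameter because no hashing library is assumed. `resolveSlot` and `HashFn` are hypothetical names.

```typescript
// The three slot shapes from the draft, as a recursive union.
type Slot =
  | { slotType: "raw"; offset: bigint }
  | { slotType: "offset"; path: Slot; offset: bigint }
  | { slotType: "hashedoffset"; path: Slot; offset: bigint };

// Hypothetical: the caller supplies the hash (e.g. keccak256 over a 32-byte word).
type HashFn = (word: bigint) => bigint;

const WORD_MODULUS = 1n << 256n; // assumed 256-bit wraparound

// Hypothetical resolver: turns a Slot description into an absolute slot number.
function resolveSlot(slot: Slot, hash: HashFn): bigint {
  switch (slot.slotType) {
    case "raw":
      return slot.offset;
    case "offset":
      return (resolveSlot(slot.path, hash) + slot.offset) % WORD_MODULUS;
    case "hashedoffset":
      return (hash(resolveSlot(slot.path, hash)) + slot.offset) % WORD_MODULUS;
  }
}
```

In this reading every slot resolves to an absolute location, because each chain ends in a `"raw"` slot. It does not settle whether a struct's member layout should instead be expressed relative to wherever the struct is placed.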

@haltman-at
Contributor Author

Oh, geez, I just realized there's something big I left out: how things are pointed to on the stack. Actually, one could perhaps speak of cross-location pointers in general, but as that mostly doesn't exist at the moment, there's probably no sense in including that; it's premature.

But I guess something that needs to be added is this: for each type, for the stack location, I talked about from/to, but really we also need to say whether the thing lives directly on the stack or is pointed to. If it's pointed to, we need to specify the pointer format: do we just point to the start, or do we have start/length? If it's start/length, we need to break down which part is the start and which part is the length. Also, for length, we likely want to be able to specify what the length is measured in; for instance, it could potentially be "bytes" or "words" or "items".

(Yes this should be added to the PR itself but I don't have a lot of time at the moment)
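
One way the stack description above could be shaped, as a hedged TypeScript sketch. All type and field names here (`StackPlacement`, `PointerFormat`, `order`, `lengthUnit`, `lengthInBytes`) are hypothetical and not part of the draft; the only fixed fact used is that an EVM word is 32 bytes.

```typescript
// What the length part of a start/length pointer is measured in.
type LengthUnit = "bytes" | "words" | "items";

// Hypothetical: either the pointer is just a start, or it is a start/length
// pair with a stated ordering and a stated unit for the length.
type PointerFormat =
  | { kind: "start" }
  | { kind: "startLength"; order: "startFirst" | "lengthFirst"; lengthUnit: LengthUnit };

// Hypothetical: a value either lives directly on the stack or is pointed to.
type StackPlacement =
  | { kind: "direct" }
  | { kind: "pointer"; format: PointerFormat };

const EVM_WORD_BYTES = 32;

// Convert a length expressed in some unit into a byte count.
// `itemBytes` is only consulted when the unit is "items".
function lengthInBytes(length: number, unit: LengthUnit, itemBytes: number): number {
  switch (unit) {
    case "bytes":
      return length;
    case "words":
      return length * EVM_WORD_BYTES;
    case "items":
      return length * itemBytes;
  }
}

// Illustrative value only: a pointed-to value whose length counts items.
const examplePlacement: StackPlacement = {
  kind: "pointer",
  format: { kind: "startLength", order: "startFirst", lengthUnit: "items" },
};
```

Keeping the unit explicit lets a debugger convert any length into a byte range without knowing language-specific conventions.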

@gnidan
Copy link
Member

gnidan commented Dec 22, 2023

I've rebased this against the Docusaurus stuff, now in main, and instead of leaving this PR open with the various comments/concerns, I've added all of @cameel's notes/questions into the document itself.

Merging this now as a "prototype sketch" for review soon.

@gnidan merged commit 327569f into main on Dec 22, 2023
1 check passed
@gnidan deleted the layout branch on December 22, 2023 06:26
4 participants