Q: data tables #13

Open
lmittmann opened this issue Jan 13, 2025 · 6 comments

Comments

@lmittmann

Is there an equivalent to huff's data tables in geas?

You have this CODECOPY example https://github.com/fjl/geas?tab=readme-ov-file#assemble, but is there any way to put a raw byte sequence between labels? Something like:

.data_start:
    0x12345678 ;; raw data for use with CODECOPY
.data_end:

@fjl
Owner

fjl commented Jan 13, 2025

This is basically issue #7. I have some ideas for syntax to that end. I'm curious what you want to use it for. Is it just for strings, or do you also want to include external files?

@lmittmann
Author

I primarily need this to build larger calldata where some words are static, e.g. balanceOf(constant address). It would also be nice to control where the raw bytes are placed in the bytecode.
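To make the use case concrete, here is a minimal Python sketch of the bytes such a static calldata blob would contain for balanceOf(address). The selector 0x70a08231 is the standard one for that signature; the address `HOLDER` and the helper name `balance_of_calldata` are hypothetical and only serve the illustration.

```python
# Selector of balanceOf(address): first 4 bytes of keccak256("balanceOf(address)").
BALANCE_OF_SELECTOR = bytes.fromhex("70a08231")

# Hypothetical constant address (20 bytes), chosen only for illustration.
HOLDER = bytes.fromhex("00" * 19 + "aa")


def balance_of_calldata(holder: bytes) -> bytes:
    """Return the selector followed by the address left-padded to a 32-byte word."""
    if len(holder) != 20:
        raise ValueError("address must be 20 bytes")
    return BALANCE_OF_SELECTOR + holder.rjust(32, b"\x00")


calldata = balance_of_calldata(HOLDER)
print(calldata.hex())
```

The result is 36 bytes that never change, which is why storing them as raw data in the code and copying them with CODECOPY is attractive.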

@lmittmann
Author

A possible design I thought about would be a function .raw that can optionally be named.

;; just raw data:
.raw(0x12345678) 

;; or labeled raw data:
.my_data: .raw(0x12345678)

And then the raw bytes could be accessed using:

push .size(@my_data) ;; size
push @my_data        ;; offset size
push 0               ;; destOffset offset size
codecopy

The .size function could be avoided with another .data_end label, but that would not be very ergonomic.
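As a rough model of what the snippet above is meant to do, the following Python sketch mimics CODECOPY (destOffset, offset, size, with zero padding for reads past the end of the code). The names `my_data_offset` and `my_data_size` stand in for what `@my_data` and `.size(@my_data)` would resolve to; both the layout and the five placeholder instruction bytes are assumptions.

```python
def codecopy(memory: bytearray, code: bytes, dest_offset: int, offset: int, size: int) -> None:
    """Copy `size` bytes of `code` starting at `offset` into `memory` at `dest_offset`.

    Bytes read past the end of the code are zero, as in the EVM.
    """
    chunk = code[offset:offset + size]
    chunk = chunk + bytes(size - len(chunk))  # zero-pad out-of-range reads
    if len(memory) < dest_offset + size:
        memory.extend(bytes(dest_offset + size - len(memory)))  # grow memory
    memory[dest_offset:dest_offset + size] = chunk


# Hypothetical program: 5 placeholder instruction bytes followed by the raw data.
raw = bytes.fromhex("12345678")
code = bytes(5) + raw
my_data_offset = 5        # what `@my_data` would resolve to
my_data_size = len(raw)   # what `.size(@my_data)` would resolve to

memory = bytearray()
codecopy(memory, code, 0, my_data_offset, my_data_size)
print(memory.hex())
```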

@fjl
Owner

fjl commented Jan 14, 2025

Hmm. That doesn't really work because the design of expression macros has a few constraints.

  • macro calls can only be used where an expression is expected: as a PUSH argument or in the #define of another macro
  • result values of macros are bigints, so it's a bit weird to use them for binary data

I think we should go ahead and implement my proposal from #7, with a directive to include raw bytes and the literal syntax:

#bytecode {
    0x01020304       ;; hex bytes supported using number literal
    "string"         ;; can use string literals for text
    4: myMacro()     ;; numeric label defines byte size of following expression
}
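One possible reading of how the three item kinds could turn into bytes, sketched in Python. This is not the geas implementation: the tagged-tuple representation, the function name `encode_item`, and the rules used here (hex literals keep their leading zeros, strings are emitted as UTF-8, sized items are big-endian and left-padded) are all assumptions, and 0xbeef is a made-up stand-in for the result of myMacro().

```python
def encode_item(item: tuple) -> bytes:
    """Encode one item of a hypothetical #bytecode block into raw bytes."""
    kind = item[0]
    if kind == "hex":
        digits = item[1][2:]              # strip the "0x" prefix
        if len(digits) % 2:
            digits = "0" + digits         # assumed: pad odd-length literals
        return bytes.fromhex(digits)
    if kind == "str":
        return item[1].encode("utf-8")    # assumed: raw text bytes, no ABI encoding
    if kind == "sized":
        _, size, value = item
        # Raises OverflowError if the value does not fit into `size` bytes.
        return value.to_bytes(size, "big")
    raise ValueError("unknown item kind: %r" % (kind,))


block = [
    ("hex", "0x01020304"),
    ("str", "string"),
    ("sized", 4, 0xBEEF),                 # stand-in for `4: myMacro()`
]
data = b"".join(encode_item(item) for item in block)
print(data.hex())
```

The numeric size prefix matters because a macro result is a bigint with no inherent width; without it the assembler would not know how many bytes to emit.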

This syntax can also be used for jump tables:

.jt: #bytecode {
    2: @targetOne
    2: @targetTwo
    2: @targetThree
}

    ;; here we use 'value' from the stack to determine the jump location
    ;; by offsetting into the table
    dup1                  ; [value, value]
    push 2                ; [2, value, value]
    lt                    ; [2<value, value]
    jumpi @outOfBounds    ; [value]
    push 2                ; [2, value]
    mul                   ; [value*2], each table entry is 2 bytes
    push @.jt             ; [@.jt, value*2]
    add                   ; [offset]
    push 2                ; [size, offset]
    swap1                 ; [offset, size]
    push 0                ; [dstOffset, offset, size]
    codecopy              ; []
    push 0                ; [0]
    mload                 ; [word], entry sits in the top 2 bytes
    push 256-16           ; [240, word]
    shr                   ; [label]
    jump                  ; []
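A small Python model of how a lookup in such a table could work, assuming 2-byte big-endian entries, an index scaled by the entry size, and the entry being copied to memory offset 0 and then shifted down from the top of the loaded word. The jump targets and the seven placeholder bytes before the table are invented for the example.

```python
ENTRY_SIZE = 2  # bytes per table entry, matching the `2:` size labels


def build_jump_table(targets: list) -> bytes:
    """Concatenate the targets as 2-byte big-endian entries."""
    return b"".join(t.to_bytes(ENTRY_SIZE, "big") for t in targets)


def lookup(code: bytes, table_offset: int, entries: int, value: int) -> int:
    """Model the bounds check, CODECOPY to memory 0, MLOAD, and shift."""
    if value >= entries:
        raise IndexError("out of bounds")           # the @outOfBounds branch
    offset = table_offset + value * ENTRY_SIZE
    memory = bytearray(32)
    entry = code[offset:offset + ENTRY_SIZE].ljust(ENTRY_SIZE, b"\x00")
    memory[0:ENTRY_SIZE] = entry                    # codecopy(0, offset, 2)
    word = int.from_bytes(memory, "big")            # mload(0)
    return word >> (256 - 8 * ENTRY_SIZE)           # move the entry to the low bits


# Hypothetical layout: 7 placeholder bytes of code, then the table.
targets = [0x0010, 0x0123, 0x0200]
code = bytes(7) + build_jump_table(targets)
print([hex(lookup(code, 7, len(targets), i)) for i in range(3)])
```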

@lmittmann
Author

Using string literals in #bytecode would be nice, although it might be a bit complicated to get the logic right. For example, should they be ABI-encoded? Another way would be to put this functionality in a built-in function like .selector. This would give more flexibility, as there could be one function to abi-encode and another to abi-encode-packed.
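To show why the choice matters, here is a Python sketch comparing the two encodings for a single top-level string argument. Standard ABI encoding produces an offset word, a length word, and the data right-padded to a multiple of 32 bytes; the packed form is just the raw bytes. The function names are hypothetical and not part of geas.

```python
def encode_packed_string(text: str) -> bytes:
    """Packed encoding of a string: the raw UTF-8 bytes, no length, no padding."""
    return text.encode("utf-8")


def encode_string(text: str) -> bytes:
    """Standard ABI encoding of one top-level string argument."""
    data = text.encode("utf-8")
    padded_len = (len(data) + 31) // 32 * 32
    head = (32).to_bytes(32, "big")         # offset of the dynamic data
    length = len(data).to_bytes(32, "big")  # byte length of the string
    return head + length + data.ljust(padded_len, b"\x00")


print(len(encode_packed_string("abc")), len(encode_string("abc")))
```

For a three-character string the packed form is 3 bytes while the standard form is 96 bytes, so a raw-bytes directive and an ABI encoder would produce very different output for the same literal.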

What is a use case for macros in #bytecode?

Jump tables are gas-inefficient. I wouldn't bother.

@fjl
Owner

fjl commented Jan 14, 2025

Built-in ABI encoding is not a goal for me personally. The bytes syntax should just be for writing out raw bytes. We can add a special macro to encode basic types maybe. But I don't want to create an encoder for arbitrary data types. And there are no arrays/lists/structs in the geas typing model anyway, so you wouldn't be able to express complex objects.
