bakkot
Member

@bakkot bakkot commented Jul 17, 2025

Pending stage 4.

@bakkot bakkot added the "normative change" (affects behavior required to correctly evaluate some ECMAScript source text) and "pending stage 4" (this proposal has not yet achieved stage 4, but may otherwise be ready to merge) labels Jul 17, 2025
@ljharb ljharb added the "has test262 tests" and "proposal" (this is related to a specific proposal, and will be closed/merged when the proposal reaches stage 4) labels Jul 17, 2025
@jhnaldo
Contributor

jhnaldo commented Jul 24, 2025

I added type modeling for each separate case of TypedArray (e.g., Uint8Array) to resolve the esmeta type check error in ESMeta v0.6.4. Once #3656 is merged, I hope this PR will pass the esmeta typecheck.

@bakkot bakkot added the "has stage 4" (this PR represents a proposal that has achieved stage 4, and is ready to merge) label and removed the "pending stage 4" (this proposal has not yet achieved stage 4, but may otherwise be ready to merge) label Jul 28, 2025
Contributor

@syg syg left a comment

This looks great to me.

Member

@michaelficarra michaelficarra left a comment

Nothing with normative consequences, but lots of little editorial things to fix up. I'll take another look after the current comments are addressed.

@bakkot
Member Author

bakkot commented Aug 20, 2025

Comments addressed, thanks @michaelficarra.

spec.html Outdated
<h1>
DecodeBase64Chunk (
_chunk_: a String,
_throwOnExtraBits_: a Boolean or ~not-applicable~,
Contributor

I think Michael's suggestion for _throwOnExtraBits_ has made this parameter pretty convoluted. I was fine with the previous optional bool; or, if Michael still feels strongly about that being an "abuse", my preference is something like optional _optionalBehavior_: ~throw-on-extra-bits~ instead of the current state.

Member Author

My preference would be to put it back how it was.

Member

I really don't want to use optionality for anything other than defaulting to something you could have passed. If we don't like bool+1, what about a 3-state enum?

Contributor

There are two states that matter; the 3rd one is an artifact. Reflecting that state is more confusing than helpful to the reader, because they have to follow it all the way through only to realize it doesn't affect behavior.

Member

@nicolo-ribaudo nicolo-ribaudo Sep 15, 2025

What if we just have a boolean there, and have a separate DecodeBase64Octet AO that does:

DecodeBase64Octet(_chunk_: a String with length 4)
1. Return ! DecodeBase64Chunk(_chunk_, *true*).
2. NOTE: The above step does not throw because a String of length 4 never has extra bits
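
The NOTE's claim can be checked by counting bits: each base64 character carries 6 bits, so 4 characters carry exactly 24 bits (3 whole bytes), while 2 or 3 characters leave 4 or 2 bits over. A minimal TypeScript sketch of that arithmetic follows; `extraBitCount` is a hypothetical helper used only for illustration, not a spec operation.

```typescript
// Hypothetical helper (not part of the spec): number of bits left over after
// the whole bytes are taken from a chunk of `chunkLength` base64 characters.
function extraBitCount(chunkLength: number): number {
  const totalBits = chunkLength * 6; // each base64 character encodes 6 bits
  return totalBits % 8;              // bits that do not fill a whole byte
}

for (const length of [2, 3, 4]) {
  const wholeBytes = Math.floor((length * 6) / 8);
  console.log(`${length} chars: ${wholeBytes} byte(s), ${extraBitCount(length)} extra bit(s)`);
}
// 2 chars: 1 byte(s), 4 extra bit(s)
// 3 chars: 2 byte(s), 2 extra bit(s)
// 4 chars: 3 byte(s), 0 extra bit(s)
```

Only chunks of length 2 or 3 can carry nonzero extra bits, which is why a throw-on-extra-bits flag has no effect on a chunk of length 4.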

Member

@nicolo-ribaudo nicolo-ribaudo Sep 15, 2025

Well, "octet" isn't the right word, but there is probably a word for 24 bits. Or DecodeBase64CompleteChunk.

EDIT: DecodeBase64Bidozen!

Member

"bidozen" could be both 6 or 24 tho. I think how it was is fine.

Member Author

I started doing this but ended up going in a slightly different direction in f8b8442; now we have

DecodeFinalBase64Chunk (
  _chunk_: a String of length 2 or 3,
  _throwOnExtraBits_: a Boolean,
): either a normal completion containing a List of byte values, or a throw completion

and

DecodeFullLengthBase64Chunk (
  _chunk_: a String of length 4
): a List of byte values of length 3

thoughts?
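
To make the split concrete, here is a rough TypeScript sketch of how the two operations could behave. It is an illustration under assumptions, not the spec algorithm: it assumes the standard base64 alphabet, models a throw completion as a thrown exception, and checks argument shapes at runtime where the spec would rely on its type signatures. The helper `packSextets` and the lower-camel-case function names are made up for this sketch.

```typescript
const STANDARD_ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Hypothetical helper: map each character to its 6-bit value and pack the
// values into one integer, most significant sextet first.
function packSextets(chunk: string): number {
  let bits = 0;
  for (let i = 0; i < chunk.length; i++) {
    const value = STANDARD_ALPHABET.indexOf(chunk.charAt(i));
    if (value === -1) throw new SyntaxError("invalid base64 character");
    bits = (bits << 6) | value;
  }
  return bits;
}

// Sketch of DecodeFullLengthBase64Chunk: 4 characters are 24 bits, i.e.
// exactly 3 bytes, so there is nothing to validate and no flag to pass.
function decodeFullLengthBase64Chunk(chunk: string): number[] {
  if (chunk.length !== 4) throw new RangeError("expected a chunk of length 4");
  const bits = packSextets(chunk);
  return [(bits >> 16) & 0xff, (bits >> 8) & 0xff, bits & 0xff];
}

// Sketch of DecodeFinalBase64Chunk: 2 or 3 characters leave 4 or 2 extra
// bits, which are rejected when nonzero only if throwOnExtraBits is true.
function decodeFinalBase64Chunk(chunk: string, throwOnExtraBits: boolean): number[] {
  if (chunk.length !== 2 && chunk.length !== 3) {
    throw new RangeError("expected a chunk of length 2 or 3");
  }
  const extraBits = chunk.length === 2 ? 4 : 2;
  const bits = packSextets(chunk);
  if (throwOnExtraBits && (bits & ((1 << extraBits) - 1)) !== 0) {
    throw new SyntaxError("extra bits in final chunk are not zero");
  }
  const whole = bits >> extraBits; // drop the extra bits
  return chunk.length === 2 ? [whole & 0xff] : [(whole >> 8) & 0xff, whole & 0xff];
}

console.log(decodeFullLengthBase64Chunk("TWFu").join(","));  // 77,97,110
console.log(decodeFinalBase64Chunk("TWE", true).join(","));  // 77,97
console.log(decodeFinalBase64Chunk("TR", false).join(","));  // 77
```

With this shape, the Boolean only appears on the operation where it can change the outcome, which is the concern raised earlier in the thread about the third state.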

Member

WFM

Contributor

lgtm!

jmdyck previously requested changes Sep 16, 2025
spec.html Outdated
_alphabet_: *"base64"* or *"base64url"*,
_lastChunkHandling_: *"loose"*, *"strict"*, or *"stop-before-partial"*,
optional _maxLength_: a non-negative integer,
): a Record with fields [[Read]] (an integral Number), [[Bytes]] (a List of byte values), and [[Error]] (either ~none~ or a *SyntaxError* object)
Collaborator

Suggested change
- ): a Record with fields [[Read]] (an integral Number), [[Bytes]] (a List of byte values), and [[Error]] (either ~none~ or a *SyntaxError* object)
+ ): a Record with fields [[Read]] (an integer), [[Bytes]] (a List of byte values), and [[Error]] (a *SyntaxError* object or ~none~)
  • Other appearances of [[Read]] indicate that it's an integer, not an integral Number.
  • In a type disjunction, multi-valued types should precede singular values.
  • "either" is unnecessary after a left paren.

Additionally, this Record type should maybe be named and declared separately.

Member Author

Thanks, fixed. I don't personally think it's worth naming this type so I've left it unnamed for now.
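
For readers more used to code than to spec Records, the return shape under discussion could be modelled roughly as below. This is only a sketch: the interface name `FromBase64Result`, the use of `null` for ~none~, and the `bytesOrThrow` consumer are all assumptions made for illustration (the Record stays unnamed in the spec text).

```typescript
// Hypothetical TypeScript rendering of the Record; the name exists only for
// this sketch.
interface FromBase64Result {
  read: number;              // [[Read]]: an integer
  bytes: number[];           // [[Bytes]]: a List of byte values
  error: SyntaxError | null; // [[Error]]: a SyntaxError object or ~none~ (null here)
}

// Hypothetical consumer: surface the error if there is one, else the bytes.
function bytesOrThrow(result: FromBase64Result): Uint8Array {
  if (result.error !== null) throw result.error;
  return Uint8Array.from(result.bytes);
}

const ok: FromBase64Result = { read: 4, bytes: [77, 97, 110], error: null };
console.log(bytesOrThrow(ok).length); // 3
```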

spec.html Outdated
</dl>
<p>The <dfn id="standard-base64-alphabet">standard base64 alphabet</dfn> is *"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"*, i.e., the String whose elements are the code units corresponding to every letter and number in the Unicode Basic Latin block along with *"+"* and *"/"*.</p>
<emu-alg>
1. Let _byteSequence_ be the unique sequence of 3 bytes resulting from decoding _chunk_ as base64 (such that applying the base64 encoding specified in section 4 of <a href="https://datatracker.ietf.org/doc/html/rfc4648">RFC 4648</a> to _byteSequence_ would produce _chunk_).
Collaborator

Normally, the spec uses "such that" to express a constraint, but that's presumably not what's intended here. Maybe:

Suggested change
- 1. Let _byteSequence_ be the unique sequence of 3 bytes resulting from decoding _chunk_ as base64 (such that applying the base64 encoding specified in section 4 of <a href="https://datatracker.ietf.org/doc/html/rfc4648">RFC 4648</a> to _byteSequence_ would produce _chunk_).
+ 1. Let _byteSequence_ be the unique sequence of 3 bytes resulting from decoding _chunk_ as base64. (That is, applying the base64 encoding specified in section 4 of <a href="https://datatracker.ietf.org/doc/html/rfc4648">RFC 4648</a> to _byteSequence_ would produce _chunk_.)

I also tried the parenthetical as a NOTE step and as an Assert step. They work, but they don't have the same feel.

Alternatively, don't get into "decode" at all:

Suggested change
- 1. Let _byteSequence_ be the unique sequence of 3 bytes resulting from decoding _chunk_ as base64 (such that applying the base64 encoding specified in section 4 of <a href="https://datatracker.ietf.org/doc/html/rfc4648">RFC 4648</a> to _byteSequence_ would produce _chunk_).
+ 1. Let _byteSequence_ be the unique sequence of 3 bytes such that applying the base64 encoding specified in section 4 of <a href="https://datatracker.ietf.org/doc/html/rfc4648">RFC 4648</a> to _byteSequence_ would produce _chunk_.

Member Author

I went with "i.e., the sequence such that", which makes it clearly express a constraint but, I think, reads better than either of these suggestions. In particular, I do want to continue to describe this as "decoding" because that is how everyone thinks of it, even though the RFC does not formally define a decoding algorithm.
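
The "decoding as the inverse of encoding" reading can be illustrated with a short TypeScript sketch. It assumes the standard base64 alphabet quoted above and a well-formed 4-character chunk; `encodeThreeBytes` and `decodeChunk` are hypothetical names, and the encoder covers only the 3-byte case of RFC 4648 section 4 (no padding).

```typescript
const STANDARD_BASE64_ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Encode exactly 3 bytes as 4 base64 characters: split the 24 bits into four
// 6-bit groups and look each one up in the alphabet.
function encodeThreeBytes(bytes: [number, number, number]): string {
  const bits = (bytes[0] << 16) | (bytes[1] << 8) | bytes[2];
  return [18, 12, 6, 0]
    .map((shift) => STANDARD_BASE64_ALPHABET.charAt((bits >> shift) & 0x3f))
    .join("");
}

// Decode a 4-character chunk by inverting the mapping above.
// Assumes every character of `chunk` is in the alphabet.
function decodeChunk(chunk: string): [number, number, number] {
  let bits = 0;
  for (let i = 0; i < 4; i++) {
    bits = (bits << 6) | STANDARD_BASE64_ALPHABET.indexOf(chunk.charAt(i));
  }
  return [(bits >> 16) & 0xff, (bits >> 8) & 0xff, bits & 0xff];
}

const chunk = "TWFu";
const byteSequence = decodeChunk(chunk);
console.log(byteSequence.join(","));                   // 77,97,110
console.log(encodeThreeBytes(byteSequence) === chunk); // true
```

Because the 3-byte encoding is a bijection between 24-bit values and 4-character strings over the alphabet, the byte sequence satisfying the constraint is unique, which is what the spec wording relies on.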

The rendered spec for this PR is available at https://tc39.es/ecma262/pr/3655.

@michaelficarra michaelficarra added the "ready to merge" (editors believe this PR needs no further reviews, and is ready to land) label Sep 22, 2025
@ljharb ljharb dismissed stale reviews from michaelficarra and jmdyck September 23, 2025 16:37

changes addressed

@ljharb ljharb merged commit 3dfa316 into main Sep 23, 2025
9 checks passed
@ljharb ljharb deleted the uint8array-base64 branch September 23, 2025 16:46
Jack-Works added a commit to engine262/engine262 that referenced this pull request Oct 6, 2025

7 participants