-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: indexer support for cbor encoding #139
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great start!
} catch (err) { | ||
debug('Error processing entries: %s', err) | ||
return { | ||
error: /** @type {const} */('ENTRIES_NOT_RETRIEVABLE'), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's create a new error code for this error case. We were able to retrieve the entries, but we are not able to extract the payload CID from the retrieved chunk. The code ENTRIES_NOT_RETRIEVABLE
does not describe such situation correctly.
const codec = parsedCid.code | ||
|
||
switch (codec) { | ||
case 297: // DAG-JSON: https://github.com/multiformats/multicodec/blob/master/table.csv#L113 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not a permalink. If a new entry is added to table.csv
, the line L113 may no longer point to DAG-JSON entry. You can create a permalink by pressing y
- GitHub web UI will change the URL to point to the commit SHA you are viewing.
I think it's not necessary to link to specific lines in the multicodec table. I would simply add a link to https://github.com/multiformats/multicodec/blob/master/table.csv before the switch statement.
// List of codecs:
// https://github.com/multiformats/multicodec/blob/master/table.csv
switch (codec) {
case 297: // DAG-JSON
case 113: // DAG-CBOR
}
I'll let you decide which approach you prefer. Please update line 364 below in sync with this line.
*/ | ||
export function processEntries (entriesCid, entriesChunk) { | ||
if (!entriesChunk.Entries || !entriesChunk.Entries.length) { | ||
throw new Error('No entries found in DAG-CBOR response') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you sure it's only DAG-CBOR response that can have no entries? Cannot it happen with DAG-JSON response too?
throw new Error('No entries found in DAG-CBOR response') | |
throw new Error(`No entries found in the response for ${entriesCid}`) |
// Verify the '/' property is an object with 'bytes' property | ||
// In DAG-JSON, CIDs are represented as objects with a '/' property that contains 'bytes' | ||
if (!entry['/'] || typeof entry['/'] !== 'object' || !('bytes' in entry['/'])) { | ||
throw new Error('DAG-JSON entry\'s "/" property must be a CID object with a bytes property') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I understand, IPNI entries are multihashes, not CIDs. (That's why we have to convert them to CIDs by adding a Raw codec.) In that case, the error message is not correct.
throw new Error('DAG-JSON entry\'s "/" property must be a CID object with a bytes property') | |
throw new Error('DAG-JSON entry\'s "/" property must be an object with a "bytes" property') |
} | ||
|
||
const entryHash = entry['/'].bytes | ||
entryBytes = Buffer.from(String(entryHash), 'base64') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is it necessary to convert entryHash
to a String? I believe it is already a string.
The original version was using this conversion:
Buffer.from(entryHash, 'base64')
What am I missing?
If you want to better handle non-string bytes
values, then we should reject them.
if (typeof entryHash !== 'string') {
throw new Error('DAG-JSON entry\'s ["/"]["bytes"] property must be a string')
}
That should satisfy the TypeScript checker to allow you to call Buffer.from(entryHash)
.
// Restore the original fetch function after each test | ||
globalThis.fetch = originalFetch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's not mock the global fetch function and instead inject it to fetchCid
function. It's a pattern we already use in other repositories.
The trick is to define the fetch
parameter as optional and defaulting to the global fetch function.
export async function fetchCid (providerBaseUrl, cid, { fetchTimeout, fetch } = { globalThis.fetch }) {
// ...
}
// in tests
const fetch = async () => {
ok: true,
status: 200,
json: () => Promise.resolve(jsonResponse),
arrayBuffer: () => { throw new Error('Should not call arrayBuffer for JSON') }
})
})
const result = await fetchCid('http://example.com', dagJsonCid, { fetch })
// ...
Also notice that the following line can be simplified:
let fn
fn = () => { return Promise.resolve(/* result */) }
// is the same as
fn = () => Promise.resolve(/* result */)
// can be also written as
fn = async () => /* result */
}) | ||
|
||
it('uses DAG-CBOR codec (0x71) to parse response as CBOR', async () => { | ||
const cborData = cbor.encode(jsonResponse) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a bit unexpected - jsonResponse encoded using cbor encoding. I was expecting to see cbor.encode(cborResponse)
.
After reading the rest of the test, I see that it does not matter what is the structure of the response body. Let's rename jsonResponse
to something like someResponse
or testResponse
to make it more clear this object is not related to whether we use JSON or CBOR encoding.
const unknownCodecCid = 'bafkreigrnnl64xuevvkhknbhrcqzbdvvmqnchp7ae2a4ulninsjoc5svoq' | ||
const parsedCid = CID.parse(unknownCodecCid) | ||
assert.strictEqual(parsedCid.code, 85) | ||
const errorMessage = 'To parse non base32, base36 or base58btc encoded CID multibase decoder must be provided' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am confused about this error message. The CID in unknownCodecCid is encoded using base32 as you can verify here: https://cid.ipfs.tech/#bafkreigrnnl64xuevvkhknbhrcqzbdvvmqnchp7ae2a4ulninsjoc5svoq
I am expecting an error message telling us that we encountered an unsupported codec 85.
Perhaps the test is triggering a different error path than you thought?
// @ts-ignore | ||
const result = await pRetry( | ||
() => | ||
( | ||
fetchCid(curioProviderUrl, dagCborCid) | ||
) | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we have to disable type-checking for this statement?
Let's try to find a way how to tell TypeScript what we are trying to accomplish.
// @ts-ignore | ||
const entriesChunk = await pRetry( | ||
() => | ||
( | ||
fetchCid(curioProviderUrl, entriesCid) | ||
) | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto.
Add support for DAG-CBOR encoded entries
This PR adds support for DAG-CBOR encoded entries (CIDs starting with "bafy", codec 0x71) in addition to the existing DAG-JSON format. This enables the piece-indexer to process advertisements from providers like Curio that use DAG-CBOR encoding.
The implementation:
Closes #119
Implementation Plan: #119 (comment)
Parent Issue: #132