feat(PE-6695/PE-6696): remove resolver, add redis support for arns cache #200

dtfiedler · 2024-09-09T22:21:50Z

With on-demand resolution, we no longer need to support the arns-resolver. We can effectively fetch and cache arns resolutions quickly via AO and gateways. In addition to removing the standalone-arns-resolver implementation and service from docker, this PR replaces the default MemoryArNSCache with a configurable KvBufferStore that supports redis or a local node-cache. When resolving an arns name, the CompositeArNSResolver will check the provided cache and the TTL of the record, if it is not in the cache it will then use the available arns resolvers (on-demand and/or another gateway) to get resolution data. If an operator would like to disable caching of arns names, and always resolve to the latest they can set ARNS_CACHE_TTL_SECONDS to 0.

Additionally, prometheus metrics are available for hit/miss rate for the arns cache and total resolution times.

Logs

ore-1  | { event: 'up', message: 'applied 0 migrations.' }
core-1  | info: Using redis for KVBufferStore for partial-blocks {"timestamp":"2024-09-09T21:43:01.559Z"}
core-1  | info: Using redis for KVBufferStore for partial-txs {"timestamp":"2024-09-09T21:43:01.596Z"}
core-1  | info: Using redis as KVBufferStore for arns {"maxKeys":10000,"redisUrl":"redis://redis:6379","timestamp":"2024-09-09T21:43:01.649Z","ttlSeconds":3600,"type":"redis"}
core-1  | info: Using on-demand,gateway for arns name resolution {"timestamp":"2024-09-09T21:43:01.653Z"}
core-1  | info: WebhookEmitter not initialized. No WEBHOOK_TARGET_SERVERS are set. {"class":"WebhookEmitter","timestamp":"2024-09-09T21:43:01.656Z"}
core-1  | info: Using redis for KVBufferStore for signatures {"timestamp":"2024-09-09T21:43:01.656Z"}
core-1  | info: Starting worker {"class":"FsCleanupWorker","timestamp":"2024-09-09T21:43:02.492Z"}
core-1  | info: Starting worker {"class":"FsCleanupWorker","timestamp":"2024-09-09T21:43:02.493Z"}
core-1  | info: Listening on port 4000 {"timestamp":"2024-09-09T21:43:02.733Z"}
core-1  | info: No more files to delete, restarting from the base path {"class":"FsCleanupWorker","timestamp":"2024-09-09T21:43:02.765Z"}
core-1  | info: Deleting 3 files in data/contiguous} {"class":"FsCleanupWorker","timestamp":"2024-09-09T21:43:03.023Z"}
core-1  | info: No more files to delete, restarting from the base path {"class":"FsCleanupWorker","timestamp":"2024-09-09T21:43:08.030Z"}
core-1  | info: Resolving name... {"class":"CompositeArNSResolver","name":"ao","timestamp":"2024-09-09T21:43:23.552Z"}
core-1  | info: Cache hit for arns name {"class":"CompositeArNSResolver","name":"ao","timestamp":"2024-09-09T21:43:23.554Z"}
core-1  | info: Resolving manifest path from index... {"class":"StreamingManifestPathResolver","id":"hviIKJhAIow-Ouy4PJg12HA7Hbi4rYCdnEae3NxpZh8","path":"","timestamp":"2024-09-09T21:43:23.565Z"}
core-1  | warn: Unable to resolve manifest path from index: not implemented {"class":"StreamingManifestPathResolver","timestamp":"2024-09-09T21:43:23.565Z"}
core-1  | info: Resolving manifest path from data... {"class":"StreamingManifestPathResolver","id":"hviIKJhAIow-Ouy4PJg12HA7Hbi4rYCdnEae3NxpZh8","path":"","timestamp":"2024-09-09T21:43:23.568Z"}
core-1  | info: Resolved manifest path from data {"class":"StreamingManifestPathResolver","id":"hviIKJhAIow-Ouy4PJg12HA7Hbi4rYCdnEae3NxpZh8","path":"","resolvedId":"pBwuC6ChLLJHGXV_7WEfB5rgfjh6YsPqEIIy0_qDNGY","timestamp":"2024-09-09T21:43:23.589Z"}
core-1  | info: Resolving name... {"class":"CompositeArNSResolver","name":"permagate","timestamp":"2024-09-09T21:43:34.060Z"}
core-1  | info: Cache miss for arns name {"class":"CompositeArNSResolver","name":"permagate","timestamp":"2024-09-09T21:43:34.061Z"}
core-1  | info: Attempting to resolve name with resolver {"class":"CompositeArNSResolver","name":"permagate","timestamp":"2024-09-09T21:43:34.061Z","type":"OnDemandArNSResolver"}
core-1  | info: Resolving name... {"class":"OnDemandArNSResolver","name":"permagate","timestamp":"2024-09-09T21:43:34.062Z"}
core-1  | info: Resolved name {"class":"CompositeArNSResolver","name":"permagate","resolution":{"name":"permagate","processId":"GQvFTAAi-oZn9f0txkajnVU18hlJKeOI2eqYOv3M8cg","resolvedAt":1725918215268,"resolvedId":"eUHEOYu1TjVk8CuQYzhQ9_jevCbJjihDrf3IJ-XvadQ","ttl":3600},"timestamp":"2024-09-09T21:43:35.269Z"}
core-1  | info: Resolving name... {"class":"CompositeArNSResolver","name":"permagate","timestamp":"2024-09-09T21:43:45.226Z"}
core-1  | info: Cache hit for arns name {"class":"CompositeArNSResolver","name":"permagate","timestamp":"2024-09-09T21:43:45.227Z"}
core-1  | info: Resolving name... {"class":"CompositeArNSResolver","name":"aolink","timestamp":"2024-09-09T21:43:53.430Z"}
core-1  | info: Cache miss for arns name {"class":"CompositeArNSResolver","name":"aolink","timestamp":"2024-09-09T21:43:53.431Z"}
core-1  | info: Attempting to resolve name with resolver {"class":"CompositeArNSResolver","name":"aolink","timestamp":"2024-09-09T21:43:53.431Z","type":"OnDemandArNSResolver"}
core-1  | info: Resolving name... {"class":"OnDemandArNSResolver","name":"aolink","timestamp":"2024-09-09T21:43:53.432Z"}
core-1  | info: Resolved name {"class":"CompositeArNSResolver","name":"aolink","resolution":{"name":"aolink","processId":"NwutOidcJmE08SMFfyL0BLDcE4mBYJt1roifbmhGTMk","resolvedAt":1725918235751,"resolvedId":"1kJc3Y5lPMXqYuHFsYW0B2-zc0VT3IETq_jkTBbX6CI","ttl":3600},"timestamp":"2024-09-09T21:43:55.752Z"}

Metrics

# HELP arns_cache_hit_total Number of hits in the arns cache
# TYPE arns_cache_hit_total counter
arns_cache_hit_total 4

# HELP arns_cache_miss_total Number of misses in the arns cache
# TYPE arns_cache_miss_total counter
arns_cache_miss_total 2

# HELP arns_resolution_time_ms Time it takes to resolve an arns name
# TYPE arns_resolution_time_ms summary
arns_resolution_time_ms{quantile="0.01"} 1
arns_resolution_time_ms{quantile="0.05"} 1
arns_resolution_time_ms{quantile="0.5"} 2.5
arns_resolution_time_ms{quantile="0.9"} 2211.6000000000004
arns_resolution_time_ms{quantile="0.95"} 2323
arns_resolution_time_ms{quantile="0.99"} 2323
arns_resolution_time_ms{quantile="0.999"} 2323
arns_resolution_time_ms_sum 3538
arns_resolution_time_ms_count 6

codecov · 2024-09-09T22:22:28Z

Codecov Report

Attention: Patch coverage is 95.65217% with 2 lines in your changes missing coverage. Please review.

Project coverage is 70.16%. Comparing base (bbf79b1) to head (b9e4fe6).
Report is 11 commits behind head on develop.

Files with missing lines	Patch %	Lines
src/store/lmdb-kv-store.ts	50.00%	2 Missing ⚠️

Additional details and impacted files

@@             Coverage Diff             @@
##           develop     #200      +/-   ##
===========================================
+ Coverage    68.53%   70.16%   +1.63%     
===========================================
  Files           32       32              
  Lines         7795     7829      +34     
  Branches       438      438              
===========================================
+ Hits          5342     5493     +151     
+ Misses        2452     2336     -116     
+ Partials         1        0       -1

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

dtfiedler · 2024-09-09T22:23:52Z

docker-compose.yaml

-  resolver:
-    image: ghcr.io/ar-io/arns-resolver:${RESOLVER_IMAGE_TAG:-7fe02ecda2027e504248d3f3716579f60b561de5}
-    restart: on-failure
-    ports:
-      - 6000:6000
-    environment:
-      - PORT=6000
-      - LOG_LEVEL=${LOG_LEVEL:-info}
-      - IO_PROCESS_ID=${IO_PROCESS_ID:-}
-      - RUN_RESOLVER=${RUN_RESOLVER:-false}
-      - EVALUATION_INTERVAL_MS=${EVALUATION_INTERVAL_MS:-}
-      - ARNS_CACHE_TTL_MS=${RESOLVER_CACHE_TTL_MS:-}
-      - ARNS_CACHE_PATH=${ARNS_CACHE_PATH:-./data/arns}
-      - AO_CU_URL=${AO_CU_URL:-}
-      - AO_MU_URL=${AO_MU_URL:-}
-      - AO_GATEWAY_URL=${AO_GATEWAY_URL:-}
-      - AO_GRAPHQL_URL=${AO_GRAPHQL_URL:-}
-    volumes:
-      - ${ARNS_CACHE_PATH:-./data/arns}:/app/data/arns
-    networks:
-      - ar-io-network


With on-demand resolution, we no longer need to support the arns-resolver. We can effectively fetch and cache arns resolutions quickly via AO and cache them locally or in redis. This replaces the default `MemoryArNSCache` with a configurable KvBufferStore that supports `redis` or a local `node-cache`. When resolving an arns name, the `CompositeArNSResolver` will check the provided cache and the TTL of the record, if it is not in the cache it will then use the available arns resolvers (on-demand and/or another gateway) to get resolution data. If an operator would like to disable caching of arns names, and always resolve to the latest they can set ARNS_CACHE_TTL_SECONDS to 0. Additionally, prometheus metrics are available for hit/miss rate for the arns cache and total resolution times.

…or arns in cache This is to avoid doing it in the composite resolver

coderabbitai · 2024-09-09T22:41:32Z

Walkthrough

The changes involve a comprehensive reconfiguration of the ARNS resolution system, including the removal of the resolver service and the introduction of a new caching mechanism through KvArnsStore. The environment variable management has been updated with the addition of ARNS_CACHE_TYPE and the removal of deprecated variables. New metrics for monitoring ARNS cache performance have been introduced, and various classes have been enhanced to support caching and improved connection management. These modifications indicate a shift towards a more modular and efficient architecture for ARNS resolution.

Changes

Files	Change Summary
`docker-compose.yaml`	Removed `resolver` service; added `ARNS_CACHE_TYPE` environment variable.
`src/config.ts`	Added `ARNS_CACHE_TYPE`; modified `ARNS_RESOLVER_PRIORITY_ORDER`; removed deprecated constants.
`src/init/resolvers.ts`	Added `createArNSKvStore` function; modified `createArNSResolver` to include a caching parameter.
`src/metrics.ts`	Added metrics for ARNs cache hits, misses, and resolution time.
`src/middleware/arns.ts`	Enhanced `createArnsMiddleware` with caching for ARNs resolution promises and performance metrics.
`src/resolution/composite-arns-resolver.ts`	Added caching mechanism to `CompositeArNSResolver`; modified resolution logic to utilize cache.
`src/resolution/on-demand-arns-resolver.ts`	Updated `OnDemandArNSResolver` to integrate with `AoClient` and optimize connection management.
`src/store/fs-kv-store.ts`	Added `close` method to `FsKVStore` class.
`src/store/kv-arns-store.ts`	Introduced `KvArnsStore` class for ARNs key-value storage with prefixed keys.
`src/store/lmdb-kv-store.ts`	Added `close` method to `LmdbKVStore` class.
`src/store/node-kv-store.ts`	Introduced `NodeKvStore` class for caching `Buffer` objects with TTL and max key limits.
`src/store/redis-kv-store.ts`	Added `close` method to `RedisKvStore` class for safe connection closure.
`src/system.ts`	Replaced `MemoryCacheArNSResolver` with `KvArnsStore`; redefined `nameResolver` and added cache cleanup.
`src/types.d.ts`	Added `close()` method to `KVBufferStore` type.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant ARNSResolver
    participant Cache
    participant DataSource

    Client->>ARNSResolver: Request ARNS resolution
    ARNSResolver->>Cache: Check for cached result
    alt Cache hit
        Cache-->>ARNSResolver: Return cached result
    else Cache miss
        ARNSResolver->>DataSource: Resolve ARNS
        DataSource-->>ARNSResolver: Return resolved ARNS
        ARNSResolver->>Cache: Store result in cache
    end
    ARNSResolver-->>Client: Return ARNS resolution result

Possibly related PRs

feat(arns): allow for on-demand name resolution, introduce composite arns resolver with fallbacks #189: Changes in this PR involve modifications to the src/init/resolvers.ts file, directly relating to the new caching mechanisms and resolver configurations introduced in the main PR, particularly regarding ARNS resolution and caching strategies.

Tip

Announcements

The review status is no longer posted as a separate comment when there are no actionable or nitpick comments. In such cases, the review status is included in the walkthrough comment.
We have updated our review workflow to use the Anthropic's Claude family of models. Please share any feedback in the discussion post on our Discord.
Possibly related PRs: Walkthrough comment now includes a list of potentially related PRs to help you recall past context. Please share any feedback in the discussion post on our Discord.
Suggested labels: CodeRabbit can now suggest labels by learning from your past PRs in the walkthrough comment. You can also provide custom labeling instructions in the UI or configuration file.
Possibly related PRs, automatic label suggestions based on past PRs, learnings, and possibly related issues require data opt-in (enabled by default).

Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between b8afa1c and b9e4fe6.

Files selected for processing (1)

src/resolution/composite-arns-resolver.ts (1 hunks)

Files skipped from review as they are similar to previous changes (1)

src/resolution/composite-arns-resolver.ts

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

Share

Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>.
- Generate unit testing code for this file.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
- @coderabbitai generate unit testing code for this file.
- @coderabbitai modularize this function.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai generate interesting stats about this repository and render them as a table.
- @coderabbitai show all the console.log statements in this repository.
- @coderabbitai read src/utils.ts and generate unit testing code.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
- @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

@coderabbitai pause to pause the reviews on a PR.
@coderabbitai resume to resume the paused reviews.
@coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
@coderabbitai full review to do a full review from scratch and review all the files again.
@coderabbitai summary to regenerate the summary of the PR.
@coderabbitai resolve resolve all the CodeRabbit review comments.
@coderabbitai configuration to show the current CodeRabbit configuration for the repository.
@coderabbitai help to get help.

Other keywords and placeholders

Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (`.coderabbit.yaml`)

You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
Please see the configuration documentation for more information.
If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

karlprieb · 2024-09-10T12:30:56Z

src/resolution/composite-arns-resolver.ts

+          });
+          const resolution = await resolver.resolve(name);
+          if (resolution.resolvedId !== undefined) {
+            await this.cache.set(name, Buffer.from(JSON.stringify(resolution)));


I was thinking about the need to await the cache set here. For other caches we generally don't wait for it to be set.
In this case, I imagine if we call resolve for the same name several times (while the first is being cached) the sequential calls, while the name is not cached yet, will not wait for the cache, right? If so I don't think we should wait for the cache to be set.

What do you think?

good call out, removed in f1bd0cf -

when thinking about the implications of this, i also realized we aren't handling concurrent requests well. I'd like to propose adding a request cache in the arns middleware that protects against concurrent calls to nameResolver.resolve(name) by storing the promise in a NodeCache

// check if this instance is already in the process of resolving the requested name, and return that promise if so, otherwise set it in the cache const getArnsResolutionPromise = async (): Promise<NameResolution> => { if (arnsRequestCache.has(arnsSubdomain)) { const arnsResolutionPromise = arnsRequestCache.get<Promise<NameResolution>>(arnsSubdomain); if (arnsResolutionPromise) { return arnsResolutionPromise; } } const arnsResolutionPromise = nameResolver.resolve(arnsSubdomain); arnsRequestCache.set(arnsSubdomain, arnsResolutionPromise); return arnsResolutionPromise; }; const start = Date.now(); const { resolvedId, ttl, processId } = await getArnsResolutionPromise().finally(() => { // remove from cache after resolution arnsRequestCache.del(arnsSubdomain); }); metrics.arnsResolutionTime.observe(Date.now() - start); if (resolvedId === undefined) { sendNotFound(res); return; } res.header(headerNames.arnsResolvedId, resolvedId); res.header(headerNames.arnsTtlSeconds, ttl.toString()); res.header(headerNames.arnsProcessId, processId); // TODO: add a header for arns cache status res.header('Cache-Control', `public, max-age=${ttl}`); dataHandler(req, res, next);

After adding this change, I did some tests using hey.

Test: 100 concurrent requests for an arns name

Before:

Each arns request created independent promises to AO, resulting in a wide range of resolution times and some experiencing rate limiting/throttling having to fallback to TrustedArNSGateway resolvers.

❯ hey -n 100 -c 100 -t 60 -host 'gateways.example.com' http://localhost:4000 Summary: Total: 65.3634 secs Slowest: 65.3604 secs Fastest: 17.1049 secs Average: 31.1251 secs Requests/sec: 1.5299 Total data: 172414200 bytes Size/request: 1724142 bytes Response time histogram: 17.105 [1] |■ 21.930 [71] |■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■ 26.756 [3] |■■ 31.582 [0] | 36.407 [0] | 41.233 [0] | 46.058 [0] | 50.884 [0] | 55.709 [0] | 60.535 [6] |■■■ 65.360 [19] |■■■■■■■■■■■ Latency distribution: 10% in 19.6950 secs 25% in 20.5840 secs 50% in 21.3610 secs 75% in 59.9175 secs 90% in 63.3682 secs 95% in 64.5496 secs 99% in 65.3604 secs Details (average, fastest, slowest): DNS+dialup: 0.0074 secs, 17.1049 secs, 65.3604 secs DNS-lookup: 0.0026 secs, 0.0016 secs, 0.0042 secs req write: 0.0002 secs, 0.0000 secs, 0.0020 secs resp wait: 27.6524 secs, 15.5326 secs, 65.3426 secs resp read: 3.4650 secs, 0.0111 secs, 6.5513 secs Status code distribution: [200] 100 responses

Logs - these are printed 100 times (one for each request) and several experience rate limits from AO infrastructure, forcing the CompositeArNSResolver to fallback to gateways to get resolution data

core-1 | info: Attempting to resolve name with resolver {"class":"CompositeArNSResolver","name":"joose","timestamp":"2024-09-10T14:19:59.052Z","type":"TrustedGatewayArNSResolver"} core-1 | info: Resolving name... {"class":"TrustedGatewayArNSResolver","name":"joose","timestamp":"2024-09-10T14:19:59.052Z"} core-1 | warn: Unable to resolve name: fetch failed {"class":"OnDemandArNSResolver","name":"joose","timestamp":"2024-09-10T14:19:59.054Z"} core-1 | info: Attempting to resolve name with resolver {"class":"CompositeArNSResolver","name":"joose","timestamp":"2024-09-10T14:19:59.054Z","type":"TrustedGatewayArNSResolver"} core-1 | info: Resolving name... {"class":"TrustedGatewayArNSResolver","name":"joose","timestamp":"2024-09-10T14:19:59.055Z"} core-1 | warn: Unable to resolve name: fetch failed {"class":"OnDemandArNSResolver","name":"joose","timestamp":"2024-09-10T14:19:59.056Z"} core-1 | info: Attempting to resolve name with resolver {"class":"CompositeArNSResolver","name":"joose","timestamp":"2024-09-10T14:19:59.056Z","type":"TrustedGatewayArNSResolver"} core-1 | info: Resolving name... {"class":"TrustedGatewayArNSResolver","name":"joose","timestamp":"2024-09-10T14:19:59.057Z"} core-1 | warn: Unable to resolve name: fetch failed {"class":"OnDemandArNSResolver","name":"joose","timestamp":"2024-09-10T14:19:59.060Z"} core-1 | info: Attempting to resolve name with resolver {"class":"CompositeArNSResolver","name":"joose","timestamp":"2024-09-10T14:19:59.060Z","type":"TrustedGatewayArNSResolver"} core-1 | info: Resolving name... {"class":"TrustedGatewayArNSResolver","name":"joose","timestamp":"2024-09-10T14:19:59.060Z"} core-1 | warn: Unable to resolve name: fetch failed {"class":"OnDemandArNSResolver","name":"joose","timestamp":"2024-09-10T14:19:59.064Z"} core-1 | info: Attempting to resolve name with resolver {"class":"CompositeArNSResolver","name":"joose","timestamp":"2024-09-10T14:19:59.065Z","type":"TrustedGatewayArNSResolver"} core-1 | info: Resolving name... {"class":"TrustedGatewayArNSResolver","name":"joose","timestamp":"2024-09-10T14:19:59.065Z"} core-1 | warn: Unable to resolve name: fetch failed {"class":"OnDemandArNSResolver","name":"joose","timestamp":"2024-09-10T14:19:59.069Z"} core-1 | info: Attempting to resolve name with resolver {"class":"CompositeArNSResolver","name":"joose","timestamp":"2024-09-10T14:19:59.069Z","type":"TrustedGatewayArNSResolver"} core-1 | info: Resolving name... {"class":"TrustedGatewayArNSResolver","name":"joose","timestamp":"2024-09-10T14:19:59.069Z"} core-1 | warn: Unable to resolve name: fetch failed {"class":"OnDemandArNSResolver","name":"joose","timestamp":"2024-09-10T14:19:59.074Z"} core-1 | info: Attempting to resolve name with resolver {"class":"CompositeArNSResolver","name":"joose","timestamp":"2024-09-10T14:19:59.074Z","type":"TrustedGatewayArNSResolver"} core-1 | info: Resolving name... {"class":"TrustedGatewayArNSResolver","name":"joose","timestamp":"2024-09-10T14:19:59.074Z"} core-1 | warn: Unable to resolve name: fetch failed {"class":"OnDemandArNSResolver","name":"joose","timestamp":"2024-09-10T14:19:59.078Z"} core-1 | info: Attempting to resolve name with resolver {"class":"CompositeArNSResolver","name":"joose","timestamp":"2024-09-10T14:19:59.078Z","type":"TrustedGatewayArNSResolver"} core-1 | info: Resolving name... {"class":"TrustedGatewayArNSResolver","name":"joose","timestamp":"2024-09-10T14:19:59.078Z"}

After

All 100 requests were satisfied with the single request to AO, and resolution times were approx. the same

❯ hey -n 100 -c 100 -t 60 -host 'gateways.example.com' http://localhost:4000 Summary: Total: 44.0368 secs Slowest: 44.0354 secs Fastest: 43.6696 secs Average: 43.8166 secs Requests/sec: 2.2708 Total data: 172414200 bytes Size/request: 1724142 bytes Response time histogram: 43.670 [1] |■ 43.706 [0] | 43.743 [8] |■■■■■■■■■■ 43.779 [22] |■■■■■■■■■■■■■■■■■■■■■■■■■■■■ 43.816 [31] |■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■ 43.853 [10] |■■■■■■■■■■■■■ 43.889 [17] |■■■■■■■■■■■■■■■■■■■■■■ 43.926 [3] |■■■■ 43.962 [4] |■■■■■ 43.999 [3] |■■■■ 44.035 [1] |■ Latency distribution: 10% in 43.7457 secs 25% in 43.7732 secs 50% in 43.8027 secs 75% in 43.8615 secs 90% in 43.9001 secs 95% in 43.9574 secs 99% in 44.0354 secs Details (average, fastest, slowest): DNS+dialup: 0.0091 secs, 43.6696 secs, 44.0354 secs DNS-lookup: 0.0021 secs, 0.0015 secs, 0.0029 secs req write: 0.0002 secs, 0.0000 secs, 0.0016 secs resp wait: 42.9208 secs, 42.9030 secs, 42.9412 secs resp read: 0.8865 secs, 0.7390 secs, 1.0992 secs Status code distribution: [200] 100 responses

Logs - only printed once for all 100 requests, no falling back to gateway resolver necessary

core-1 | info: Cache miss for arns name {"class":"CompositeArNSResolver","name":"gateways","timestamp":"2024-09-10T13:34:09.999Z"} core-1 | info: Attempting to resolve name with resolver {"class":"CompositeArNSResolver","name":"gateways","timestamp":"2024-09-10T13:34:09.999Z","type":"OnDemandArNSResolver"} core-1 | info: Resolving name... {"class":"OnDemandArNSResolver","name":"gateways","timestamp":"2024-09-10T13:34:10.000Z"}

This should also help slamming AO on fresh or expired names and likely a pattern would could extend to other request paths.

cc @djwhitt

karlprieb · 2024-09-10T12:34:59Z

src/system.ts

@@ -616,6 +623,7 @@ export const shutdown = async (express: Server) => {
      eventEmitter.removeAllListeners();
      arIODataSource.stopUpdatingPeers();
      dataSqliteWalCleanupWorker?.stop();
+      await arnsResolverCache.close();


This protects against multiple calls to `nameResolver.resolver(name)` while resolving a single name. The cache is given a short TTL and requests are removed after they are resolved. The best way to verify this behavior is trigger concurrent requests to resolve an arns name (e.g. using `hey`) and observing the logs to resolve the name only appear once for all the requests. Once the original promise is resolved, every outstanding request is resolved with the result and the name & promise are removed from the request cache.

djwhitt

Do you think it's worth putting a circuit breaker around the AO requests?

djwhitt · 2024-09-10T17:56:46Z

src/resolution/composite-arns-resolver.ts

+            resolver,
+            message: error.message,
+            stack: error.stack,
+          });
        }
      }


Should we fall back to the cached resolution here, perhaps with some staleness threshold, if we have one?

sure, can add that, we can use the TTL of the cache as the staleness threshold. that would mean if the cache has it, and we can't fetch anything new - return what the cache has until it expires.

modified to return the cached resolution data on error - if we have it - here - b9e4fe6

dtfiedler · 2024-09-10T18:18:50Z

Do you think it's worth putting a circuit breaker around the AO requests?

I do think we should, especially considering how long its taken to resolve the issues we've seen today. Will add.

dtfiedler · 2024-09-10T18:43:55Z

Do you think it's worth putting a circuit breaker around the AO requests?

I do think we should, especially considering how long its taken to resolve the issues we've seen today. Will add.

PR here to add circuit breaker in this PR: #201

This should help avoid slamming AO when there are intermittent issues

…solvers If we have the resolution in the cache, and we fail to fetch new data from the resolvers, returned the cached resolution data. This data will expire based on the the ARNS_CACHE_TTL_SECONDS config variable.

dtfiedler commented Sep 9, 2024

View reviewed changes

dtfiedler force-pushed the PE-6695-redis-arns branch from 3c62e33 to 22b8f71 Compare September 9, 2024 22:25

dtfiedler force-pushed the PE-6695-redis-arns branch from 22b8f71 to 037417e Compare September 9, 2024 22:26

chore(kv): create abstract KvArnsStore to handle the hashing keys f…

3b09c9e

…or arns in cache This is to avoid doing it in the composite resolver

dtfiedler marked this pull request as ready for review September 9, 2024 22:41

dtfiedler requested review from djwhitt and karlprieb September 9, 2024 22:41

dtfiedler changed the title ~~feat(arns): remove resolver, add redis support for arns cache~~ feat(PE-6695/PE-6696): remove resolver, add redis support for arns cache Sep 9, 2024

dtfiedler added 3 commits September 9, 2024 16:48

chore(arns): remove standalone and memory cache resolvers

ecb7574

chore(prom): update arns resolution time metric name

379e027

chore(logs): add url to redis kv store

0041495

dtfiedler mentioned this pull request Sep 10, 2024

feat(redis): implement redis cache for arns name resolutions ar-io/arns-resolver#17

Closed

karlprieb reviewed Sep 10, 2024

View reviewed changes

karlprieb approved these changes Sep 10, 2024

View reviewed changes

dtfiedler added 2 commits September 10, 2024 07:45

fix(arns): remove await to cache, fix config var to resolver

f1bd0cf

karlprieb approved these changes Sep 10, 2024

View reviewed changes

djwhitt reviewed Sep 10, 2024

View reviewed changes

dtfiedler added 3 commits September 10, 2024 13:25

feat(arns): add circuit breaker for calling the IO network process

410cb5f

This should help avoid slamming AO when there are intermittent issues

fix(arns): update circuit breaker default timeouts

b8afa1c

dtfiedler requested a review from djwhitt September 10, 2024 20:05

djwhitt approved these changes Sep 10, 2024

View reviewed changes

djwhitt merged commit 5950f8e into develop Sep 10, 2024
5 checks passed

djwhitt deleted the PE-6695-redis-arns branch September 10, 2024 20:51

coderabbitai bot mentioned this pull request Sep 20, 2024

perf(arns): cache resolution for non-existent name #208

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(PE-6695/PE-6696): remove resolver, add redis support for arns cache #200

feat(PE-6695/PE-6696): remove resolver, add redis support for arns cache #200

dtfiedler commented Sep 9, 2024 •

edited

Loading

codecov bot commented Sep 9, 2024 •

edited

Loading

dtfiedler Sep 9, 2024

coderabbitai bot commented Sep 9, 2024 •

edited

Loading

Walkthrough

Changes

Sequence Diagram(s)

Possibly related PRs

Chat

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

CodeRabbit Configuration File (`.coderabbit.yaml`)

Documentation and Community

karlprieb Sep 10, 2024

dtfiedler Sep 10, 2024 •

edited

Loading

karlprieb Sep 10, 2024

djwhitt left a comment

djwhitt Sep 10, 2024

dtfiedler Sep 10, 2024

dtfiedler Sep 10, 2024 •

edited

Loading

dtfiedler commented Sep 10, 2024

dtfiedler commented Sep 10, 2024

feat(PE-6695/PE-6696): remove resolver, add redis support for arns cache #200

feat(PE-6695/PE-6696): remove resolver, add redis support for arns cache #200

Conversation

dtfiedler commented Sep 9, 2024 • edited Loading

Logs

Metrics

codecov bot commented Sep 9, 2024 • edited Loading

Codecov Report

dtfiedler Sep 9, 2024

Choose a reason for hiding this comment

coderabbitai bot commented Sep 9, 2024 • edited Loading

Walkthrough

Changes

Sequence Diagram(s)

Possibly related PRs

Chat

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

CodeRabbit Configuration File (.coderabbit.yaml)

Documentation and Community

karlprieb Sep 10, 2024

Choose a reason for hiding this comment

dtfiedler Sep 10, 2024 • edited Loading

Choose a reason for hiding this comment

Test: 100 concurrent requests for an arns name

Before:

After

karlprieb Sep 10, 2024

Choose a reason for hiding this comment

djwhitt left a comment

Choose a reason for hiding this comment

djwhitt Sep 10, 2024

Choose a reason for hiding this comment

dtfiedler Sep 10, 2024

Choose a reason for hiding this comment

dtfiedler Sep 10, 2024 • edited Loading

Choose a reason for hiding this comment

dtfiedler commented Sep 10, 2024

dtfiedler commented Sep 10, 2024

dtfiedler commented Sep 9, 2024 •

edited

Loading

codecov bot commented Sep 9, 2024 •

edited

Loading

coderabbitai bot commented Sep 9, 2024 •

edited

Loading

CodeRabbit Configuration File (`.coderabbit.yaml`)

dtfiedler Sep 10, 2024 •

edited

Loading

dtfiedler Sep 10, 2024 •

edited

Loading