fix: harden storage semantics #4118

ashwinb · 2025-11-10T22:00:17Z

Fixes issues in the storage system by guaranteeing immediate durability for responses and ensuring background writers stay alive. Three related fixes:

* Responses to the OpenAI-compatible API now write directly to Postgres/SQLite inside the request instead of detouring through an async queue that might never drain; this restores the expected read-after-write behavior and removes the "response not found" races reported by users.
* The access-control shim was stamping `owner_principal`/`access_attributes` as SQL NULL, which Postgres interprets as non-public rows; fixing it to use the empty-string/JSON-null pattern means conversations and responses stored without an authenticated user stay queryable (matching SQLite).
* The inference-store queue remains for batching, but its worker tasks now start lazily on the live event loop so server startup doesn't cancel them; writes keep flowing even when the stack is launched via `llama stack run`.
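The lazy-start pattern described in the third fix can be sketched roughly as follows. This is a minimal illustration, not the code in `inference_store.py`: the class `LazyWriteQueue`, its methods, and the in-memory `written` list standing in for the database are all hypothetical names chosen for the example. The point it shows is that worker tasks are created on first use from inside a running coroutine, so they are bound to the loop that actually serves requests rather than to a loop that may be torn down during startup.

```python
import asyncio


class LazyWriteQueue:
    """Hypothetical batching queue whose workers start on first use."""

    def __init__(self, num_workers=2):
        self._num_workers = num_workers
        self._queue = None      # created lazily, on the live loop
        self._workers = []      # worker tasks, also created lazily
        self.written = []       # stands in for the database (assumption)

    def _ensure_workers(self):
        # Only called from coroutines, so a loop is guaranteed to be running.
        # The queue and tasks are therefore tied to the serving loop.
        if self._queue is None:
            self._queue = asyncio.Queue()
        if not self._workers:
            loop = asyncio.get_running_loop()
            self._workers = [
                loop.create_task(self._worker())
                for _ in range(self._num_workers)
            ]

    async def _worker(self):
        while True:
            item = await self._queue.get()
            try:
                self.written.append(item)  # a real store would INSERT here
            finally:
                self._queue.task_done()

    async def enqueue(self, item):
        self._ensure_workers()
        await self._queue.put(item)

    async def flush(self):
        # Wait until every queued item has been processed.
        if self._queue is not None:
            await self._queue.join()

    async def shutdown(self):
        await self.flush()
        for task in self._workers:
            task.cancel()
        await asyncio.gather(*self._workers, return_exceptions=True)
        self._workers.clear()


async def demo():
    q = LazyWriteQueue()
    await q.enqueue("a")
    await q.enqueue("b")
    await q.shutdown()
    return sorted(q.written)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Nothing in the constructor touches the event loop, so the object can be built during configuration or startup and still have its workers attached to whichever loop first calls `enqueue`.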

Closes #4115

Test Plan

Added a matrix entry to test our "base" suite against Postgres as the store.

ashwinb · 2025-11-10T23:38:11Z

src/llama_stack/distributions/starter/starter.py

-                                responses_store=postgres_config,
-                            ),
+                            config=MetaReferenceAgentsImplConfig(
+                                persistence=AgentPersistenceConfig(


important bug fix, unrelated to this one but really the postgres definition was all kinds of wrong

src/llama_stack/providers/utils/inference/inference_store.py

src/llama_stack/providers/utils/responses/responses_store.py

ashwinb · 2025-11-12T18:40:46Z

@Mergifyio backport release-0.3.x

mergify · 2025-11-12T18:40:54Z

backport release-0.3.x

✅ Backports have been created

#4138 fix: harden storage semantics (backport #4118) has been created for branch release-0.3.x but encountered conflicts

Fixes issues in the storage system by guaranteeing immediate durability for responses and ensuring background writers stay alive. Three related fixes:

* Responses to the OpenAI-compatible API now write directly to Postgres/SQLite inside the request instead of detouring through an async queue that might never drain; this restores the expected read-after-write behavior and removes the "response not found" races reported by users.
* The access-control shim was stamping owner_principal/access_attributes as SQL NULL, which Postgres interprets as non-public rows; fixing it to use the empty-string/JSON-null pattern means conversations and responses stored without an authenticated user stay queryable (matching SQLite).
* The inference-store queue remains for batching, but its worker tasks now start lazily on the live event loop so server startup doesn't cancel them; writes keep flowing even when the stack is launched via llama stack run.

Closes #4115

### Test Plan

Added a matrix entry to test our "base" suite against Postgres as the store.

(cherry picked from commit 492f79c)

Conflicts:

* .github/workflows/integration-tests.yml
* llama_stack/distributions/ci-tests/run-with-postgres-store.yaml
* llama_stack/distributions/starter-gpu/run.yaml
* llama_stack/distributions/starter/run.yaml
* llama_stack/distributions/starter/starter.py
* llama_stack/providers/utils/inference/inference_store.py
* llama_stack/providers/utils/responses/responses_store.py
* tests/integration/ci_matrix.json

ashwinb requested review from bbrowning, ehhuang, franciscojavierarceo, hardikjshah, leseb, mattf, raghotham, reluctantfuturist, slekkala1, terrytangyuan and yanxi0830 as code owners November 10, 2025 22:00

meta-cla bot added the CLA Signed label Nov 10, 2025

ashwinb added 7 commits November 10, 2025 14:21

add a workflow to test against postgres

680f44f

clearer workflow file

93b9c65

update distro

a686beb

rename

14372dd

better

46715e7

fix

5fdedff

update distro definition for starter wow

7bd4c09

ashwinb commented Nov 10, 2025

ashwinb added 2 commits November 10, 2025 15:42

update fixture

e6de865

more fixes to postgres-store run yaml ugh

7ce0c5c

ehhuang approved these changes Nov 11, 2025

src/llama_stack/providers/utils/inference/inference_store.py (resolved)

src/llama_stack/providers/utils/inference/inference_store.py (outdated, resolved)

src/llama_stack/providers/utils/responses/responses_store.py (outdated, resolved)

leseb approved these changes Nov 12, 2025

ashwinb added 2 commits November 12, 2025 10:17

Merge remote-tracking branch 'origin/main' into storage_fix

08024d4

killed unnecessary logs

531e003

ashwinb merged commit 492f79c into main Nov 12, 2025
58 checks passed

ashwinb deleted the storage_fix branch November 12, 2025 18:35

mergify bot mentioned this pull request Nov 12, 2025

fix: harden storage semantics (backport #4118) #4138

Open
