Auth issue with extraction #1758

woopla · 2025-01-05T02:20:56Z

Describe the bug
When trying to run the extraction on already ingested documents (they all show up as pending for the Extraction column), the logs show a 403 error:

> docker compose -f compose.full.yaml --project-name r2r-full logs -f
...
r2r-1                   | 2025-01-05 02:13:27 - INFO - 172.19.0.1:38686 - "OPTIONS /v3/documents/7bb4bbe0-a6b2-52b8-b2cb-2b98bfbcb02a/extract HTTP/1.1" 200
r2r-1                   | 2025-01-05 02:13:27 - WARNING - 172.19.0.1:38686 - "POST /v3/documents/7bb4bbe0-a6b2-52b8-b2cb-2b98bfbcb02a/extract HTTP/1.1" 403

Unfortunately, that's all the info I'm getting.
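
Since the container logs only expose the status code, a small filter can at least pull the failing requests out of a longer log capture. This is a minimal sketch: the regular expression is inferred from the two access-log lines quoted above, and `failed_requests` is a hypothetical helper, not part of R2R.

```python
import re

# Pattern inferred from the two access-log lines above (an assumption about
# the general log format): "<METHOD> <path> HTTP/<version>" <status>
ACCESS_LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) HTTP/[\d.]+" (?P<status>\d{3})\s*$'
)


def failed_requests(log_text):
    """Return (method, path, status) for every logged request with status >= 400."""
    failures = []
    for line in log_text.splitlines():
        match = ACCESS_LINE.search(line)
        if match and int(match["status"]) >= 400:
            failures.append((match["method"], match["path"], int(match["status"])))
    return failures


sample = (
    'r2r-1 | 2025-01-05 02:13:27 - INFO - 172.19.0.1:38686 - '
    '"OPTIONS /v3/documents/7bb4bbe0-a6b2-52b8-b2cb-2b98bfbcb02a/extract HTTP/1.1" 200\n'
    'r2r-1 | 2025-01-05 02:13:27 - WARNING - 172.19.0.1:38686 - '
    '"POST /v3/documents/7bb4bbe0-a6b2-52b8-b2cb-2b98bfbcb02a/extract HTTP/1.1" 403\n'
)

# Only the POST is reported; the OPTIONS preflight succeeded with 200.
print(failed_requests(sample))
```

The output of the `docker compose ... logs` command can be piped into a file and passed to the same function.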

To Reproduce
Steps to reproduce the behavior:

1. Start R2R using Docker with the local LLM option (I'm using Full here, but the same happened with Lite, which is why I tried Full).
2. Log in.
3. Upload some documents and wait for the ingestion to finish.
4. Click on the Document Extraction icon on any of the documents.
5. Look at the logs and see the 403 on the POST to /v3/documents/<doc ID>/extract.
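
To get more than a status code, the failing call from the steps above can be replayed outside the dashboard so the response body is visible. This is a rough sketch using only the Python standard library. The path comes from the log lines, but the base URL, the Bearer-token header, and the empty JSON body are assumptions about the deployment, and both function names are hypothetical.

```python
import urllib.error
import urllib.request

BASE_URL = "http://localhost:7272"  # assumption: adjust to where R2R is reachable


def build_extract_request(document_id, access_token, base_url=BASE_URL):
    """Build (but do not send) a POST to the extract route seen in the logs."""
    return urllib.request.Request(
        url=f"{base_url}/v3/documents/{document_id}/extract",
        method="POST",
        data=b"{}",  # assumption: an empty JSON body is acceptable
        headers={
            "Authorization": f"Bearer {access_token}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
    )


def send_and_describe(request):
    """Send the request and return (status, body), including for 4xx/5xx replies."""
    try:
        with urllib.request.urlopen(request) as response:
            return response.status, response.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as err:
        # urlopen raises on 403; the error object still carries the body.
        return err.code, err.read().decode("utf-8", "replace")


# Usage (needs a running server and a valid token for the logged-in user):
# status, body = send_and_describe(build_extract_request("<doc ID>", "<token>"))
# print(status, body)
```

If the server includes a detail message with the 403, it would show up in `body` and might indicate whether the refusal is about ownership or about role.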

Expected behavior
Extraction goes through.

Additional context
I'm running a Git clone of the repo with:

docker build -t r2r/local-fix .
poetry run r2r serve --docker --full --config-name=full_local_llm --image=r2r/local-fix:latest

to bypass the issue mentioned in #1386


NolanTrem · 2025-01-05T02:26:16Z

Can you confirm the auth settings have not changed in your config as well as who is marked as the owner of the documents?

woopla · 2025-01-05T02:50:15Z

Hey @NolanTrem, I'm using the default config in py/core/configs/full_local_llm.toml for auth, and the user is an account I created with my email (I did not use the default user; I was confused by the UI :D).

Logging in with the superuser account, I am able to start the extraction, no 403 \o/

But that brings more questions:

Why is a default user able to import documents, but not run the extraction on them?
Why isn't the extraction part running by default?

NolanTrem · 2025-01-07T21:45:01Z

I think there might be some confusion here—the superuser is the default user, so if a user that you created does not have access to a document, they will not be able to run extraction on it.
Extraction can be an expensive process, and could overwhelm a production server with many users uploading files at the same time. By making extraction require an explicit start, we hope to give developers more control over the system rather than doing things automatically for them.
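
For illustration, the access rule described in this reply can be written as a small check. This is only a sketch of the rule as stated here (superusers may always extract, other users only on documents they have access to, modeled below as ownership). The `User`, `Document`, and `can_extract` names are hypothetical and are not R2R's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class User:
    id: str
    is_superuser: bool = False


@dataclass
class Document:
    id: str
    owner_id: str


def can_extract(user: User, document: Document) -> bool:
    """Hypothetical rule: superusers always pass; others must own the document."""
    if user.is_superuser:
        return True
    return document.owner_id == user.id


doc = Document(id="doc-1", owner_id="alice")
print(can_extract(User(id="alice"), doc))                    # owner
print(can_extract(User(id="bob"), doc))                      # unrelated user
print(can_extract(User(id="admin", is_superuser=True), doc)) # superuser
```

Under this rule, a non-superuser who uploaded a document would be expected to pass the check, so a 403 for the uploading user would point at something other than ownership.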

woopla · 2025-01-08T22:11:19Z

Thanks @NolanTrem, that makes sense on the second point.

For the first point, I did upload the documents with this non-default user. While I can press the extract button and get a UI notification that it's been queued, the logs show a 403.

ArturTanona · 2025-01-10T13:22:11Z

Same issue

ArturTanona · 2025-01-10T13:22:22Z

You have breaking changes with the dashboard.
