Auth issue with extraction #1758

Open
woopla opened this issue Jan 5, 2025 · 6 comments

@woopla

woopla commented Jan 5, 2025

Describe the bug
When trying to run the extraction on already ingested documents (they all show up as pending in the Extraction column), the logs show a 403 error:

> docker compose -f compose.full.yaml --project-name r2r-full logs -f
...
r2r-1                   | 2025-01-05 02:13:27 - INFO - 172.19.0.1:38686 - "OPTIONS /v3/documents/7bb4bbe0-a6b2-52b8-b2cb-2b98bfbcb02a/extract HTTP/1.1" 200
r2r-1                   | 2025-01-05 02:13:27 - WARNING - 172.19.0.1:38686 - "POST /v3/documents/7bb4bbe0-a6b2-52b8-b2cb-2b98bfbcb02a/extract HTTP/1.1" 403

Unfortunately, that's all the info I'm getting.
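One way to get more detail than the access log offers is to replay the same POST outside the dashboard and print the response body, since a 403 often carries an error message there. The sketch below is only an illustration: the path comes from the log lines above, while the base URL, the bearer-token header, and the empty JSON body are assumptions about this deployment, and `build_extract_request` / `send_and_describe` are hypothetical helper names.

```python
import urllib.error
import urllib.request

# Assumption: the API is reachable on this host and port; adjust to your setup.
BASE_URL = "http://localhost:7272"


def build_extract_request(document_id, access_token, base_url=BASE_URL):
    """Build (but do not send) the POST seen in the logs.

    Assumption: the server accepts the token as a standard bearer header
    and an empty JSON object as the request body.
    """
    url = f"{base_url}/v3/documents/{document_id}/extract"
    return urllib.request.Request(
        url,
        data=b"{}",
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


def send_and_describe(request):
    """Send the request and return (status, body), including for 4xx/5xx."""
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            return response.status, response.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as err:
        # urllib raises on 403; the error object still exposes the body.
        return err.code, err.read().decode("utf-8", "replace")


# Example usage (needs a running server and a valid token for the logged-in user):
# request = build_extract_request("7bb4bbe0-a6b2-52b8-b2cb-2b98bfbcb02a", "<token>")
# print(send_and_describe(request))
```

Running it once with the token of the user I created and once with the superuser's token should show whether the two accounts get different response bodies.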

To Reproduce
Steps to reproduce the behavior:

  1. Start R2R using Docker with the local LLM option (I'm using Full here, but the same happened with Lite, which is why I tried Full).
  2. Log in
  3. Upload some documents, wait for the ingestion to finish
  4. Click on the Document Extraction icon on any of the documents
  5. Look at the logs and see the 403 response to the POST on /v3/documents/<doc ID>/extract

Expected behavior
Extraction goes through.

Additional context
I'm running a Git clone of the repo with:

docker build -t r2r/local-fix .
poetry run r2r serve --docker --full --config-name=full_local_llm --image=r2r/local-fix:latest

to bypass the issue mentioned in #1386

@NolanTrem
Collaborator

NolanTrem commented Jan 5, 2025

Can you confirm the auth settings have not changed in your config as well as who is marked as the owner of the documents?

@woopla
Author

woopla commented Jan 5, 2025

Hey @NolanTrem, I'm using the default config in py/core/configs/full_local_llm.toml for auth, and the user is an account I created with my email (I did not use the default user, I was confused by the UI :D ).

Logging in with the superuser account, I am able to start the extraction, no 403 \o/

But that brings more questions:

  1. Why is a default user able to import documents, but not run the extraction on them?
  2. Why isn't the extraction part running by default?

@NolanTrem
Collaborator

  1. I think there might be some confusion here—the superuser is the default user, so if a user that you created does not have access to a document, they will not be able to run extraction on it.
  2. Extraction can be an expensive process, and could overwhelm a production server with many users uploading files at the same time. By making extraction require an explicit start, we hope to give developers more control over the system rather than doing things automatically for them.
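The access rule in point 1 can be pictured with a small sketch. This is purely illustrative and not taken from the R2R codebase: `User`, `Document`, and `can_extract` are hypothetical names, and the rule shown (superuser, or owner of the document) is an assumption about how such a check might look.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    # Hypothetical user record; field names are illustrative only.
    id: str
    is_superuser: bool = False


@dataclass(frozen=True)
class Document:
    # Hypothetical document record with a single owner.
    id: str
    owner_id: str


def can_extract(user: User, document: Document) -> bool:
    """Assumed rule: superusers may extract anything, others only their own documents."""
    return user.is_superuser or document.owner_id == user.id


admin = User(id="admin", is_superuser=True)
alice = User(id="alice")
bob = User(id="bob")
doc = Document(id="doc-1", owner_id="alice")

print(can_extract(admin, doc))  # True
print(can_extract(alice, doc))  # True
print(can_extract(bob, doc))    # False
```

Under a rule like this, a request that a server rejects would typically be answered with a 403.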

@woopla
Author

woopla commented Jan 8, 2025

Thanks @NolanTrem, that makes sense on the second point.

For the first point, I did upload the documents with this non-default user. While I can press the extract button and get a UI notification that it's been queued, the logs show a 403.

@ArturTanona
Contributor

Same issue

@ArturTanona
Contributor

You have breaking changes with the dashboard
