Sharelink file listings on web times out #124

Open

aputtu opened this issue Sep 25, 2024 · 0 comments
Labels: bug (Something isn't working)
aputtu commented Sep 25, 2024

Issue

Users experience that file listings hang when they request them.
In the worst case the listing simply times out.

Desired solution

As a user I want (1) fast listing of at least some files, and (2) file listings that do not time out, even when there is a very, very large number of files.
For the migrid instance we want correct, efficient and well-performing file listings.
In the code base we ideally want tests and benchmarks on very large file listings.
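As a starting point, a benchmark of the kind meant above could look roughly like the sketch below, assuming a plain pytest setup; the file count, time budget and helper name are illustrative assumptions, not part of the migrid code base.

```python
# Hypothetical pytest benchmark for listing a directory with many entries.
# File count, time budget and helper name are illustrative only.
import os
import time


def _make_many_files(path, count=50_000):
    """Create `count` empty files under `path` for the benchmark."""
    for i in range(count):
        open(os.path.join(path, "file_%06d.txt" % i), "w").close()


def test_listing_many_files_is_fast(tmp_path):
    _make_many_files(str(tmp_path))
    start = time.monotonic()
    entries = list(os.scandir(str(tmp_path)))
    elapsed = time.monotonic() - start
    assert len(entries) == 50_000
    # Arbitrary budget; tune to what the web UI and CGI timeout can tolerate.
    assert elapsed < 5.0
```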

Assumed reasons for the issue

Our assumption is that this happens in sharelink listings when there are a lot of links.
In internal ticket #33432 we got a report from a user that ls was also hanging when run on a network drive.

From Apache logs (ssl-sid-error.log) we get warnings and errors like:

[cgi-warn] <...> Timeout waiting for output from CGI script /home/mig/mig/cgi-sid/ls.py
[cgi-error] <...> Script timed out before returning headers: ls.py

The above were triggered by ls.py when a user opened a sharelink on the web:

<...> "GET /cgi-sid/ls.py?share_id=abcabcAb <...> HTTP/1.1" 504 ..

The assumed reason is that the 300 s timeout occurs before the list to be displayed has been built.
The performance of the file system is part of the problem.

Suggestions for fix

  • Reuse the file manager we use in Files and in the file chooser on createfreeze.
  • The task involves (1) server-side implementation of pagination in the Python code and (2) front-end pagination with AJAX, so that we load smaller chunks to reduce initial load time and prevent timeouts (see the sketch after this list).
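A rough sketch of the server-side half of such pagination is shown below. It assumes we can pass offset/limit parameters through to the listing code; the function and parameter names are made up for illustration and are not the existing ls.py interface.

```python
# Hypothetical server-side pagination helper; not the actual ls.py code.
import os
from itertools import islice


def list_page(path, offset=0, limit=500):
    """Return one page of directory entries plus a flag for further pages.

    os.scandir() yields entries lazily, so the full list is never built in
    memory and the first page can be answered quickly even for huge
    directories. Note that entries come back in file system order; sorting
    would require scanning everything first.
    """
    with os.scandir(path) as it:
        page = [entry.name for entry in islice(it, offset, offset + limit)]
        has_more = next(it, None) is not None
    return {"entries": page, "offset": offset, "limit": limit,
            "has_more": has_more}
```

The AJAX front end would then request successive pages with an increasing offset until has_more is false, so the first chunk renders well before the CGI timeout.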

AI tips, for what they are worth

  • WebSocket for real-time updates: Use WebSockets to establish a persistent connection between the client and server. This allows you to stream file listings in real-time as they're processed.
  • Background processing with task queue: Implement a task queue system (like Celery for Python) to process file listings in the background; the web interface can then poll for results or use WebSockets to get updates (a sketch follows after this list).
  • Caching: Implement caching mechanisms to store file listings for a certain period. This can significantly reduce load times for frequently accessed directories.
  • Optimize the file listing process: Instead of using os.listdir(), consider using more efficient methods like os.scandir() for large directories (see the sketch after this list). You can also implement multi-threading to process large directories faster.
  • Progressive loading with virtual scrolling: Implement virtual scrolling in your front-end, where only the visible items are rendered. This can handle extremely large lists efficiently.
  • Optimize the file system itself through various strategies.
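To make the os.scandir() and caching tips concrete, here is a minimal sketch combining the two; the cache TTL, module-level cache and function name are illustrative assumptions, not existing migrid code.

```python
# Hypothetical cached listing; names and TTL are illustrative only.
import os
import time

_CACHE = {}          # path -> (timestamp, entries)
_CACHE_TTL = 60.0    # seconds to reuse a listing before rescanning


def cached_listing(path):
    """List `path` with os.scandir(), reusing a recent result if available."""
    now = time.monotonic()
    cached = _CACHE.get(path)
    if cached and now - cached[0] < _CACHE_TTL:
        return cached[1]
    with os.scandir(path) as it:
        # entry.is_dir() typically reuses the type information scandir
        # already fetched, avoiding an extra stat() call per entry.
        entries = [(entry.name, entry.is_dir()) for entry in it]
    _CACHE[path] = (now, entries)
    return entries
```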
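The task queue tip with Celery could look roughly like this; the broker URL, result backend and task name are assumptions for illustration, not existing migrid configuration.

```python
# Hypothetical Celery task for building listings in the background.
import os

from celery import Celery

app = Celery("listing_tasks",
             broker="redis://localhost:6379/0",
             backend="redis://localhost:6379/1")


@app.task
def build_listing(path):
    """Scan the shared directory in the background and return its entries."""
    with os.scandir(path) as it:
        return sorted(entry.name for entry in it)
```

The web handler would then call build_listing.delay(share_path) and either poll the returned AsyncResult or push the result to the browser over a WebSocket once it is ready.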