Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enhance timeout check to avoid false positives during processing [of video download to IIAB] #170

Merged
merged 8 commits into from
Jun 4, 2024
27 changes: 12 additions & 15 deletions cps/tasks/download.py
Original file line number Diff line number Diff line change
Expand Up @@ -6,9 +6,9 @@
from datetime import datetime
from flask_babel import lazy_gettext as N_, gettext as _

from cps.constants import XKLB_DB_FILE
from cps.services.worker import CalibreTask, STAT_FINISH_SUCCESS, STAT_FAIL, STAT_STARTED, STAT_WAITING
from cps.subproc_wrapper import process_open
from ..constants import XKLB_DB_FILE
from ..services.worker import CalibreTask, STAT_FINISH_SUCCESS, STAT_FAIL, STAT_STARTED, STAT_WAITING
from ..subproc_wrapper import process_open
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Absolute imports are usually preferred because relative imports (..) can lead to confusion in module resolution.

Does an absolute path starting from the package name work?

Suggested change
from ..constants import XKLB_DB_FILE
from ..services.worker import CalibreTask, STAT_FINISH_SUCCESS, STAT_FAIL, STAT_STARTED, STAT_WAITING
from ..subproc_wrapper import process_open
from calibreweb.cps.constants import XKLB_DB_FILE
from calibreweb.cps.services.worker import CalibreTask, STAT_FINISH_SUCCESS, STAT_FAIL, STAT_STARTED, STAT_WAITING
from calibreweb.cps.subproc_wrapper import process_open

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Absolute paths work. I was not comfortable seeing them flagged as unresolved in vscode. Thanks.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to remove the package name from absolute imports. It will work starting with cps.

from .. import logger
from time import sleep

Expand Down Expand Up @@ -51,16 +51,16 @@ def run(self, worker_thread):

complete_progress_cycle = 0

last_progress_time = datetime.now()
fragment_stuck_timeout = 30 # seconds
fragment_stuck_time = 0

while p.poll() is None:
# Check if there's data available to read
rlist, _, _ = select.select([p.stdout], [], [], 0.1)
if rlist:
line = p.stdout.readline()
if line:
if re.search(pattern_success, line):
if re.search(pattern_success, line) or complete_progress_cycle == 4:
# 2024-01-10: 99% (a bit arbitrary) is explained here...
# https://github.com/iiab/calibre-web/pull/88#issuecomment-1885916421
self.progress = 0.99
Expand All @@ -73,17 +73,14 @@ def run(self, worker_thread):
self.progress = min(0.99, (complete_progress_cycle + (percentage / 100)) / 4)
if percentage == 100:
complete_progress_cycle += 1
if complete_progress_cycle == 4:
break
last_progress_time = datetime.now()
else:
fragment_stuck_time += 0.1
if fragment_stuck_time >= fragment_stuck_timeout:
log.error("Download appears to be stuck.")
self.record_error_in_database("Download appears to be stuck.")
raise ValueError("Download appears to be stuck.")
elapsed_time = (datetime.now() - last_progress_time).total_seconds()
if elapsed_time >= fragment_stuck_timeout:
self.message = f"Downloading {self.media_url_link}... (This is taking longer than expected)"

sleep(0.1)

p.wait()

# Database operations
Expand Down Expand Up @@ -118,7 +115,7 @@ def run(self, worker_thread):
else:
log.error("Failed to send the requested file to %s", self.original_url)
self.message = f"{self.media_url_link} failed to download: {response.status_code} {response.reason}"

conn.close()

except Exception as e:
Expand All @@ -132,7 +129,7 @@ def run(self, worker_thread):
self.stat = STAT_FINISH_SUCCESS
log.info("Download task for %s completed successfully", self.media_url)
else:
self.end_time = datetime.now()
self.end_time = datetime.now()
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If end_time needs to be set on both branches (lines 128 and 132), do it just once above the if.

By the way, this class sets end_time in multiple places (e.g. line 26 and 72). If the goal is to record the end time after the task completes, can these lines be eliminated?

Note: in case end_time ends up being unset, CalibreTask.runtime() will fail. The finally should catch all exit paths from the while loop, except when self.media_url is unset (which skips the entire thing).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The goal is to record the run time dynamically per
https://github.com/iiab/calibre-web/blob/eb8302387b75714c9ab8addafac5df8af74e93c8/cps/services/worker.py#L232C5-L235C1
I'll remove the redundancy for 128 and 132, but if I remove 26 and 72, I will lose the dynamic run time messages. I'll move 72 to be more efficient.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in de4fdce

self.stat = STAT_FAIL

else:
Expand Down