Reduce job history dir length #561
base: mainline
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -11,6 +11,7 @@ | |
import datetime | ||
import glob | ||
import os | ||
import sys | ||
|
||
from ..config import get_setting | ||
from ._yaml import deadline_yaml_dump | ||
|
@@ -34,10 +35,6 @@ def create_job_history_bundle_dir(submitter_name: str, job_name: str) -> str: | |
char for char in submitter_name if char.isalnum() or char in " -_" | ||
) | ||
|
||
# Clean the job_name's characters and truncate for the filename | ||
job_name_cleaned = "".join(char for char in job_name if char.isalnum() or char in " -_") | ||
job_name_cleaned = job_name_cleaned[:128] | ||
|
||
timestamp = datetime.datetime.now() | ||
month_tag = timestamp.strftime("%Y-%m") | ||
date_tag = timestamp.strftime("%Y-%m-%d") | ||
|
@@ -53,8 +50,26 @@ def create_job_history_bundle_dir(submitter_name: str, job_name: str) -> str: | |
latest_dir = existing_dirs[-1] | ||
number = int(os.path.basename(latest_dir)[len(date_tag) + 1 :].split("-", 1)[0]) + 1 | ||
|
||
result = os.path.join( | ||
month_dir, f"{date_tag}-{number:02}-{submitter_name_cleaned}-{job_name_cleaned}" | ||
) | ||
job_dir_prefix = f"{date_tag}-{number:02}-{submitter_name_cleaned}-" | ||
|
||
max_job_name_prefix = 128 # max job name from OpenJD spec | ||
|
||
# max path length - manifest file name | ||
# 256 - len("\manifests\d2b2c3102af5a862db950a2e30255429_input") | ||
# = 207 | ||
if sys.platform in ["win32", "cygwin"]: | ||
max_job_name_prefix = min( | ||
207 - len(os.path.abspath(os.path.join(month_dir, job_dir_prefix))), max_job_name_prefix | ||
) | ||
Comment on lines +61 to +63
Have we considered what happens if this ends up being 0 or a negative number?
Reply: Added error handling for this case! |
||
if max_job_name_prefix < 1: | ||
raise RuntimeError( | ||
"Job history directory is too long. Please update your 'settings.job_history_dir' to a shorter path." | ||
) | ||
Comment on lines +64 to +67
Sorry for the delay. Unfortunately, because of the customer experience impact, we can't make this change. A valid Windows submission that works with long paths will now fail to submit, because we won't create the job history dir (which is passed to the create job bundle callback as the place to create the submission files). In my mind, there are 2 parts at play:
I think short-term we can make the second issue less likely and do the truncation, but only if it would fail otherwise. We can't have previously working setups failing because of this. If we can successfully create the history directory and a file whose path matches the longest path therein, then no truncation should happen. Longer-term I think we should absolutely remove the truncation and:
Let me know your thoughts. |
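One way to read the short-term suggestion above, as a minimal sketch and not the change that was merged: probe whether the untruncated job directory and the longest file path expected inside it can actually be created, and only truncate when that probe fails. The helper name `can_create_longest_path` and its parameters are hypothetical and do not exist in the library.

```python
import os


def can_create_longest_path(job_dir: str, longest_relative_path: str) -> bool:
    """Hypothetical probe: return True if job_dir and a file at the longest
    expected path inside it can be created on this machine."""
    probe = os.path.join(job_dir, longest_relative_path)
    try:
        # Create the directory chain, then an empty file at the longest path.
        os.makedirs(os.path.dirname(probe), exist_ok=True)
        with open(probe, "w"):
            pass
    except OSError:
        # Covers path-too-long failures as well as permission errors.
        return False
    # Remove the probe file; the directories are left in place for real use.
    os.remove(probe)
    return True
```

A caller could try the full job name first and fall back to a truncated name only when the probe returns False. A failed probe may leave partially created directories behind, which a real implementation would need to clean up.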
||
|
||
# Clean the job_name's characters and truncate for the filename | ||
job_name_cleaned = "".join(char for char in job_name if char.isalnum() or char in " -_") | ||
job_name_cleaned = job_name_cleaned[:max_job_name_prefix] | ||
|
||
result = os.path.join(month_dir, f"{job_dir_prefix}{job_name_cleaned}") | ||
joel-wong-aws marked this conversation as resolved.
|
||
os.makedirs(result) | ||
return result |
nit: MAX_PATH is actually 260: https://learn.microsoft.com/en-us/windows/win32/fileio/maximum-file-path-limitation?tabs=registry
While the official MAX_PATH is 260, there may be some DCC constraints that limit this to 256 characters, so I decided to go with that. This also gives us a few characters of wiggle room.
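To make the 256 versus 260 discussion concrete, here is a small sketch of the arithmetic behind the `207` constant in the diff. The function name `job_name_budget` and its parameters are hypothetical, introduced only for illustration; the manifest suffix is the one quoted in the diff comment.

```python
# Longest manifest path suffix, as quoted in the diff comment (49 characters).
MANIFEST_SUFFIX = r"\manifests\d2b2c3102af5a862db950a2e30255429_input"


def job_name_budget(dir_prefix: str, path_limit: int = 256, spec_max: int = 128) -> int:
    """Hypothetical helper: how many job-name characters fit after dir_prefix.

    path_limit is the assumed maximum path length (256 in the PR, 260 for
    the official MAX_PATH); spec_max is the 128-character job name cap.
    A result below 1 means the history directory itself is too long.
    """
    remaining = path_limit - len(MANIFEST_SUFFIX) - len(dir_prefix)
    return min(remaining, spec_max)
```

With a 100-character absolute prefix this leaves 107 characters under the 256 limit and 111 under 260, so the choice between the two limits only shifts the budget by four characters.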