[RHELC-1124] Refactor applock to have a more robust API. #979
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@           Coverage Diff           @@
##             main     #979   +/-   ##
=======================================
  Coverage   95.54%   95.54%
=======================================
  Files          53       53
  Lines        4642     4648    +6
  Branches      815      816    +1
=======================================
+ Hits         4435     4441    +6
- Misses        129      132    +3
+ Partials       78       75    -3
convert2rhel/applock.py
Outdated
:returns: the file contents as an integer, or None if it doesn't exist
:raises: ApplicationLockedError if the contents are corrupt
"""
if os.path.exists(self._pidfile):
Note: To avoid races, it's better to do a try: except: around the open() rather than check for file existence first.
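A minimal sketch of the try/except approach suggested above. The function name `read_pidfile` and its return convention are assumptions for illustration, not the PR's actual code, and corrupt contents are not handled here.

```python
import errno


def read_pidfile(path):
    """Return the PID stored in `path` as an int, or None if the file is missing.

    Hypothetical helper for illustration only.
    """
    try:
        # Open directly instead of calling os.path.exists() first: the file
        # could vanish between the existence check and the open().
        with open(path, "r") as filep:
            contents = filep.read()
    except (IOError, OSError) as exc:
        # ENOENT means the file does not exist; anything else is unexpected
        # and is passed on to the caller.
        if exc.errno == errno.ENOENT:
            return None
        raise
    return int(contents.strip())
```

Because the open itself is the existence test, there is no window between "check" and "use" for another process to slip into.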
convert2rhel/applock.py
Outdated
# Our process ID
self._pid = os.getpid()
- # Our process ID
- self._pid = os.getpid()
+ # Our process ID. We save this when the lock is created so it will be consistent even if we check from inside a fork.
+ self._pid = os.getpid()
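To illustrate why the saved value matters, here is a small sketch (Unix only). `PidHolder` and `pid_seen_in_child` are hypothetical names, not part of convert2rhel; the point is only that a PID captured at construction time stays the parent's PID inside a forked child, while `os.getpid()` does not.

```python
import os


class PidHolder(object):
    """Hypothetical stand-in for an object that records its creator's PID."""

    def __init__(self):
        # Captured once, when the object is created.
        self.pid = os.getpid()


def pid_seen_in_child(holder):
    """Fork, and return (saved pid, os.getpid()) as observed in the child."""
    read_fd, write_fd = os.pipe()
    child = os.fork()
    if child == 0:
        # Child process: report both values through the pipe, then exit
        # without running any of the parent's cleanup handlers.
        try:
            os.close(read_fd)
            message = "%d %d" % (holder.pid, os.getpid())
            os.write(write_fd, message.encode("ascii"))
        finally:
            os._exit(0)
    # Parent process: read the child's report and reap the child.
    os.close(write_fd)
    data = os.read(read_fd, 64).decode("ascii")
    os.close(read_fd)
    os.waitpid(child, 0)
    saved, current = data.split()
    return int(saved), int(current)
```

In the child, the saved value still identifies the process that created the holder, whereas a fresh `os.getpid()` call returns the child's own PID.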
convert2rhel/applock.py
Outdated
pid = self._read_pidfile()
if pid == self._pid:
    return
if self._pid_exists(pid):
    raise ApplicationLockedError("%s locked by process %d" % (self._pidfile, pid))
# The lock file was created by a process that has exited;
# remove it and try again.
loggerinst.info("Cleaning up lock held by exited process %d." % pid)
os.unlink(self._pidfile)
There is a race condition here too, where something else may have already unlinked the file, so we need to catch the exception.
Also... I think the race could also mean that self._pidfile might not contain the same pid as we checked with _pid_exists (meaning that between us reading the pid and unlinking the file, another process has unlinked the original and created its new one in its place). Is there some way to fix that with some combination of counting links to the lockfile, adding one new link to the file, and checking whether the count of links matches how many we think it should be at each step of the way? (And if it doesn't match, having some sort of time.sleep(random) before retrying?) (I'm not sure if there is a way or not. I think there might be, but a quick internet search didn't turn up a recipe, so I might be overly optimistic. If not, it's probably better to err on the side of telling the user to check whether the lock is stale rather than overwriting the lock.)
@jochapma there are 3 merge commits in the PR. If you can, drop them and rebase on main instead.
f6cd858 to bbbf692
bbbf692 to 431844e
Overall this seems fine. Some exceptions are being re-raised, but not as the same exception. There is also a file that is opened and never closed, which could be an issue.
# In Python 3 this could be changed to FileNotFoundError.
if exc.errno == errno.ENOENT:
    return None
raise
Same here: we should be raising the exception again.
We have to leave the file open and close it later explicitly because the file lock is tied to the file descriptor. The file lock is required to plug a race condition in which the pid file is overwritten by another process between the point where we read its contents and unlink it.
The bare raise statement reraises the current exception; @abadger requested that I use this idiom.
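For readers unfamiliar with the idiom, a short sketch contrasting a bare `raise` with raising a replacement exception. The function names and `WrappedError` are hypothetical and exist only for this comparison.

```python
import errno


class WrappedError(Exception):
    """Hypothetical wrapper type used only for this comparison."""


def reraise_current():
    try:
        raise OSError(errno.EACCES, "permission denied")
    except OSError:
        # Bare `raise`: the exception currently being handled propagates
        # unchanged, keeping its type, errno and traceback.
        raise


def raise_replacement():
    try:
        raise OSError(errno.EACCES, "permission denied")
    except OSError as exc:
        # Raising a new exception changes the type the caller sees.
        raise WrappedError(str(exc))
```

A caller of `reraise_current()` can still inspect `exc.errno`, while a caller of `raise_replacement()` only sees `WrappedError`.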
convert2rhel/applock.py
Outdated
except OSError as exc:
    # In Python 3 this could be changed to FileNotFoundError.
    if exc.errno == errno.ENOENT:
        return
    raise
This isn't covered by tests
convert2rhel/applock.py
Outdated
"""Release the advisory file lock we hold on the PID file
and close the open file descriptor."""
if self._pidfile_fp:
    fcntl.flock(self._pidfile_fp, fcntl.LOCK_UN)
What does this do? Comments would be nice
Smaller comments; I will look at the tests next. The flags didn't seem intuitive, or like something we see every day, so having comments on what the flags do would be nice.
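As a rough illustration of what the `fcntl.flock` flags do (Unix only): `LOCK_EX` takes an exclusive advisory lock, `LOCK_UN` releases it, and the lock belongs to the open file rather than to the path. The helper names below are hypothetical, and `LOCK_NB` is added here as an assumption so the demonstration fails fast instead of blocking; the snippets in this PR use plain `LOCK_EX`.

```python
import errno
import fcntl
import os
import tempfile


def try_exclusive_lock(fileobj):
    """Try to take an exclusive advisory lock without blocking.

    Returns True on success, False if the lock is held through another
    open file. Hypothetical helper for illustration only.
    """
    try:
        # LOCK_EX: exclusive lock. LOCK_NB: fail at once instead of waiting.
        fcntl.flock(fileobj, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except (IOError, OSError) as exc:
        if exc.errno in (errno.EAGAIN, errno.EWOULDBLOCK, errno.EACCES):
            return False
        raise
    return True


def demo():
    fd, path = tempfile.mkstemp()
    os.close(fd)
    first = open(path, "r")
    second = open(path, "r")
    try:
        got_first = try_exclusive_lock(first)     # the lock is free
        blocked = not try_exclusive_lock(second)  # held through `first`
        # LOCK_UN releases the lock held through `first`.
        fcntl.flock(first, fcntl.LOCK_UN)
        got_second = try_exclusive_lock(second)   # free again
        return got_first, blocked, got_second
    finally:
        # Closing a file also drops any flock held through it.
        first.close()
        second.close()
        os.unlink(path)
```

Since closing the file releases the lock, any code that needs the lock to persist has to keep the file object open for as long as the lock is required.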
convert2rhel/applock.py
Outdated
with open(self._pidfile, "r") as filep:
    fcntl.flock(filep, fcntl.LOCK_EX)
    try:
        file_contents = filep.read()
        pid = int(file_contents.rstrip())
        if pid == self._pid:
            return
        if self._pid_exists(pid):
            raise ApplicationLockedError("%s locked by process %d" % (self._pidfile, pid))
        # The lock file was created by a process that has exited;
        # remove it and try again.
        loggerinst.info("Cleaning up lock held by exited process %d." % pid)
        self._safe_unlink()
Could we not reuse the logic of is_locked()? Otherwise we might change it in the future in one part and miss the other, etc.
I was unable to find a way to avoid a race condition in unlink without keeping a file pointer open across multiple methods, which none of my reviewers liked. So I removed the automatic cleanup; this also has the benefit of simplifying everything considerably, which is a good idea since this is my last week.
What this does is add another "can't happen" case (if the lock file somehow doesn't get cleaned up), but that can be easily remedied manually.
In the course of doing that, I addressed the is_locked() reuse.
Co-authored-by: Freya Gustavsson <[email protected]>
/packit build
[RHELC-1124] Refactor ApplicationLock to have a more consistent API
Jira Issues: RHELC-1124
Checklist
[RHELC-] is part of the PR title
Release Pending if relevant