-
Notifications
You must be signed in to change notification settings - Fork 856
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve timer #5672
base: main
Are you sure you want to change the base?
Improve timer #5672
Conversation
3169a13
to
86daea4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this! Overall a really good change. I left some comments inline.
tests/unittests/config/test_cc_package_update_upgrade_install.py
Outdated
Show resolved
Hide resolved
tests/unittests/config/test_cc_package_update_upgrade_install.py
Outdated
Show resolved
Hide resolved
assert should_auto_attach_value == _should_auto_attach(ua_section) | ||
assert ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm...I think this highlights a gap in functionality where we might wind up losing valuable logging messages that no longer take as much time to execute. Looking at the Pro module now, there's now no log anywhere to indicate that we're attempting to attach to Pro unless it is taking too long, and I'm assuming that's not what we want. I'm not sure I have a great solution other than looking through all of the changed call sites and using a threshold of 0 for those we think we want to log regardless of the time taken.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. I reworked the context manager to support a parameter called log_mode
, with three options: "always", "threshold", and "skip" (default is "threshold"). I audited the modified callsites and set "always" to those that I think need it. Let me know if there are any others that you think should also be modified.
dfa55df
to
446fee4
Compare
The logic in Timed() has defaults which are designed to avoid low-value log messages. A log is only emitted when the threshold is exceeded. Additionally, the log_mode parameter allows the caller to prevent Timed() logs, so that the caller may coalesce the message into another log. - Make timer helper into a context manager. - Add new decorator for timing functions. - Simplify timer helper function, eliminate obsolete logic. - Eliminate redundant logs and timer calls.
refactors existing code to utilize timer codepaths: - replace manual timer implementations in subp.py and sockets.py - replace open() / read() calls with util.load_text_file() where appropriate
446fee4
to
aeec429
Compare
@TheRealFalcon thanks for the review. I think that I've addressed all of your comments. Ready for re-review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left a few more comments, but everything is mostly minor.
I really like the new decorator.
@@ -1273,15 +1277,15 @@ def sub_main(args): | |||
report_on = True | |||
if name == "init": | |||
if args.local: | |||
rname, rdesc = ("init-local", "searching for local datasources") | |||
rname, rdesc = ("local", "searching for local datasources") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the impetus for these next 3 changes? Their replacements are less greppable, the first log message of each stage still uses the original name, and we have other tons of references to the original names all over the code base. I would prefer to keep the originals.
args=(name, args), | ||
) | ||
with performance.Timed(f"cloud-init stage: '{rname}'"): | ||
retval = functor(name, args) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
get_uptime
was kind of an odd kwarg to have in the old function, and it looks like here is the only place it was used. I'm not sure if/how it was useful, but do you think it makes sense to throw an uptime log as early as possible in init-local as a replacement?
msg="backgrounded Resizing", | ||
func=do_resize, | ||
args=(resize_cmd,), | ||
do_resize, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the removal here intentional?
def __init__( | ||
self, | ||
msg: str, | ||
/, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a good reason to enforce this?
run_name, mod.handle, func_args, freq=freq | ||
) | ||
|
||
with performance.Timed("and", log_mode="skip") as timer: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This still feels a bit hard to understand at first. What do you think about having this message be "" and then later f"{run_name} ran successfully and took {timer.delta:.3f} seconds"
?
Not a blocking comment if you prefer it as-is.
# open filename in mode 'omode', write content, set permissions to 'mode' | ||
"""open filename in mode omode, write content, set permissions to mode""" | ||
|
||
with performance.Timed(f"Writing {filename}"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can the decorator be used here now instead?
default_flow_style=False, | ||
Dumper=(NoAliasSafeDumper if noalias else yaml.dumper.SafeDumper), | ||
) | ||
with performance.Timed("Dumping yaml"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can the new decorator work here?
@@ -475,6 +483,7 @@ def clear_cached_attrs(self, attr_defaults=()): | |||
if not attr_defaults: | |||
self._dirty_cache = False | |||
|
|||
@performance.timed("Getting metadata") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should probably log always
/, | ||
*, | ||
threshold: float = 0.01, | ||
log_mode: str = "threshold", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't love this parameter being a magic string, especially when it only takes two values. Is there a reason setting threshold to 0 isn't a valid way to signal to log always? If so, could this field be a boolean like always_log: bool = False
or something similar?
Note to reviewers
This looks big, but it is broken into three logically separate changes. I would recommend reviewing a commit at a time.
Motivation
The recent addition of subp execution time was a huge benefit to exposing slow external commands. As we try to push performance more, more data is vital to making the biggest impact that we can. Comparing this branch against the tip of main, this change makes the total log size 1 line longer - but this will vary depending per-platform.
cloud-init analyze
is neat, but it is low granularity, relies log timestamps (so is only an approximation), and it requires extra stepsDescription of changes
Todo
Merge type