Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Changing logging levels #222

Open
wants to merge 10 commits into
base: main
Choose a base branch
from
Open

Changing logging levels #222

wants to merge 10 commits into from

Conversation

zhenghh04
Copy link
Member

@zhenghh04 zhenghh04 commented Aug 27, 2024

In this PR, we changed the per step output from info to debug to reduce the logging overhead.

We also add support for changing logging level

Copy link
Collaborator

@hariharan-devarajan hariharan-devarajan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should add error logging as an option for pure performance mode.
For the final per metric we should use a different logging, which outputs PRINT messages to STDOUT and redirects to the dlio.log file.

The other logging is optional from the benchmark.

dlio_benchmark/common/enumerations.py Outdated Show resolved Hide resolved
@hariharan-devarajan
Copy link
Collaborator

@zhenghh04 Can we not print the file name and line number for info logging. We should only do that for debug logging. This will significantly reduce log size.

@zhenghh04
Copy link
Member Author

@hariharan-devarajan , made the changes as you suggested. Please review it again.

Copy link
Collaborator

@hariharan-devarajan hariharan-devarajan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are almost there. I see we have some code which got reverted from the mock logging.

dlio_benchmark/utils/utility.py Show resolved Hide resolved
dlio_benchmark/utils/utility.py Show resolved Hide resolved
dlio_benchmark/utils/utility.py Outdated Show resolved Hide resolved
@zhenghh04
Copy link
Member Author

@hariharan-devarajan , I am hesitating between using workflow.log_level to control the log level, vs using DLIO_LOG_LEVEL environment variable to control it. I slightly lean towards to the latter one, which is more common in other apps. What is your preference?

@hariharan-devarajan
Copy link
Collaborator

@hariharan-devarajan , I am hesitating between using workflow.log_level to control the log level, vs using DLIO_LOG_LEVEL environment variable to control it. I slightly lean towards to the latter one, which is more common in other apps. What is your preference?

I see a value in both. But I prefer the environment variable, too. By default, we should have WARN level, not info. Then we can make DLIO_LOG_LEVEL to higher logging levels like INFO and DEBUG.

We should switch the per epoch time to print and include the sample rate per epoch there.

Then, per step becomes info.

And variable logging becomes debug.

@zhenghh04
Copy link
Member Author

@hariharan-devarajan After reading the documentation more: https://docs.python.org/3/library/logging.html#levels, I feel that we should have default logging level to be info, and everything should go through logging, not through print.

What do you think?

@hariharan-devarajan
Copy link
Collaborator

So printing is different than logging in my opinion.

The things that tell us about benchmark high-level like initialize, progress, and metrics.

Logging here is for internal parts.

I wrote a logger in the past on c++ in which I added print as the highest log above error.

For benchmarks I feel it makes sense to have this.

@hariharan-devarajan
Copy link
Collaborator

Now if u want to use logging for printing, I would create two loggers. One for printing and one for logging internal stuff.

The printing could always be info, internal, and cannot change by benchmark parameters.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants