Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consolidate cmor setups and logging #261

Closed
wants to merge 27 commits into from

Conversation

TonyB9000
Copy link
Contributor

@TonyB9000 TonyB9000 commented Jun 27, 2024

Description

This update to e2c addresses two areas of consolidation: setup_cmor functions (calls to cmor.setup) and e2c logging configurations. Other minor bug-fix and cleanup conducted. All changes tested against a sample from each variable type (Amon 2D, Amon 3D, day, 3hr, CFmon, Lmon, LImon, Omon, SImon).

CMOR SETUP CONSOLIDATION:

The motivation for cmor.setup reconfigurations is that this cmor function was called from 7 separate locations:

  • the setup_cmor() in util.py
  • the setup_cmor() in mpas.py
  • the _setup_cmor_module() in handler.py
  • and 4 raw calls to cmor.setup in the areacella.py, clisccp.py, orog.py and sftlf.py variable handlers.

Aside from unnecessary replication of codes, each call to cmor.setup invokes a cmor-internal "logger" setup, whose only external accessibility is the name of the logfile to supply. There is no ability to alter its propagate(True/False), date-format or other attributes. (I believe it is responsible for the "duplicated log messages" problem, as only those messages with the CMOR date-format appear to be duplicated.)

Now, there are only TWO functions that invoke cmor.setup: util.setup_cmor(), and handler._setup_cmor_module(). The latter is applied to every handler managed through the "handlers.yaml" construct. The former handles all other variables. The function in mpas.py has been eliminated.

LOGGER CONSOLIDATION:

Likewise, the motivation in consolidating the logger functions is that python logging is "global" - different modules adding stream or file handlers to their loggers accumulate across all logging.

Now, there is a single "e2c_logger()" function in _logger.py, and it is called by ANY (code-accessible) module in e2c that requires logging. It can be configures to return the root logger, or a named-logger, to apply logging to a file or to console or both, and to propagate or not.

FIXING the problem where "--help" and "--version" cause a cmor_logs directory to be created in the user's current directory required some main-line refactoring. Previously, modules like mpas.py, handler.py, etc each would call their logger setup in the global space. That global space is activated whenever a module is "imported", irrespective of whether any functions defined within have been called. And ALL of these imports occur BEFORE we get a chance to parse the main command-line args (and discover that only --help or --version was being invoked.)

Now each of these modules has its own function to be called to set-up its logger. These are imported into "main" and invoked only AFTER arg-parsing is completed (and not at all for --help and --version calls).

CHANGES BY MODULE:

Aside from virtually ALL modules now using _logger:e2c_logger() for logging operations, the following changes have been made:

  • main.py: Added "num_failures" in parallel to "num_sucesses", to enable a non-zero return status.
  • _logger.py: revamped to contain only "e2c_logger()" function.
  • util.py: generalized "cmor_setup()" to handle both mpas and non-mpas variables.
  • mpas.py: removed superfluous "cmor_setup()" function.
  • cmor_handler/mpas_vars/.py: Employ util.setup_cmor(), not mpas.setup_cmor().
  • handlers areacella, clisccp, orog, sftlf: Employ util.setup_cmor(), not direct cmor.setup calls.

MINOR CHANGES:

Replaced some - but not all ".format()" string variable formatting with f"{va}" format style.

Checklist

  • [~] My code follows the style guidelines of this project
  • [~] I have performed a self-review of my own code
  • My changes generate no new warnings
  • Any dependent changes have been merged and published in downstream modules

If applicable:

  • New and existing unit tests pass with my changes (locally and CI/CD build)
  • I have added tests that prove my fix is effective or that my feature works
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have noted that this is a breaking change for a major release (fix or feature that would cause existing functionality to not work as expected)

Copy link
Collaborator

@chengzhuzhang chengzhuzhang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I visually inspected the code change. Looks good over all.

@@ -60,7 +62,7 @@ def handle(infiles, tables, user_input_path, **kwargs):

ds = mpas.remap(ds, 'mpasocean', mappingFileName)

mpas.setup_cmor(VAR_NAME, tables, user_input_path, component='ocean')
util.setup_cmor(VAR_NAME, tables, TABLE, user_input_path)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel that the syntax:(var_name=VAR_NAME, table_path=tables, table_name=TABLE, user_input_path=user_input_path) is easier to follow than the simplified version(VAR_NAME, tables, TABLE, user_input_path), especially to differentiate tables and TABLE.

Copy link
Contributor Author

@TonyB9000 TonyB9000 Jul 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can change it across the board using "sed". The fact that it exists in 41 modules is clear evidence that this is NOT where we should be calling "cmor.setup" (or setup_cmor). The only variables in the call parameters are VAR_NAME and TABLE, and in principle we would not even know which handler module to select if we did not already have (essentially) this information BEFORE we call the handler.

Someday ... we should consolidate this processing. I know we want to "precompile" and not "interpret" for performance reasons, buy that should be solvable with an array of pre-compiled formulas indexed by variable and table., whose inputs are pointers to locations where data will be written for access.

In the meantime, I'll make the changes and push.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Converted to named parameters for mpas handler calls to util.setup_cmor.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chengzhuzhang Also converted 4 residual "vars" handlers (areacella, clisccp, orog, sftlf), and had to change util because the named parameter was "varname", not "var_name". Now it is and all should be consistent.

@chengzhuzhang
Copy link
Collaborator

The update looks good to me. Thank you for addressing the review comment.

@TonyB9000
Copy link
Contributor Author

OK, I will wait for Tom's review next (and continue testing with the dsm_generate_CMIP6.sh driver).

@tomvothecoder
Copy link
Collaborator

tomvothecoder commented Jul 15, 2024

OK, I will wait for Tom's review next (and continue testing with the dsm_generate_CMIP6.sh driver).

Can you paste what the log output looked like before and after this PR?

I'm aware that you updated the logger message outputs to minimize verbosity. However, I suggest keeping at least the module name and the exact line of code because it helps immensely with debugging.

@TonyB9000
Copy link
Contributor Author

@tomvothecoder For the most part, I did not change the log-output content. What I did was reduce the number of different "logger-setup" functions being called (now only the one in "_logger.py" is used, and named it "e2c_logger" to avoid conflicts.

The log output itself has not (much) changed. But where the messages get logged (default logfile, console stream, etc) may have, due to changes in "propagate", and which handlers get registered with the logger, etc. CMOR (via "cmor.setup") establishes its own handlers into the logger, and I cannot but control the name of the log-file ("cmor_logs").

@tomvothecoder
Copy link
Collaborator

CMOR (via "cmor.setup") establishes its own handlers into the logger, and I cannot but control the name of the log-file ("cmor_logs").

I created a new issue to explore including CMOR logs with the main e3sm_to_cmip log file (and removing CMOR logs if it is redundant). #266

@TonyB9000
Copy link
Contributor Author

TonyB9000 commented Jul 15, 2024

From what I've seem, the "cmor_log" messages involve issues of CMIP6 conformance in the metadata. I tend to prefer to keep those (glaring) messages separate from the regular e2c messages - they hurt my eyes and are not amenable to machine-read.
I have asked that they (CMOR) provide a fuller API interface to their log setup/formatting... just hoping...

@tomvothecoder
Copy link
Collaborator

tomvothecoder commented Jul 16, 2024

From what I've seem, the "cmor_log" messages involve issues of CMIP6 conformance in the metadata. I tend to prefer to keep those (glaring) messages separate from the regular e2c messages - they hurt my eyes and are not amenable to machine-read. I have asked that they (CMOR) provide a fuller API interface to their log setup/formatting... just hoping...

Yes I agree, the CMOR log messages aren't formatted well. I think it is a good idea to combine the e3sm_to_cmip and CMOR logs because it is easier to debug a single log file. Currently, when e3sm_to_cmip crashes, you have to parse separate log files to see if the error is due to CMOR and what the CMOR error is (most of the time unclear). We can still keep separate CMOR logs by variable too.

Copy link
Collaborator

@tomvothecoder tomvothecoder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@TonyB9000 here is my review. I left suggestions and comments for you to address. Great work refactoring.

e3sm_to_cmip/_logger.py Outdated Show resolved Hide resolved
e3sm_to_cmip/_logger.py Outdated Show resolved Hide resolved
e3sm_to_cmip/_logger.py Outdated Show resolved Hide resolved
e3sm_to_cmip/_logger.py Outdated Show resolved Hide resolved
e3sm_to_cmip/_logger.py Outdated Show resolved Hide resolved
e3sm_to_cmip/mpas.py Show resolved Hide resolved
e3sm_to_cmip/mpas.py Outdated Show resolved Hide resolved
e3sm_to_cmip/mpas.py Outdated Show resolved Hide resolved
e3sm_to_cmip/mpas.py Show resolved Hide resolved
e3sm_to_cmip/util.py Outdated Show resolved Hide resolved
@TonyB9000
Copy link
Contributor Author

@tomvothecoder I think I covered most of the comments, save for a few. But I made the changes locally and cannot push them until I figure out the right way to do it.

@TonyB9000
Copy link
Contributor Author

@tomvothecoder @chengzhuzhang

I was finally able to push through the "merge conflicts", and address most of the recommendations and comments Tom made. There are still some residual issues. Hopefully these can now stand-out and be more carefully addressed.

I will need to do more testing before any merge-to-master occurs, but at least the major changes are now secured.

@tomvothecoder
Copy link
Collaborator

Perhaps I should add this comment to issue #266?

My comment about combining logs was a separate suggestion and should not be included in this PR. You can move your comment above to #266.

@tomvothecoder I think I've addressed each of the changes you've requested - plus added "feature issues" you had suggested. If you would like to re-review, or not, let me know.

I will do a re-review of this PR. There are also some failing unit tests that need to be fixed.

Does this PR need to be merged any time soon? I'm assuming this is a nice-to-have enhancement and not a high priority/blocker, which means we can take our time reviewing. Correct?

@TonyB9000
Copy link
Contributor Author

TonyB9000 commented Jul 23, 2024

@tomvothecoder There is no rush as far as I am concerned - take your time.

And of course, let me know where there are things I have neglected to address.

Copy link
Collaborator

@tomvothecoder tomvothecoder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @TonyB9000, I had some minor suggestions from this review and the previous review. Can you address them before we decide to merge? It looks like we're almost done. Thanks.

e3sm_to_cmip/__main__.py Outdated Show resolved Hide resolved
e3sm_to_cmip/__main__.py Outdated Show resolved Hide resolved
num_success = 0
num_failure = 0
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@TonyB9000 Can you open a separate GitHub issue for my comment above? If we decide to address it, then it should be performed in a separate PR.

e3sm_to_cmip/__main__.py Show resolved Hide resolved
e3sm_to_cmip/__main__.py Show resolved Hide resolved

def _setup_root_logger() -> str: # pragma: no cover
"""Sets up the root logger.
def _logger(name=None, logfilename=DEFAULT_LOG, set_log_level=None, to_console=False, to_logfile=False, propagate=False):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def _logger(name=None, logfilename=DEFAULT_LOG, set_log_level=None, to_console=False, to_logfile=False, propagate=False):
def _logger(name=None, logfilename=DEFAULT_LOG, set_log_level=None, to_console=False, to_logfile=False, propagate=False):

Suggestions

  1. Format function definition using black and use type annotations.
  2. Rename set_log_level to log_level and pass logging levels (e.g., logging.level).
  3. Update docstring based on previous changes

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Weird - the "suggested change" (above) looks like no change, due to wrapping."

Addressed 2 and 3, and performed black.

Are type annotations really needed? There are scored of function definitions in mpas.py, etc, and none of them employ type annotations.

e3sm_to_cmip/_logger.py Outdated Show resolved Hide resolved
Comment on lines 44 to 54
if set_log_level == "None" or set_log_level == "DEBUG":
log_level = DEFAULT_LOG_LEVEL
elif set_log_level == "INFO":
log_level = logging.INFO
elif set_log_level == "WARNING":
log_level = logging.WARNING
elif set_log_level == "ERROR":
log_level = logging.ERROR
elif set_log_level == "CRITICAL":
log_level = logging.CRITICAL
else: log_level = DEFAULT_LOG_LEVEL
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We agreed to remove this logic and make set_log_level the logging module levels instead of string?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also suggest renaming set_log_level to log_level.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned earlier - each of the handler modules would need to "import logging", just to be able to pass the defined values "logging.WARNING", "logging.INFO", etc. I can do that. - I don't suppose any significant overhead is incurred in having each handler "import" another module ("logging", on top of our "_logger.py"). It would certainly make "_logger" cleaner.

e3sm_to_cmip/cmor_handlers/handler.py Outdated Show resolved Hide resolved
@@ -65,8 +66,7 @@ def handle(infiles, tables, user_input_path, **kwargs):
# area_b is in square radians, so need to multiply by the earth_radius**2
ds[VAR_NAME] = earth_radius**2*area_b*ds[VAR_NAME]

setup_cmor(var_name=VAR_NAME, table_path=tables, table_name=TABLE,
user_input_path=user_input_path)
util.setup_cmor(var_name=VAR_NAME, table_path=tables, table_name=TABLE, user_input_path=user_input_path)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Format using black

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can run black using pre-commit or directly with the black command.

@TonyB9000
Copy link
Contributor Author

TonyB9000 commented Aug 2, 2024

@tomvothecoder Question: If I have accepted a pile of "commit changes" here on the web, but also want to edit and push larger changes from my local repo, how do I avoid clobbering one or the other? Can I "pull" from my remote PR to update my local repo first, before advancing new edits?

NVMD - looks like "git pull" worked to fast-forward.

TonyB9000 and others added 3 commits August 2, 2024 16:45
…40+ modules to include logging module to accommodate, and addressed issues such as spurious newlines, or lack thereof.
…tups_and_logging' into consolidate_cmor_setups_and_logging
@TonyB9000
Copy link
Contributor Author

@tomvothecoder This should be VERY close now, to address your comments. I need to add the issue for "num_sucesses" vs "num_failures" in main.py, as this affects the conditions of exit. It may be necessary to make it another command-line option (exit on any failure, or return non-zero status on any failure, etc).

@TonyB9000
Copy link
Contributor Author

@tomvothecoder I think I've addressed all of your comments. Could you re-review the changes? Thanks!

@tomvothecoder
Copy link
Collaborator

tomvothecoder commented Aug 26, 2024

Thank you @TonyB9000. I'm doing another review right now.

A few notes:

  1. Can you please paste the log output on main vs. with this PR? I would like to see what the log output looks like without needing to run e3sm_to_cmip on this branch.

  2. For future pull requests, can you make sure to split your changes into smaller, related changes? Combining many changes (related or unrelated) in a pull request makes it harder to review and test. The CMOR setups and logger changes should have been in separate PRs. Other example of changes that should have been in separate PRs:

    • delete_tempfiles flag in mpas.py
    • do_pbar command line flag
    • main.py: Added "num_failures" in parallel to "num_sucesses", to enable a non-zero return status.
  3. I'm not sure what the long-term implications of referencing/overriding the global logger from Python will be, if any. From my experience, I have not needed to do this nor have I seen it commonly done in Python packages (is it good practice?). If it seems to be working for e3sm_to_cmip for our use cases (no repeated log outputs, centralized logging), then I'm fine with it. This is just a forewarning in case something happens or it becomes unmaintainable/problematic.

@tomvothecoder
Copy link
Collaborator

tomvothecoder commented Aug 26, 2024

3. I'm not sure what the long-term implications of referencing/overriding the global logger from Python will be, if any. From my experience, I have not needed to do this nor have I seen it commonly done in Python packages (is it good practice?). If it seems to be working for e3sm_to_cmip for our use cases (no repeated log outputs, centralized logging), then I'm fine with it. This is just a forewarning in case something happens or it becomes unmaintainable/problematic.

Adding to my comment above, mypy caught all of these issues related to the logger variable. I'm somewhat concerned with whether referencing the global logger is good practice (post about this). Please run pre-commit install in your e3sm_to_cmip repo and run pre-commit --run all-files to see these issues too.

e3sm_to_cmip/_logger.py:45: error: Cannot resolve name "logger" (possible cyclic definition)  [misc]
e3sm_to_cmip/_logger.py:45: note: Recursive types are not allowed at function scope
e3sm_to_cmip/util.py:24: error: Name "logger" is not defined  [name-defined]
e3sm_to_cmip/util.py:502: error: Name "logger" is not defined  [name-defined]
e3sm_to_cmip/util.py:512: error: Name "logger" is not defined  [name-defined]
e3sm_to_cmip/util.py:528: error: Name "logger" is not defined  [name-defined]
e3sm_to_cmip/util.py:552: error: Name "logger" is not defined  [name-defined]
e3sm_to_cmip/util.py:561: error: Name "logger" is not defined  [name-defined]
e3sm_to_cmip/util.py:571: error: Name "logger" is not defined  [name-defined]
e3sm_to_cmip/util.py:645: error: Name "logger" is not defined  [name-defined]
e3sm_to_cmip/util.py:649: error: Name "logger" is not defined  [name-defined]
e3sm_to_cmip/util.py:662: error: Name "logger" is not defined  [name-defined]
e3sm_to_cmip/cmor_handlers/handler.py:218: error: Item "None" of "Any | None" has no attribute "info"  [union-attr]
e3sm_to_cmip/cmor_handlers/handler.py:248: error: Item "None" of "Any | None" has no attribute "info"  [union-attr]
e3sm_to_cmip/cmor_handlers/handler.py:256: error: Item "None" of "Any | None" has no attribute "info"  [union-attr]
e3sm_to_cmip/cmor_handlers/handler.py:279: error: Item "None" of "Any | None" has no attribute "info"  [union-attr]
e3sm_to_cmip/cmor_handlers/handler.py:300: error: Item "None" of "Any | None" has no attribute "error"  [union-attr]
e3sm_to_cmip/cmor_handlers/handler.py:342: error: Item "None" of "Any | None" has no attribute "info"  [union-attr]
e3sm_to_cmip/cmor_handlers/handler.py:684: error: Item "None" of "Any | None" has no attribute "info"  [union-attr]
e3sm_to_cmip/cmor_handlers/handler.py:688: error: Item "None" of "Any | None" has no attribute "info"  [union-attr]
e3sm_to_cmip/cmor_handlers/handler.py:698: error: Item "None" of "Any | None" has no attribute "error"  [union-attr]
e3sm_to_cmip/cmor_handlers/handler.py:701: error: Item "None" of "Any | None" has no attribute "info"  [union-attr]
e3sm_to_cmip/cmor_handlers/handler.py:711: error: Item "None" of "Any | None" has no attribute "error"  [union-attr]
e3sm_to_cmip/cmor_handlers/utils.py:22: error: Name "logger" is not defined  [name-defined]
e3sm_to_cmip/cmor_handlers/utils.py:22: error: Unexpected keyword argument "set_log_level" for "_logger"; did you mean "log_level"?  [call-arg]
e3sm_to_cmip/cmor_handlers/utils.py:83: error: Name "logger" is not defined  [name-defined]
e3sm_to_cmip/cmor_handlers/utils.py:125: error: Name "logger" is not defined  [name-defined]
e3sm_to_cmip/cmor_handlers/utils.py:221: error: Name "logger" is not defined  [name-defined]
e3sm_to_cmip/cmor_handlers/utils.py:227: error: Name "logger" is not defined  [name-defined]
e3sm_to_cmip/_logger.py:10: note: "_logger" defined here
e3sm_to_cmip/__main__.py:110: error: Name "logger" is not defined  [name-defined]
e3sm_to_cmip/__main__.py:153: error: Name "logger" is not defined  [name-defined]
e3sm_to_cmip/__main__.py:154: error: Name "logger" is not defined  [name-defined]
e3sm_to_cmip/__main__.py:155: error: Name "logger" is not defined  [name-defined]
e3sm_to_cmip/__main__.py:156: error: Name "logger" is not defined  [name-defined]
e3sm_to_cmip/__main__.py:157: error: Name "logger" is not defined  [name-defined]
e3sm_to_cmip/__main__.py:158: error: Name "logger" is not defined  [name-defined]
e3sm_to_cmip/__main__.py:159: error: Name "logger" is not defined  [name-defined]
e3sm_to_cmip/__main__.py:160: error: Name "logger" is not defined  [name-defined]
e3sm_to_cmip/__main__.py:161: error: Name "logger" is not defined  [name-defined]
e3sm_to_cmip/__main__.py:213: error: Name "logger" is not defined  [name-defined]
e3sm_to_cmip/__main__.py:228: error: Name "logger" is not defined  [name-defined]
e3sm_to_cmip/__main__.py:229: error: Name "logger" is not defined  [name-defined]
e3sm_to_cmip/__main__.py:230: error: Name "logger" is not defined  [name-defined]
e3sm_to_cmip/__main__.py:232: error: Name "logger" is not defined  [name-defined]
e3sm_to_cmip/__main__.py:238: error: Name "logger" is not defined  [name-defined]
e3sm_to_cmip/__main__.py:656: error: Name "logger" is not defined  [name-defined]
e3sm_to_cmip/__main__.py:657: error: Name "logger" is not defined  [name-defined]
e3sm_to_cmip/__main__.py:658: error: Name "logger" is not defined  [name-defined]
e3sm_to_cmip/__main__.py:674: error: Unused "type: ignore" comment  [unused-ignore]
e3sm_to_cmip/__main__.py:700: error: Incompatible types in assignment (expression has type "dict[str, Any]", variable has type "str | None")  [assignment]
e3sm_to_cmip/__main__.py:708: error: Unsupported target for indexed assignment ("None")  [index]
e3sm_to_cmip/__main__.py:710: error: Unsupported target for indexed assignment ("None")  [index]
e3sm_to_cmip/__main__.py:716: error: Unused "type: ignore" comment  [unused-ignore]
e3sm_to_cmip/__main__.py:752: error: Name "logger" is not defined  [name-defined]
e3sm_to_cmip/__main__.py:772: error: Name "logger" is not defined  [name-defined]
e3sm_to_cmip/__main__.py:773: error: Name "logger" is not defined  [name-defined]
e3sm_to_cmip/__main__.py:774: error: Name "logger" is not defined  [name-defined]
e3sm_to_cmip/__main__.py:836: error: Name "logger" is not defined  [name-defined]
e3sm_to_cmip/__main__.py:861: error: Name "logger" is not defined  [name-defined]
e3sm_to_cmip/__main__.py:865: error: Name "logger" is not defined  [name-defined]
e3sm_to_cmip/__main__.py:878: error: Name "logger" is not defined  [name-defined]
e3sm_to_cmip/__main__.py:970: error: Name "logger" is not defined  [name-defined]
e3sm_to_cmip/__main__.py:972: error: Name "logger" is not defined  [name-defined]
e3sm_to_cmip/__main__.py:985: error: Name "logger" is not defined  [name-defined]
e3sm_to_cmip/__main__.py:989: error: Name "logger" is not defined  [name-defined]
e3sm_to_cmip/__main__.py:990: error: Name "logger" is not defined  [name-defined]
Found 65 errors in 5 files (checked 6 source files)

I think an easier way to achieve a single global logger is to instantiate the logger once in _logger.py then import it to the various other modules.

Example (source):

You could make your own logging "module" which instantiates the logger, than have all of your code import that instead. Think:

logger.py:

import logging
log = logging.getLogger('')

codeA.py:

from logger import log
log.info('whatever')

codeB.py:

from logger import log
log.warn('some other thing')

@tomvothecoder
Copy link
Collaborator

tomvothecoder commented Aug 26, 2024

I suggest we split up this PR into smaller PRs and merge the ones that are ready. I’m not 100% confident in the logging portion of this PR yet and think it can be done more cleanly/easily. It is also an enhancement and not a critical need.

- Add type annotations to `_logger._logger`
- Fix import order across modules using `isort`
@TonyB9000
Copy link
Contributor Author

@tomvothecoder Regarding your example for "achieve a single global logger", this was not my intent (although not a bad idea). I assumed different modules might want to configure logging differently (stream versus file, and propagate/not), and wanted a single and centralized call (found in _logger.py) to accomplish the work, based upon configuration choices. Also, I had not created the "_logger.py" module myself, and did not want to deviate too far from what I thought was its functionality.

The main changes were to reduce the two functions (one for "root logger" one not) into a single function, and to provide "only console, only logfile, or both" as options.

But I am certainly up for another approach. I did try to kill multiple birds with one stone - perhaps unwise.

(ASIDE: I found it strange to have "main" be reduced to a single class that encapsulates everything, reducing the entire "main()" function call to "app = " and then "return app.run()". Is this a common pattern?)

@TonyB9000
Copy link
Contributor Author

@tomvothecoder You asked: "Can you please paste the log output on main vs. with this PR?"

I can do this, but a trial for atmos, land or ocean may produce different output, sue to their differences in handling. I'll work this up and provide the output.

@tomvothecoder
Copy link
Collaborator

Regarding your example for "achieve a single global logger", this was not my intent (although not a bad idea). I assumed different modules might want to configure logging differently (stream versus file, and propagate/not), and wanted a single and centralized call (found in _logger.py) to accomplish the work, based upon configuration choices. Also, I had not created the "_logger.py" module myself, and did not want to deviate too far from what I thought was its functionality.

The main changes were to reduce the two functions (one for "root logger" one not) into a single function, and to provide "only console, only logfile, or both" as options.

I have a bit more time and will look into a solution to logging. I implemented the _logger.py module to replace the print statements being used before (e.g., the colored ones Sterling implemented).

(ASIDE: I found it strange to have "main" be reduced to a single class that encapsulates everything, reducing the entire "main()" function call to "app = " and then "return app.run()". Is this a common pattern?)

Can you clarify how you find this design pattern strange? From my experience, this is a common API design pattern for software programs. The core program is a class (E3SMtoCMIP in this case) that encapsulates related methods and attributes related to the e3sm_to_cmip "run". This design allows us to store the CLI configurations as Python class attributes that can be referenced throughout the class, rather than needing to pass them around constantly between standalone functions (like how it was before), resulting in a much cleaner codebase. Also read this page for testing and debugging using this API design, which wasn't possible before.

@tomvothecoder
Copy link
Collaborator

I suggest we split up this PR into smaller PRs and merge the ones that are ready. I’m not 100% confident in the logging portion of this PR yet and think it can be done more cleanly/easily. It is also an enhancement and not a critical need.

To split up this PR, you can:

  1. Create separate branches from this branch
  2. Revert the unrelated changes (e.g., keep only cmor setup changes)
  3. Open a PR for each branch

@TonyB9000
Copy link
Contributor Author

TonyB9000 commented Aug 29, 2024

@tomvothecoder @chengzhuzhang Having given this consideration, I am inclined to abandon this branch, create two new branches off of main (obtained by a fresh git-clone),

branch:  consolidate_cmor_setup_273
branch:  redesign_logger_274

and then - reviewing each logged commit - re-introduce only the "cmor setup" or only the "logfile" work upon each branch. That way, if any commit happened to "overlap" (contain both sorts of changes), they can be easily kept separate.

We can then test these changes separately as well.

This seems the cleanest (and safest) approach, compared to branching off of a "muddy" branch and trying to selectively revert.

@tomvothecoder
Copy link
Collaborator

We can then test these changes separately as well.

This seems the cleanest (and safest) approach, compared to branching off of a "muddy" branch and trying to selectively revert.

Sounds good with me. Thank you Tony.

@TonyB9000
Copy link
Contributor Author

@tomvothecoder @chengzhuzhang I am closing this issue/PR as abandoned, since it has been split into issue #275 (Consolidate cmor setup, since merged to master) and issue #274 (Redesign logging, soon to be PR'd).

@TonyB9000 TonyB9000 closed this Sep 12, 2024
@TonyB9000 TonyB9000 deleted the consolidate_cmor_setups_and_logging branch September 12, 2024 20:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

e3sm_to_cmip --help creates a logs directory
3 participants