
mark lazy mode deprecated and add logger.lazy() for lazy format value #1233

Open
wants to merge 11 commits into master

Conversation

@trim21 (Contributor) commented Nov 12, 2024

No description provided.

@trim21 (Contributor, Author) commented Nov 12, 2024

CI failed at Codecov, not at tests/lint.

@Delgan (Owner) commented Nov 20, 2024

Hi @trim21, thanks for opening a PR. I understand it refers to #1207.

I've been thinking about this lately, and to be honest I'm not very keen on the idea of adding a new LazyValue class just to optimize lazy evaluation of message formatting. So far, the logger has been the only publicly importable component of Loguru. That would be a significant change to this idiom for what seems to be a relatively small inconvenience.

I would welcome a new API for lazily evaluated arguments if it also addresses #782. However, this is not necessarily easy.

Another thing to consider is PEP 750. It's still a draft and there's no guarantee that it will be accepted, but it could be interesting for our use case, without requiring a new API.

@trim21 (Contributor, Author) commented Nov 20, 2024

Maybe we can add an API logger.lazy(fn, *args, **kwargs)?

I don't think PEP 750 will help #782 or this one; even with PEP 750 accepted, values in the template string are still calculated before the logger function is called.

@trim21 (Contributor, Author) commented Nov 20, 2024

Also, we only have a sync API; I don't know of a way to handle #782.

@Delgan (Owner) commented Nov 20, 2024

value in template str is still calculated before the logger function is called.

Indeed, but it's possible to pass a lambda (see the "Approaches to Lazy Evaluation" section of the PEP).

We could imagine that when a template string is used, Loguru converts callable arguments, but only if the log level requires it. This would make the opt(lazy=True) argument obsolete, while making it straightforward to combine lazy and non-lazy arguments.

This PEP is actively discussed here.
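The conversion described above could look roughly like this (a minimal illustration with made-up `MIN_LEVEL` and `log` names, not Loguru's actual code):

```python
MIN_LEVEL = 20


def log(level, message, *args, **kwargs):
    """Format only if the level passes; any callable argument is invoked lazily."""
    if level < MIN_LEVEL:
        return None  # callables are never invoked
    args = [a() if callable(a) else a for a in args]
    kwargs = {k: v() if callable(v) else v for k, v in kwargs.items()}
    return message.format(*args, **kwargs)


assert log(10, "debug: {}", lambda: "expensive") is None
assert log(30, "warning: {}", lambda: "expensive") == "warning: expensive"
```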

maybe we can add a api logger.lazy(fn, *args, **kwargs)?

I would prefer such API, yes. But still, that's a new method that increases the complexity of Loguru's API, for a marginal gain in my opinion.

also we have only sync api, I don't know there is a method to handle #782

Yes, unfortunately I don't know how to integrate it either.

@trim21 (Contributor, Author) commented Nov 20, 2024

We could imagine that when a template string is used, then Loguru will convert callable arguments, but only if the log level requires it.

Personally, I think it is confusing and dangerous behaviour to call callables found in arguments unless explicitly requested with lazy or something similar.
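The concern can be illustrated with a toy example (made-up names): if callables were converted implicitly, logging a callback object as a value would silently invoke it:

```python
MIN_LEVEL = 20


def log_implicit(level, message, **kwargs):
    """Hypothetical implicit behaviour: every callable kwarg is invoked."""
    if level < MIN_LEVEL:
        return None
    kwargs = {k: v() if callable(v) else v for k, v in kwargs.items()}
    return message.format(**kwargs)


def on_error():
    return "side effect!"  # imagine this mutated state instead


# The user wanted to log *which* callback is registered, not call it,
# yet the callable is invoked behind their back:
assert log_implicit(30, "registered handler: {handler}", handler=on_error) \
    == "registered handler: side effect!"
```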

@trim21 changed the title from "mark lazy mode is deprecated and add LazyValue for lazy format value" to "mark lazy mode deprecated and add LazyValue for lazy format value" on Nov 25, 2024
@trim21 (Contributor, Author) commented Nov 25, 2024

What do you think about a new public API loguru.utils.LazyValue/Lazy/lazy instead of loguru.LazyValue?

Even with PEP 750 accepted, I think the API should be logger.debug(t'hello {Lazy(fn, ...)}') or logger.debug(t'hello {logger.lazy(fn, ...)}').

@trim21 changed the title from "mark lazy mode deprecated and add LazyValue for lazy format value" to "mark lazy mode deprecated and add logger.lazy() for lazy format value" on Nov 25, 2024
@Delgan (Owner) commented Nov 26, 2024

Thanks for the update.

Sorry, I should have caught this earlier, but I realized there are two main problems with such an API:

  • The expensive function will be called multiple times if the argument is formatted multiple times (e.g. logger.info("{foo} {foo}", foo=logger.lazy(func))).
  • The extra dict would be populated with a LazyValue instead of the underlying value (for structured logging).
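The first problem is easy to reproduce with a naive wrapper (hypothetical `NaiveLazy`, just to show the double call; a cache would fix it):

```python
class NaiveLazy:
    """Lazy wrapper WITHOUT caching: fn runs once per formatting occurrence."""

    def __init__(self, fn):
        self.fn = fn
        self.calls = 0

    def __format__(self, format_spec):
        self.calls += 1
        return format(self.fn(), format_spec)


lazy = NaiveLazy(lambda: "bar")
assert "{foo} {foo}".format(foo=lazy) == "bar bar"
assert lazy.calls == 2  # the expensive function ran twice
```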

Even if we managed to get around these problems, to be honest I'm still not convinced by such an API change.

We're trying to fix a mere inconvenience. But there are other patterns that are not possible today and would be more interesting to handle. Notably #782, as said, or even:

# There is no equivalent possible in Loguru.
if logger.isEnabledFor(logging.DEBUG):
    foo = expensive_func() 
    logger.info("Foo: %s", foo.summary)
    logger.debug("Foo details: %s", foo.details)

@trim21 (Contributor, Author) commented Nov 26, 2024

I agree we should have a method for isEnabledFor.

These problems are easy to fix: we can simply cache the result, and adjust the default argument of the JSON encoder we use.

{"text": "2024-11-27 03:32:58.508 | INFO     | __main__:<module>:13 - 1 1\n", "record": {"elapsed": {"repr": "0:00:00.002001", "seconds": 0.002001}, "exception": null, "extra": {"foo": "1"}, "file": {"name": "a.py", "path": "C:\\Users\\Trim21\\proj\\loguru\\a.py"}, "function": "<module>", "level": {"icon": "ℹ️", "name": "INFO", "no": 20}, "line": 13, "message": "1 1", "module": "a", "name": "__main__", "process": {"id": 7940, "name": "MainProcess"}, "thread": {"id": 27388, "name": "MainThread"}, "time": {"repr": "2024-11-27 03:32:58.508597+08:00", "timestamp": 1732649578.508597}}}
import functools


class LazyValue:
    def __init__(self, fn, *args, **kwargs):
        self.fn = fn
        self.args = args
        self.kwargs = kwargs

    def __format__(self, format_spec: str):
        return format(self.__result, format_spec)

    def __str__(self):
        return str(self.__result)

    def __repr__(self):
        return repr(self.__result)

    @functools.cached_property
    def __result(self):
        return self.fn(*self.args, **self.kwargs)

@Delgan (Owner) commented Dec 3, 2024

Yes, I think a caching system is essential.

I was thinking again about this proposition the other day, and I believe we might be able to make it compatible with #782. Specifically, if the function to wrap is a coroutine, we could likely use create_task() (as is already done for async sinks) to delay the logging call internally.

It’s possible that the object returned by logger.lazy() should be awaitable (similar to logger.complete()). I’m not certain yet, as I haven’t experimented with this approach.

Additionally, I recalled another improvement I’d like to implement: lazy arguments should not be evaluated if the filter of all handlers returns False. This seems important, given that there are use cases where the handler level is set to 0 and everything else is handled in a custom filter function.

With these two elements in mind, there’s plenty to consider regarding the final API, especially in terms of when and how to perform lazy evaluation. That said, I think it’s doable.

I know this doesn't directly address your initial concerns in #1207. However, as I mentioned, I'm trying to get the best out of this future new method. Of course, this isn't something that needs to be addressed in this PR (which could be merged as a first step). These are just ideas at this stage.

In any case, it seems feasible to include these changes in the next minor release.

@trim21 (Contributor, Author) commented Dec 3, 2024

I was thinking again about this proposition the other day, and I believe we might be able to make it compatible with #782. Specifically, if the function to wrap is a coroutine, we could likely use create_task() (as is already done for async sinks) to delay the logging call internally.

We can emit logs to an async sink with asyncio.create_task, but it's not really doable to use an async LazyValue with a sync sink, I think?

for a minimal case:

import asyncio


class LazyValue:
    def __init__(self, fn, *args, **kwargs):
        self.fn = fn
        self.args = args
        self.kwargs = kwargs


async def asink(s: str):  # async sink
    print(s)


def sink(s: str):  # sync sink
    print(s)


def log(s, *args):
    ...  # how to get the expected message here?


async def main():
    log("hello {}", LazyValue(asyncio.sleep, 1, "world"))


if __name__ == "__main__":
    asyncio.run(main())

@trim21 (Contributor, Author) commented Dec 3, 2024

Additionally, I recalled another improvement I’d like to implement: lazy arguments should not be evaluated if the filter of all handlers returns False. This seems important, given that there are use cases where the handler level is set to 0 and everything else is handled in a custom filter function.

This could be done with caching, maybe? We would not generate the log message in _log, but instead pass a class with a @functools.cached_property def message() -> str: ..., and handlers would get the log line from that cached property.

And if no handler tries to get the log line, the lazy value is never evaluated.

@Delgan (Owner) commented Dec 3, 2024

We can emit logs to an async sink with asyncio.create_task, but it's not really doable to use an async LazyValue with a sync sink, I think?

Here is roughly what I had in mind:

import asyncio
import time

MIN_LEVEL = 20

tasks = []


class Lazy:
    def __init__(self, fn):
        self.fn = fn


def _sink(message, *args, **kwargs):
    print(message.format(*args, **kwargs))


async def _async_log(message: str, lazy: Lazy):
    _sink(message, lazy=await lazy.fn())


def _sync_log(message: str, lazy: Lazy):
    _sink(message, lazy=lazy.fn())


def log(level: int, message: str, lazy: Lazy):
    if level < MIN_LEVEL:
        return

    if asyncio.iscoroutinefunction(lazy.fn):
        task = asyncio.get_event_loop().create_task(_async_log(message, lazy))
        tasks.append(task)
    else:
        _sync_log(message, lazy)


async def _async_expensive():
    await asyncio.sleep(1)
    return "<Expensive async result>"


def _sync_expensive():
    time.sleep(1)
    return "<Expensive sync result>"


async def async_main():
    lazy = Lazy(_async_expensive)
    log(10, "Result async 1: {lazy}", lazy=lazy)
    log(30, "Result async 2: {lazy}", lazy=lazy)
    await asyncio.gather(*tasks)


def sync_main():
    lazy = Lazy(_sync_expensive)
    log(10, "Result sync 1: {lazy}", lazy=lazy)
    log(30, "Result sync 2: {lazy}", lazy=lazy)


if __name__ == "__main__":
    start = time.time()
    sync_main()
    asyncio.run(async_main())
    print(f"Total time: {time.time() - start}")

I didn't implement the caching mechanism, but it's trivial.

@Delgan (Owner) commented Dec 3, 2024

This can be done with caching, maybe? we do not generate the log message in _log but pass a class with @functools.cached_property def message() -> str: ... and handlers get the log line from that cached property.

Yes, something along these lines. I was thinking maybe the Lazy could have a special __getattribute__ that lazily evaluates the attribute access and caches the result. But that's yet to be defined.
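One possible shape for that idea, using __getattr__ instead of __getattribute__ to keep the sketch simple (hypothetical, not a settled API):

```python
_UNSET = object()


class Lazy:
    """Proxy: fn runs on first use, the result is cached."""

    def __init__(self, fn):
        self._fn = fn
        self._value = _UNSET

    def _resolve(self):
        if self._value is _UNSET:
            self._value = self._fn()
        return self._value

    def __getattr__(self, name):
        # Fires only for attributes missing on the proxy itself,
        # so _fn, _value and _resolve are not intercepted.
        return getattr(self._resolve(), name)

    def __format__(self, format_spec):
        return format(self._resolve(), format_spec)


lazy = Lazy(lambda: 3 + 4j)
assert lazy.real == 3.0               # evaluation happens here, once
assert "{}".format(lazy) == "(3+4j)"  # reuses the cached value
```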

@trim21 (Contributor, Author) commented Dec 3, 2024

We can emit logs to an async sink with asyncio.create_task, but it's not really doable to use an async LazyValue with a sync sink, I think?

Here is roughly what I had in mind:

This doesn't look right to me...

For example, consider a logger whose only output is sys.stdout/sys.stderr, which is a sync sink, right?

An example like #782 won't be able to write its log message to it.

@trim21 (Contributor, Author) commented Dec 3, 2024

OK, I got it: we call the _async_log method if the fn of any Lazy is a coroutine function.

@trim21 (Contributor, Author) commented Dec 3, 2024

A more detailed implementation I have in mind:

import asyncio

MIN_LEVEL = 20

tasks = []


class Lazy:
    def __init__(self, fn):
        self.fn = fn

    async def as_async_result(self):
        if asyncio.iscoroutinefunction(self.fn):
            return await self.fn()
        return self.fn()


def _sink(message: str):
    print(message)


async def _async_log(message: str, *args, **kwargs):
    _sink(await __get_real_log_line(message, *args, **kwargs))


def _sync_log(message: str, *args, **kwargs):
    _sink(message.format(*args, **kwargs))


async def __get_real_log_line(message: str, *args, **kwargs) -> str:
    # lazy values should ideally be evaluated in parallel here
    args = [(await v.as_async_result()) if isinstance(v, Lazy) else v for v in args]
    kwargs = {
        k: ((await v.as_async_result()) if isinstance(v, Lazy) else v) for k, v in kwargs.items()
    }
    return message.format(*args, **kwargs)


def is_async_lazy(*args, **kwargs):
    return any(asyncio.iscoroutinefunction(v.fn) for v in args if isinstance(v, Lazy)) or any(
        asyncio.iscoroutinefunction(v.fn) for v in kwargs.values() if isinstance(v, Lazy)
    )


def log(level: int, message: str, *args, **kwargs):
    if level < MIN_LEVEL:
        return

    if is_async_lazy(*args, **kwargs):
        task = asyncio.get_event_loop().create_task(_async_log(message, *args, **kwargs))
        tasks.append(task)
    else:
        _sync_log(message, *args, **kwargs)

@trim21 (Contributor, Author) commented Dec 3, 2024

🤔 We will also need to maintain a background task group in the logger, in case the tasks get freed (garbage-collected) before completion.
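Holding strong references is indeed necessary, since the event loop keeps only weak references to tasks; the usual pattern (with a hypothetical helper name) looks like:

```python
import asyncio

_background_tasks = set()


def spawn_logging_task(coro):
    """Schedule coro and hold a strong reference until it completes."""
    task = asyncio.get_running_loop().create_task(coro)
    _background_tasks.add(task)
    # Drop the reference once done, so the set doesn't grow forever.
    task.add_done_callback(_background_tasks.discard)
    return task


async def main():
    results = []

    async def emit():
        results.append("logged")

    await spawn_logging_task(emit())
    assert results == ["logged"]


asyncio.run(main())
```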

@trim21 (Contributor, Author) commented Dec 3, 2024

So I think we can do these in separate PRs:

  1. mark lazy mode deprecated and add Lazy for sync fn (this one)
  2. add asyncio support.
  3. make lazy value evaluate after filter.

@Delgan (Owner) commented Dec 4, 2024

There are also other possible APIs that would not require testing *args and **kwargs, for example:

logger.lazy(val=expensive_func).debug("My message: {val}")

But yes, once the API is fully defined, taking into account the technical aspect, it can be implemented in several PRs.
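A toy sketch of that chained style, modeled on bind() (a made-up minimal logger, not Loguru internals):

```python
MIN_LEVEL = 20


class ToyLogger:
    def __init__(self, lazy_kwargs=None):
        self._lazy = lazy_kwargs or {}

    def lazy(self, **kwargs):
        # Like bind(), but values are zero-arg callables evaluated on emit.
        return ToyLogger({**self._lazy, **kwargs})

    def _log(self, level, message):
        if level < MIN_LEVEL:
            return None  # lazy callables are never invoked
        values = {k: fn() for k, fn in self._lazy.items()}
        return message.format(**values)

    def debug(self, message):
        return self._log(10, message)

    def warning(self, message):
        return self._log(30, message)


logger = ToyLogger()
assert logger.lazy(val=lambda: "expensive").debug("My message: {val}") is None
assert logger.lazy(val=lambda: "expensive").warning("My message: {val}") == "My message: expensive"
```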

@trim21 (Contributor, Author) commented Dec 6, 2024

Oh, I think this API looks better; maybe the best solution until PEP 750 is accepted.
