Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Use HTML to present warnings of train_test_split when in notebooks #1060

Closed
wants to merge 6 commits into from
Closed
Show file tree
Hide file tree
Changes from 4 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
72 changes: 72 additions & 0 deletions skore/src/skore/sklearn/train_test_split/train_test_split.py
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,8 @@

from __future__ import annotations

import contextlib
import re
import warnings
from typing import TYPE_CHECKING, Any, Optional, Union

Expand All @@ -10,12 +12,56 @@

from skore.sklearn.find_ml_task import _find_ml_task
from skore.sklearn.train_test_split.warning import TRAIN_TEST_SPLIT_WARNINGS
from skore.utils._environment import is_environment_html_capable

if TYPE_CHECKING:
ArrayLike = Any
from skore.project import Project


# use variables for HTML warnings
# pick vscode color then fallback to jupyter color
# then fallback to orange
HTML_WARNING_TEMPLATE = """
<style>
.warning {{
--skore-warn-color: var(
--vscode-editorWarning-foreground,
var(--jp-warn-color0, light-dark(black, white))
);
--skore-warn-border-color: var(
--vscode-editorWarning-border,
var(--jp-warn-color2, #f08b30)
);
padding: 0.2em 1em;
border: solid 1px var(--skore-warn-border-color);
margin-bottom: 1px;
background-color: transparent;
color: var(--skore-warn-color);
font-family: inherit;

.title {{
font-size: 1.2em;
font-weight: bold;
margin: 1em 0;
}}

.content {{
font-size: 1em;
text-wrap: balance;
}}
}}
.jp-RenderedHTMLCommon .warning p {{
margin-bottom: 0;
}}
</style>
<div class="warning">
<div class="title">{warning_class}</div>
<div class="content">{warning}</div>
</div>
"""


def train_test_split(
*arrays: ArrayLike,
X: Optional[ArrayLike] = None,
Expand Down Expand Up @@ -158,10 +204,36 @@ class labels.
ml_task=ml_task,
)

to_display = []
for warning_class in TRAIN_TEST_SPLIT_WARNINGS:
warning = warning_class.check(**kwargs)

if warning is not None:
to_display.append((warning, warning_class))

if is_environment_html_capable():
with contextlib.suppress(ImportError):
from IPython.core.interactiveshell import InteractiveShell
from IPython.display import HTML, display
from markdown import markdown

if InteractiveShell.initialized():
markup = "".join(
[
HTML_WARNING_TEMPLATE.format(
warning=markdown(warning),
warning_class=re.sub(
"([A-Z][a-z]+)",
r" \1",
re.sub("([A-Z]+)", r" \1", warning_class.__name__),
).lstrip(),
)
for warning, warning_class in to_display
]
)
display(HTML(markup))
else:
for warning, warning_class in to_display:
warnings.warn(message=warning, category=warning_class, stacklevel=1)
Comment on lines +214 to 237
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An alternative that would work in both HTML and in prompt would be to leverage rich:

from skore import console  # avoid circular import
    from rich.panel import Panel

    for warning_class in TRAIN_TEST_SPLIT_WARNINGS:
        warning = warning_class.check(**kwargs)

        if warning is not None:
            # Only check if warning is filtered/ignored
            if not warnings.filters or not any(
                f[0] == "ignore" and f[2] == warning_class for f in warnings.filters
            ):
                console.print(
                    Panel(
                        title=warning_class.__name__,
                        renderable=warning,
                        style="orange1",
                        border_style="cyan",
                    )
                )

It would provide the following outputs:

image

And here it is configure such that someone filter the warnings with ignore, then we don't show it.

It might be a bit less invasive than checking for the environment (while this code could be useful for some other stuff along the road).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using rich everywhere does provide coherence (with the EstimatorReport of #997)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can rich display False with code style for example?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can rich display False with code style for example?

Yes rich is capable to display markdown.

I hadn't considered using pure textual, it's cool and lightweight. Only down side is that we cant match hosting styles.
I'm happy to refactor that way. @MarieS-WiMLDS can you please settle this ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean by matching hosting style? Is it light/dark theme, or is it about the font?
I really don't mind if we don't have the default font of the user, it's more troublesome if it's about the theme, because it might affect readability.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An note that we have a similar limitation with the logging when creating project, etc. where we use rich.

Copy link
Contributor

@sylvaincom sylvaincom Jan 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your insights, anyway an orange background seems anti-UX, let's move on with rich. Should we close this PR and open a new one for rich? How do you wish to proceed with rich?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with Sylvain, very unlikely for a user to have a orange background!
The "we can do it later" stuff tends to be never done, especially when it's about having clean code, so I'd prefer that we go for rich before merging, if it's compatible with your deadline of tuesday evening.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cf. #1086

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're the best ⚡️


return output
52 changes: 52 additions & 0 deletions skore/src/skore/utils/_environment.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,52 @@
import os
import sys


def get_environment_info():
"""Detect the current Python execution environment.

Returns a dictionary with information about the environment.
"""
env_info = {
"is_jupyter": False,
"is_vscode": False,
"is_interactive": False,
"environment_name": "standard_python",
"details": {},
}

# Check for interactive mode
env_info["is_interactive"] = hasattr(sys, "ps1")

# Check for Jupyter
try:
shell = get_ipython().__class__.__name__ # type: ignore
env_info["details"]["ipython_shell"] = shell

if shell == "ZMQInteractiveShell": # Jupyter notebook/lab
env_info["is_jupyter"] = True
env_info["environment_name"] = "jupyter"
elif shell == "TerminalInteractiveShell": # IPython terminal
env_info["environment_name"] = "ipython_terminal"
except NameError:
pass

# Check for VSCode
if "VSCODE_PID" in os.environ:
env_info["is_vscode"] = True
if env_info["is_interactive"]:
env_info["environment_name"] = "vscode_interactive"
else:
env_info["environment_name"] = "vscode_script"

# Add additional environment details
env_info["details"]["python_executable"] = sys.executable
env_info["details"]["python_version"] = sys.version

return env_info


def is_environment_html_capable() -> bool:
"""Return `True` if the execution context dcan render HTML. `False` otherwise."""
info = get_environment_info()
return info["is_vscode"] or info["is_jupyter"]
Loading