diff --git a/CHANGES.rst b/CHANGES.rst index 88e167d26..d501e39ee 100644 --- a/CHANGES.rst +++ b/CHANGES.rst @@ -36,6 +36,9 @@ Unreleased - The ``groupby`` filter is case-insensitive by default, matching other comparison filters. Added the ``case_sensitive`` parameter to control this. :issue:`1463` +- Autoescape can select different ``escape`` functions for different + templates. This allows escaping other types besides HTML. + :issue:`1377` Version 3.0.3 diff --git a/docs/api.rst b/docs/api.rst index 8b80749c3..37c2454a2 100644 --- a/docs/api.rst +++ b/docs/api.rst @@ -82,7 +82,7 @@ useful if you want to dig deeper into Jinja or :ref:`develop extensions .. autoclass:: Environment([options]) :members: from_string, get_template, select_template, get_or_select_template, join_path, extend, compile_expression, - compile_templates, list_templates, add_extension + compile_templates, list_templates, add_extension, get_markup_class .. attribute:: shared @@ -220,6 +220,98 @@ useful if you want to dig deeper into Jinja or :ref:`develop extensions :members: disable_buffering, enable_buffering, dump +.. _escaping: + +Safe Strings and Escaping +------------------------- +.. versionchanged:: 3.1 + +To handle untrusted input when rendering templates and +avoid injection attacks, Jinja uses a combination of trusted strings +and escape functions. + +The general idea is that values that can be trusted are passed as a special +string subclass. This prevents an input from being escaped +multiple times and at the same time ensures that, with string +operations like ``%``, the original escaped string stays escaped, even +when unescaped strings are combined with it. + +Before Jinja 3.1 this was done by the hardcoded +:class:`markupsafe.Markup` class and +:func:`markupsafe.escape` function from the `MarkupSafe`_ package. +The ``escape(s: str)`` function converts the characters +``&``, ``<``, ``>``, ``'``, and ``"`` in string `s` +to HTML-safe sequences. 
It is intended to be used if you need to +display text that might contain such characters in HTML. + +The result of a call to the ``escape(s: str)`` function is a +``Markup`` instance. +This class provides an ``.__html__()`` method, which is used internally +as an indicator that the string returned by this method is safe. +This way of defining a safe HTML string is also used by other template +systems and by things like widgets in Jupyter Notebook. + +The class also overrides a number of string methods and operators, such as +``str.join()``, ``str.split()``, ``str.__add__()``, and ``str.__mod__()``. +This is done in a way so that the result of these operations +in combination with raw strings is always an escaped ``Markup`` +instance, by using the ``escape`` method of the ``Markup`` class. + +With version 3.1 this hardcoded relation to `MarkupSafe`_ and its +HTML-based escaping was removed, as Jinja is intended to be a +language-independent template system. +It is still the default, but you can now provide a custom escape +function, e.g. as the *default_escape* parameter of :class:`Environment` or +as the result of an *autoescape* call. + +So now you can write autoescaped templates for LaTeX or other languages. +See :ref:`autoescaping` for examples. +Please note that a safe string is still defined through the existence of +the ``__html__()`` method, no matter which extension or language is used! + + +.. admonition:: Attention + + Especially when using a custom escape function, *never* use + :class:`markupsafe.Markup` or :func:`markupsafe.escape` directly + to mark a string as safe or to escape it. + Instead use :meth:`Environment.get_markup_class` to get the + correct class. + + Usage:: + + Markup = env.get_markup_class("mytemplate.ext") + safe_str = Markup.escape("") + + If you write extensions, filters, etc., use the functions provided + by the :ref:`eval-context`. + + +This is required as the ``Markup`` class calls its +``Markup.escape`` method internally, e.g. 
when using the ``join`` or +the modulo ``%`` operator. +So it is important that the correct ``Markup`` subclass is always +used. If you hardcode the :class:`markupsafe.Markup` class or +:func:`markupsafe.escape` function either in +your application, :ref:`an extension `, +:ref:`custom filter` or +:ref:`custom test `, +it could have unintended side effects once custom escape functions are used! + + +The correct Markup class for a custom escape function is generated +using the helper function: + +.. + comment:: Somehow using the lru_cache wrapper the autodoc does not get the correct parameters + See also https://github.com/sphinx-doc/sphinx/issues/7650 + +.. autofunction:: jinja2.utils.get_wrapped_escape_class + +.. _MarkupSafe: https://markupsafe.palletsprojects.com/ + +.. _autoescaping: + Autoescaping ------------ @@ -232,6 +324,11 @@ future. It's recommended to configure a sensible default for autoescaping. This makes it possible to enable and disable autoescaping on a per-template basis (HTML versus text for instance). +.. versionchanged:: 3.1 + +Jinja now also allows different escape functions to be selected +by template suffix. + .. autofunction:: jinja2.select_autoescape Here a recommended setup that enables autoescaping for templates ending @@ -250,6 +347,8 @@ works roughly like this:: if template_name is None: return False if template_name.endswith(('.html', '.htm', '.xml')) + return True + return False When implementing a guessing autoescape function, make sure you also accept `None` as valid template name. This will be passed when generating @@ -259,6 +358,74 @@ defaults in the future might change. Inside the templates the behaviour can be temporarily changed by using the `autoescape` block (see :ref:`autoescape-overrides`). +Note that :func:`~jinja2.select_autoescape` also offers a parameter +`special_extensions` that allows you to define a custom escape function, +e.g. 
if you work with LaTeX files +(see the description of :func:`~jinja2.select_autoescape` above for an example). + +To use your own custom escape function for one template extension +(e.g. ``*.tex``) you have to make sure that calling +``autoescape()`` returns +the desired custom escape function. +For instance, if you are a fan of peace in the world:: + + def escape_to_peace(s): + """ + Your custom escape function. You only have to take care + that your escaping is done properly, everything else like + preventing multiple escapes and marking the string as safe + is done by Jinja itself. + """ + return s.replace("war", "peace") + + env = Environment( + autoescape=select_autoescape(special_extensions={".world": escape_to_peace}), + loader=PackageLoader("mypackage"), + ) + + +Note that for ``.world`` files the ``{{ var|e }}`` and ``{{ var | escape }}`` +filters are replaced with the custom escape function. + +To mark a string as safe please use :meth:`Environment.get_markup_class` +instead of direct :class:`markupsafe.Markup` calls:: + + + template = env.get_template("message_to_the.world") + # the content of the template is simply assumed to be + """ +

My Message to the world

+ {{ my_msg }} + I was replied with {{ reply }} + """ + # We know that every template name that ends with .world will use + # the special escape function + mark_safe = env.get_markup_class(".world") + my_msg = mark_safe("Make love not war!") + reply = "We want war!" + template.render(my_msg=my_msg, reply=reply) + + +.. admonition:: A word of caution + + Be aware that mixing files that use different custom escape + functions set by autoescape within one render call + can lead to unexpected behavior. + In general the ``{% include %}`` directive works fine, but especially + ``{% extends %}`` can have unexpected outcomes as the main template + overwrites the context of the included one. + That's why Jinja will raise an exception if ``{% extends %}`` is used + with different escape functions / Markup classes unless explicitly + allowed through the ``Environment(allow_mixed_escape_extends=True)`` + parameter. + + If possible always use the ``default_escape`` of the + :class:`~jinja2.Environment` class to define the mainly used escape + function / Markup class and use different environments + for different file types. This applies especially to ``{% extends %}`` + if you have to mix files with otherwise conflicting custom escape + settings. + .. _identifier-naming: @@ -636,6 +803,82 @@ Exceptions .. autoexception:: jinja2.TemplateAssertionError +.. _eval-context: + +Evaluation Context +------------------ + +The evaluation context (short eval context or eval ctx) makes it +possible to activate and deactivate compiled features at runtime. + +Currently it is only used to enable and disable automatic escaping, but +it can be used by extensions as well. + +The ``autoescape`` setting should be checked on the evaluation context, +not the environment. The evaluation context will have the computed value +for the current template. It also has the computed value for the +correct ``escape`` function and ``Markup`` class. + +Instead of ``pass_environment``: + +.. 
code-block:: python + + @pass_environment + def filter(env, value): + result = do_something(value) + + if env.autoescape: + result = env.get_markup_class()(result) + + return result + +Use ``pass_eval_context`` if you only need the setting: + +.. code-block:: python + + @pass_eval_context + def filter(eval_ctx, value): + result = do_something(value) + + if eval_ctx.autoescape: + result = eval_ctx.mark_safe(result) + + return result + +Or use ``pass_context`` if you need other context behavior as well: + +.. code-block:: python + + @pass_context + def filter(context, value): + result = do_something(value) + + if context.eval_ctx.autoescape: + result = context.eval_ctx.mark_safe(result) + + return result + +The evaluation context must not be modified at runtime. Modifications +must only happen with a :class:`nodes.EvalContextModifier` and +:class:`nodes.ScopedEvalContextModifier` from an extension, not on the +eval context object itself. + +.. autoclass:: jinja2.nodes.EvalContext + + .. attribute:: autoescape + + `True` or `False` depending on if autoescaping is active or not. + + .. attribute:: volatile + + `True` if the compiler cannot evaluate some expressions at compile + time. At runtime this should always be `False`. + + .. automethod:: get_escape_function + + .. automethod:: mark_safe + + .. _writing-filters: @@ -686,21 +929,20 @@ enabled before escaping the input and marking the output safe. import re from jinja2 import pass_eval_context - from markupsafe import Markup, escape @pass_eval_context def nl2br(eval_ctx, value): br = "
<br>\n" if eval_ctx.autoescape: - value = escape(value) - br = Markup(br) + value = eval_ctx.get_escape_function()(value) + br = eval_ctx.mark_safe(br) result = "\n\n".join( f"<p>
{br.join(p.splitlines())}<\p>" for p in re.split(r"(?:\r\n|\r(?!\n)|\n){2,}", value) ) - return Markup(result) if autoescape else result + return eval_ctx.mark_safe(result) if autoescape else result .. _writing-tests: @@ -757,77 +999,6 @@ being filtered the second argument. :class:`~jinja2.runtime.Context`. -.. _eval-context: - -Evaluation Context ------------------- - -The evaluation context (short eval context or eval ctx) makes it -possible to activate and deactivate compiled features at runtime. - -Currently it is only used to enable and disable automatic escaping, but -it can be used by extensions as well. - -The ``autoescape`` setting should be checked on the evaluation context, -not the environment. The evaluation context will have the computed value -for the current template. - -Instead of ``pass_environment``: - -.. code-block:: python - - @pass_environment - def filter(env, value): - result = do_something(value) - - if env.autoescape: - result = Markup(result) - - return result - -Use ``pass_eval_context`` if you only need the setting: - -.. code-block:: python - - @pass_eval_context - def filter(eval_ctx, value): - result = do_something(value) - - if eval_ctx.autoescape: - result = Markup(result) - - return result - -Or use ``pass_context`` if you need other context behavior as well: - -.. code-block:: python - - @pass_context - def filter(context, value): - result = do_something(value) - - if context.eval_ctx.autoescape: - result = Markup(result) - - return result - -The evaluation context must not be modified at runtime. Modifications -must only happen with a :class:`nodes.EvalContextModifier` and -:class:`nodes.ScopedEvalContextModifier` from an extension, not on the -eval context object itself. - -.. autoclass:: jinja2.nodes.EvalContext - - .. attribute:: autoescape - - `True` or `False` depending on if autoescaping is active or not. - - .. attribute:: volatile - - `True` if the compiler cannot evaluate some expressions at compile - time. 
At runtime this should always be `False`. - - .. _global-namespace: The Global Namespace diff --git a/docs/conf.py b/docs/conf.py index f65d462b3..321fef267 100644 --- a/docs/conf.py +++ b/docs/conf.py @@ -19,7 +19,10 @@ "sphinx_issues", ] autodoc_typehints = "description" -intersphinx_mapping = {"python": ("https://docs.python.org/3/", None)} +intersphinx_mapping = { + "python": ("https://docs.python.org/3/", None), + "markupsafe": ("https://markupsafe.palletsprojects.com/", None), +} issues_github_path = "pallets/jinja" # HTML ----------------------------------------------------------------- diff --git a/docs/faq.rst b/docs/faq.rst index 8584d7750..20a470339 100644 --- a/docs/faq.rst +++ b/docs/faq.rst @@ -95,8 +95,8 @@ means that you will less likely have an XSS problem it also causes a huge amount of extra processing in the template engine which can cause serious performance problems. As Python doesn't provide a way to mark strings as unsafe Jinja has to hack around that limitation by providing a custom -string class (the :class:`Markup` string) that safely interacts with safe -and unsafe strings. +string class that safely interacts with safe +and unsafe strings (see :ref:`escaping` for details). With explicit escaping however the template engine doesn't have to perform any safety checks on variables. Also a human knows not to escape integers diff --git a/docs/intro.rst b/docs/intro.rst index fd6f84ff5..d72045b28 100644 --- a/docs/intro.rst +++ b/docs/intro.rst @@ -48,7 +48,7 @@ Dependencies These will be installed automatically when installing Jinja. - `MarkupSafe`_ escapes untrusted input when rendering templates to - avoid injection attacks. + avoid injection attacks (see :ref:`escaping`). .. 
_MarkupSafe: https://markupsafe.palletsprojects.com/ diff --git a/docs/templates.rst b/docs/templates.rst index 9071ad671..bb96802f0 100644 --- a/docs/templates.rst +++ b/docs/templates.rst @@ -647,8 +647,9 @@ When automatic escaping is enabled, everything is escaped by default except for values explicitly marked as safe. Variables and expressions can be marked as safe either in: -a. The context dictionary by the application with - :class:`markupsafe.Markup` +a. The context dictionary by the application (see + ``Markup`` in :ref:`escaping`) + b. The template, with the ``|safe`` filter. If a string that you marked safe is passed through other Python code diff --git a/src/jinja2/compiler.py b/src/jinja2/compiler.py index 3458095f5..829b777d8 100644 --- a/src/jinja2/compiler.py +++ b/src/jinja2/compiler.py @@ -6,7 +6,6 @@ from itertools import chain from keyword import iskeyword as is_python_keyword -from markupsafe import escape from markupsafe import Markup from . import nodes @@ -127,7 +126,12 @@ def has_safe_repr(value: t.Any) -> bool: if value is None or value is NotImplemented or value is Ellipsis: return True - if type(value) in {bool, int, float, complex, range, str, Markup}: + if isinstance(value, Markup): + # If a custom Markup class is used, + # the value is a subclass of Markup + return True + + if type(value) in {bool, int, float, complex, range, str}: return True if type(value) in {tuple, list, set, frozenset}: @@ -725,6 +729,10 @@ def write_commons(self) -> None: self.writeline("resolve = context.resolve_or_missing") self.writeline("undefined = environment.undefined") self.writeline("concat = environment.concat") + self.writeline("escape = context.eval_ctx.get_escape_function()") + self.writeline("Markup = context.eval_ctx.mark_safe") + # Custom Wrappers have a different naming + self.writeline("MarkupWrapper = context.eval_ctx.mark_safe") # always use the standard Undefined class for the implicit else of # conditional expressions 
self.writeline("cond_expr_undefined = Undefined") @@ -872,7 +880,6 @@ def visit_Template( ) self.indent() self.write_commons() - # process the root frame = Frame(eval_ctx) if "self" in find_undeclared(node.body, ("self",)): @@ -996,7 +1003,7 @@ def visit_Extends(self, node: nodes.Extends, frame: Frame) -> None: # if we have a known extends we just add a template runtime # error into the generated code. We could catch that at compile - # time too, but i welcome it not to confuse users by throwing the + # time too, but I welcome it not to confuse users by throwing the # same error at different times just "because we can". if not self.has_known_extends: self.writeline("if parent_template is not None:") @@ -1012,7 +1019,7 @@ def visit_Extends(self, node: nodes.Extends, frame: Frame) -> None: self.writeline("parent_template = environment.get_template(", node) self.visit(node.template, frame) - self.write(f", {self.name!r})") + self.write(f", {self.name!r}, caller='extends')") self.writeline("for name, parent_block in parent_template.blocks.items():") self.indent() self.writeline("context.blocks.setdefault(name, []).append(parent_block)") @@ -1044,7 +1051,7 @@ def visit_Include(self, node: nodes.Include, frame: Frame) -> None: self.writeline(f"template = environment.{func_name}(", node) self.visit(node.template, frame) - self.write(f", {self.name!r})") + self.write(f", {self.name!r}, caller='include')") if node.ignore_missing: self.outdent() self.writeline("except TemplateNotFound:") @@ -1441,7 +1448,7 @@ def _output_child_to_const( const = node.as_const(frame.eval_ctx) if frame.eval_ctx.autoescape: - const = escape(const) + const = frame.eval_ctx.get_escape_function()(const) # Template data doesn't go through finalize. 
if isinstance(node, nodes.TemplateData): @@ -1693,7 +1700,9 @@ def visit_Concat(self, node: nodes.Concat, frame: Frame) -> None: for arg in node.nodes: self.visit(arg, frame) self.write(", ") - self.write("))") + self.write(")") + self.write(", mark_safe=context.eval_ctx.mark_safe") + self.write(")") @optimizeconst def visit_Compare(self, node: nodes.Compare, frame: Frame) -> None: diff --git a/src/jinja2/environment.py b/src/jinja2/environment.py index 9dd455a52..af7c3723f 100644 --- a/src/jinja2/environment.py +++ b/src/jinja2/environment.py @@ -9,8 +9,10 @@ from functools import lru_cache from functools import partial from functools import reduce +from inspect import isclass from types import CodeType +from markupsafe import escape as html_escape from markupsafe import Markup from . import nodes @@ -32,6 +34,7 @@ from .defaults import TRIM_BLOCKS from .defaults import VARIABLE_END_STRING from .defaults import VARIABLE_START_STRING +from .exceptions import TemplateConfigurationError from .exceptions import TemplateNotFound from .exceptions import TemplateRuntimeError from .exceptions import TemplatesNotFound @@ -48,6 +51,8 @@ from .utils import _PassArg from .utils import concat from .utils import consume +from .utils import EscapeFunc +from .utils import get_wrapped_escape_class from .utils import import_string from .utils import internalcode from .utils import LRUCache @@ -218,16 +223,72 @@ class Environment: ``None`` implicitly into an empty string here. `autoescape` - If set to ``True`` the XML/HTML autoescaping feature is enabled by - default. For more details about autoescaping see - :class:`~markupsafe.Markup`. As of Jinja 2.4 this can also + If set to ``True`` the autoescaping feature is enabled by + default. + + As of Jinja 2.4 this can also be a callable that is passed the template name and has to return ``True`` or ``False`` depending on autoescape should be enabled by default. + As of Jinja 3.1 the autoescape can be even smarter. 
+ If the given function does not return a boolean but a + function again, this function is considered to be the + escape function that shall be used. So you can use the + same environment to autoescape LaTeX and HTML files. + + Use this with care. + Not all functions within Jinja get the context + that is required to determine the correct escape function. + If you run into trouble simply use different environments + with a custom ``default_escape`` (see below) for + each file type. + + See :ref:`escaping` and :ref:`autoescaping` for details. + + .. versionchanged:: 3.1 + if the `autoescape` function doesn't return True or False but a + callable, it is assumed to be a custom escape function + .. versionchanged:: 2.4 `autoescape` can now be a function + `allow_mixed_escape_extends` + Allow templates that extend each other to have different + escape functions / Markup classes. + + Note that still only the escape function from the topmost parent + template will be used! + + See :ref:`autoescaping` for details. + + Defaults to False + + .. versionadded:: 3.1 + + `default_escape` + Define a custom escape function or class. + + If a class is given it is assumed to be a subclass of + :class:`markupsafe.Markup`. + + If a function is given, it is assumed that this is an escape + function. The :func:`~jinja2.utils.get_wrapped_escape_class` + function will be used to generate a subclass of + :class:`markupsafe.Markup` from the given escape function. + It also takes care that already escaped strings are not + escaped again. See :ref:`autoescaping` and :ref:`escaping`. + + Defaults to HTML based escape by :class:`markupsafe.Markup` + + The default value can also be overwritten by the ``autoescape`` + parameter if the result of ``autoescape(None)`` is a function. + + This setting will also overwrite the filters ``{{ var | safe }}``, + ``{{ var | e }}`` and ``{{ var | escape }}`` accordingly. + + .. versionadded:: 3.1 + `loader` The template loader for this environment. 
@@ -288,6 +349,7 @@ class Environment: context_class: t.Type[Context] = Context template_class: t.Type["Template"] + default_markup_class: t.Type[Markup] def __init__( self, @@ -313,6 +375,8 @@ def __init__( auto_reload: bool = True, bytecode_cache: t.Optional["BytecodeCache"] = None, enable_async: bool = False, + default_escape: t.Union[EscapeFunc, t.Type[Markup]] = html_escape, + allow_mixed_escape_extends: bool = False, ): # !!Important notice!! # The constructor accepts quite a few arguments that should be @@ -344,7 +408,16 @@ def __init__( self.optimized = optimized self.finalize = finalize self.autoescape = autoescape - + if isclass(default_escape): + default_escape = t.cast(t.Type[Markup], default_escape) + self.default_markup_class = default_escape + elif default_escape != html_escape: + self.default_markup_class = get_wrapped_escape_class(default_escape) + elif callable(self.autoescape) and callable(self.autoescape(None)): + self.default_markup_class = get_wrapped_escape_class(self.autoescape(None)) + else: + self.default_markup_class = Markup + self._allow_mixed_escape_extends: bool = allow_mixed_escape_extends # defaults self.filters = DEFAULT_FILTERS.copy() self.tests = DEFAULT_TESTS.copy() @@ -365,6 +438,25 @@ def __init__( self.is_async = enable_async _environment_config_check(self) + def get_markup_class(self, template_name: t.Optional[str] = None) -> t.Type[Markup]: + """ + Get the correct :class:`Markup` class for the given template name. + + Use this instead of the default :class:`Markup` to mark a string as + safe, especially when using custom escaping. + + See :ref:`autoescaping` for a usage example. + + :param template_name: the name of the template that is checked + for special escaping in the autoescape + settings + + .. 
versionadded:: 3.1 + """ + if callable(self.autoescape) and callable(self.autoescape(template_name)): + return get_wrapped_escape_class(self.autoescape(template_name)) + return self.default_markup_class + def add_extension(self, extension: t.Union[str, t.Type["Extension"]]) -> None: """Adds an extension after the environment was created. @@ -962,12 +1054,56 @@ def _load_template( self.cache[cache_key] = template return template + @internalcode + def _check_multi_template_autoescape( + self, + name: t.Union[None, str, "Template"], + parent: t.Optional[str], + caller: t.Optional[str], + ) -> None: + """ + Raise an error if we expect different custom escape functions + or Markup classes used in a way that a user could get + unexpected results. + + :param name: The name of first template if known + :param parent: the name or names of the templates that + :param caller: the name of the function the + + :raise: TemplateConfigurationError + """ + # Currently we check only for extends and mixed auto escape, + # so we can exit early + if self._allow_mixed_escape_extends or caller != "extends": + return + + if isinstance(name, Template): + name = None + elif not isinstance(name, str): + raise ValueError( + f"Parameter name is not Template nor string. " + f"Given was '{name}' (Type: {type(name)})." + ) + base_class = self.get_markup_class(name) + parent_class = self.get_markup_class(parent) + + # The simplest case: it is exactly the same class. + # We do not check for subclasses because a subclass could have + # altered the escape behavior already + if base_class != parent_class: + raise TemplateConfigurationError( + "You tried to extend a template with a different escape " + "function or Markup class than the base template. This has to be enabled " + "explicitly using Environment(allow_mixed_escape_extends=True)." 
+ ) + @internalcode def get_template( self, name: t.Union[str, "Template"], parent: t.Optional[str] = None, globals: t.Optional[t.MutableMapping[str, t.Any]] = None, + caller: t.Optional[str] = None, ) -> "Template": """Load a template by name with :attr:`loader` and return a :class:`Template`. If the template does not exist a @@ -983,6 +1119,14 @@ def get_template( these extra variables available for all renders of this template. If the template has already been loaded and cached, its globals are updated with any new items. + :param caller: Tells get template which + function was calling it, i.e. 'extends' or 'include'. + Required to define behavior for custom autoescape. + + .. versionchanged:: 3.1 + Added caller parameter and a check if we need to raise an + error due to usage different autoescape function within + extends .. versionchanged:: 3.0 If a template is loaded from cache, ``globals`` will update @@ -992,6 +1136,7 @@ def get_template( If ``name`` is a :class:`Template` object it is returned unchanged. """ + self._check_multi_template_autoescape(name, parent, caller) if isinstance(name, Template): return name if parent is not None: @@ -1005,6 +1150,7 @@ def select_template( names: t.Iterable[t.Union[str, "Template"]], parent: t.Optional[str] = None, globals: t.Optional[t.MutableMapping[str, t.Any]] = None, + caller: t.Optional[str] = None, ) -> "Template": """Like :meth:`get_template`, but tries loading multiple names. If none of the names can be loaded a :exc:`TemplatesNotFound` @@ -1018,6 +1164,12 @@ def select_template( these extra variables available for all renders of this template. If the template has already been loaded and cached, its globals are updated with any new items. + :param caller: t.Optional[str] Tells get template which + function was calling it, i.e. 'extends' or 'include'. + Required to define behavior for custom autoescape. + + .. versionchanged:: 3.1 + Added caller parameter .. 
versionchanged:: 3.0 If a template is loaded from cache, ``globals`` will update @@ -1034,6 +1186,7 @@ def select_template( .. versionadded:: 2.3 """ + parent_template = parent if isinstance(names, Undefined): names._fail_with_undefined_error() @@ -1044,11 +1197,15 @@ def select_template( for name in names: if isinstance(name, Template): + self._check_multi_template_autoescape(name, parent_template, caller) return name if parent is not None: name = self.join_path(name, parent) try: - return self._load_template(name, globals) + template = self._load_template(name, globals) + # Only check autoescape if template can be loaded + self._check_multi_template_autoescape(name, parent_template, caller) + return template except (TemplateNotFound, UndefinedError): pass raise TemplatesNotFound(names) # type: ignore @@ -1061,17 +1218,35 @@ def get_or_select_template( ], parent: t.Optional[str] = None, globals: t.Optional[t.MutableMapping[str, t.Any]] = None, + caller: t.Optional[str] = None, ) -> "Template": """Use :meth:`select_template` if an iterable of template names is given, or :meth:`get_template` if one name is given. + :param template_name_or_list: List of template names to + try loading in order or Name of the template to load. + :param parent: The name of the parent template importing this + template. :meth:`join_path` can be used to implement name + transformations with this. + :param globals: Extend the environment :attr:`globals` with + these extra variables available for all renders of this + template. If the template has already been loaded and + cached, its globals are updated with any new items. + :param caller: t.Optional[str] Tells get template which + function was calling it, i.e. 'extends' or 'include'. + Required to define behavior for custom autoescape. + + .. versionchanged:: 3.1 + Added caller parameter + .. 
versionadded:: 2.3 """ if isinstance(template_name_or_list, (str, Undefined)): - return self.get_template(template_name_or_list, parent, globals) + return self.get_template(template_name_or_list, parent, globals, caller) elif isinstance(template_name_or_list, Template): + self._check_multi_template_autoescape(template_name_or_list, parent, caller) return template_name_or_list - return self.select_template(template_name_or_list, parent, globals) + return self.select_template(template_name_or_list, parent, globals, caller) def from_string( self, @@ -1521,15 +1696,15 @@ def __init__( " a template module. Use the async methods of the" " API you are using." ) - body_stream = list(template.root_render_func(context)) # type: ignore + self._context = context self._body_stream = body_stream self.__dict__.update(context.get_exported()) self.__name__ = template.name def __html__(self) -> Markup: - return Markup(concat(self._body_stream)) + return self._context.eval_ctx.mark_safe(concat(self._body_stream)) def __str__(self) -> str: return concat(self._body_stream) diff --git a/src/jinja2/exceptions.py b/src/jinja2/exceptions.py index 082ebe8f2..bc9566dd4 100644 --- a/src/jinja2/exceptions.py +++ b/src/jinja2/exceptions.py @@ -150,6 +150,12 @@ class TemplateRuntimeError(TemplateError): """ +class TemplateConfigurationError(TemplateError): + """An error to show that the environment was not configured correctly, + e.g. different escape functions were used during extends + """ + + class UndefinedError(TemplateRuntimeError): """Raised if a template tries to operate on :class:`Undefined`.""" diff --git a/src/jinja2/ext.py b/src/jinja2/ext.py index d5550540c..025822a28 100644 --- a/src/jinja2/ext.py +++ b/src/jinja2/ext.py @@ -3,8 +3,6 @@ import re import typing as t -from markupsafe import Markup - from . import defaults from . 
import nodes from .environment import Environment @@ -172,7 +170,8 @@ def _make_new_gettext(func: t.Callable[[str], str]) -> t.Callable[..., str]: def gettext(__context: Context, __string: str, **variables: t.Any) -> str: rv = __context.call(func, __string) if __context.eval_ctx.autoescape: - rv = Markup(rv) + rv = t.cast(str, rv) + rv = __context.eval_ctx.mark_safe(rv) # Always treat as a format string, even if there are no # variables. This makes translation strings more consistent # and predictable. This requires escaping @@ -193,7 +192,8 @@ def ngettext( variables.setdefault("num", __num) rv = __context.call(func, __singular, __plural, __num) if __context.eval_ctx.autoescape: - rv = Markup(rv) + rv = t.cast(str, rv) + rv = __context.eval_ctx.mark_safe(rv) # Always treat as a format string, see gettext comment above. return rv % variables # type: ignore @@ -209,7 +209,8 @@ def pgettext( rv = __context.call(func, __string_ctx, __string) if __context.eval_ctx.autoescape: - rv = Markup(rv) + rv = t.cast(str, rv) + rv = __context.eval_ctx.mark_safe(rv) # Always treat as a format string, see gettext comment above. return rv % variables # type: ignore @@ -234,7 +235,8 @@ def npgettext( rv = __context.call(func, __string_ctx, __singular, __plural, __num) if __context.eval_ctx.autoescape: - rv = Markup(rv) + rv = t.cast(str, rv) + rv = __context.eval_ctx.mark_safe(rv) # Always treat as a format string, see gettext comment above. return rv % variables # type: ignore diff --git a/src/jinja2/filters.py b/src/jinja2/filters.py index 7e0970988..e32103219 100644 --- a/src/jinja2/filters.py +++ b/src/jinja2/filters.py @@ -8,9 +8,7 @@ from itertools import chain from itertools import groupby -from markupsafe import escape -from markupsafe import Markup -from markupsafe import soft_str +import markupsafe from .async_utils import async_variant from .async_utils import auto_aiter @@ -35,7 +33,13 @@ class HasHTML(te.Protocol): def __html__(self) -> str: - pass + ... 
+ + def __add__(self, other: t.Union["HasHTML", str]) -> "HasHTML": + ... + + def splitlines(self) -> t.Iterable["HasHTML"]: + ... F = t.TypeVar("F", bound=t.Callable[..., t.Any]) @@ -43,6 +47,19 @@ def __html__(self) -> str: V = t.TypeVar("V") +@pass_eval_context +def do_escape(eval_ctx: "EvalContext", s: t.Union[str, "HasHTML"]) -> markupsafe.Markup: + """ + Escape a string with the escape function active in the current + eval context + + .. versionadded:: 3.1 + replaced the hard coded HTML :func:`markupsafe.escape` function + with a context-aware escape function + """ + return eval_ctx.get_escape_function()(s) + + def ignore_case(value: V) -> V: """For use as a postprocessor for :func:`make_attrgetter`. Converts strings to lowercase and returns other types as-is.""" @@ -133,12 +150,19 @@ def _prepare_attribute_parts( return [attr] -def do_forceescape(value: "t.Union[str, HasHTML]") -> Markup: - """Enforce HTML escaping. This will probably double escape variables.""" +@pass_eval_context +def do_forceescape( + eval_ctx: "EvalContext", value: "t.Union[str, HasHTML]" +) -> markupsafe.Markup: + """ + Enforce HTML escaping. This will probably double escape variables. + + .. versionchanged:: 3.1 + made function context aware to use context based escape filter + """ if hasattr(value, "__html__"): value = t.cast("HasHTML", value).__html__() - - return escape(str(value)) + return do_escape(eval_ctx, str(value)) def do_urlencode( @@ -174,7 +198,11 @@ def do_urlencode( @pass_eval_context def do_replace( - eval_ctx: "EvalContext", s: str, old: str, new: str, count: t.Optional[int] = None + eval_ctx: "EvalContext", + s: t.Union[str, "HasHTML"], + old: str, + new: str, + count: t.Optional[int] = None, ) -> str: """Return a copy of the value with all occurrences of a substring replaced with a new one. The first argument is the substring @@ -189,6 +217,9 @@ def do_replace( {{ "aaaaargh"|replace("a", "d'oh, ", 2) }} -> d'oh, d'oh, aaargh + + .. 
versionchanged:: 3.1 + made function context aware to use context based escape filter """ if count is None: count = -1 @@ -201,21 +232,30 @@ def do_replace( or hasattr(new, "__html__") and not hasattr(s, "__html__") ): - s = escape(s) + s = do_escape(eval_ctx, s) else: - s = soft_str(s) + s = markupsafe.soft_str(s) - return s.replace(soft_str(old), soft_str(new), count) + # Special case: the user used the Markup class directly to mark + # something as safe, but a custom escape function is active + if ( + hasattr(s, "__html__") + and s.__class__ != eval_ctx.mark_safe("").__class__ + ): + s = t.cast("HasHTML", s) + s = eval_ctx.mark_safe(s.__html__()) + + return s.replace(markupsafe.soft_str(old), markupsafe.soft_str(new), count) def do_upper(s: str) -> str: """Convert a value to uppercase.""" - return soft_str(s).upper() + return markupsafe.soft_str(s).upper() def do_lower(s: str) -> str: """Convert a value to lowercase.""" - return soft_str(s).lower() + return markupsafe.soft_str(s).lower() def do_items(value: t.Union[t.Mapping[K, V], Undefined]) -> t.Iterator[t.Tuple[K, V]]: @@ -274,6 +314,10 @@ def do_xmlattr( As you can see it automatically prepends a space in front of the item if the filter returned something unless the second parameter is false. """ + # Even though we assume that the user wants HTML escaping here, + # we give them the chance to use a custom function to escape + # even more + escape = eval_ctx.get_escape_function() rv = " ".join( f'{escape(key)}="{escape(value)}"' for key, value in d.items() @@ -284,7 +328,11 @@ def do_xmlattr( rv = " " + rv if eval_ctx.autoescape: - rv = Markup(rv) + # We don't assume that a user using this function wants to use + # a custom escape function. But for the sake of completeness we + # use the custom mark safe function from the eval context here as + # well + rv = eval_ctx.mark_safe(rv) return rv @@ -293,7 +341,7 @@ def do_capitalize(s: str) -> str: """Capitalize a value. 
The first character will be uppercase, all others lowercase. """ - return soft_str(s).capitalize() + return markupsafe.soft_str(s).capitalize() _word_beginning_split_re = re.compile(r"([-\s({\[<]+)") @@ -306,7 +354,7 @@ def do_title(s: str) -> str: return "".join( [ item[0].upper() + item[1:].lower() - for item in _word_beginning_split_re.split(soft_str(s)) + for item in _word_beginning_split_re.split(markupsafe.soft_str(s)) if item ] ) @@ -560,6 +608,9 @@ def sync_do_join( {{ users|join(', ', attribute='username') }} + .. versionchanged:: 3.1 + made function context aware to use context based escape filter + .. versionadded:: 2.6 The `attribute` parameter was added. """ @@ -574,23 +625,23 @@ def sync_do_join( # if any of the items has. If yes we do a coercion to Markup if not hasattr(d, "__html__"): value = list(value) - do_escape = False + perform_escape = False for idx, item in enumerate(value): if hasattr(item, "__html__"): - do_escape = True + perform_escape = True else: value[idx] = str(item) - if do_escape: - d = escape(d) + if perform_escape: + d = do_escape(eval_ctx, d) else: d = str(d) return d.join(value) # no html involved, to normal joining - return soft_str(d).join(map(soft_str, value)) + return markupsafe.soft_str(d).join(map(markupsafe.soft_str, value)) @async_variant(sync_do_join) # type: ignore @@ -605,7 +656,7 @@ async def do_join( def do_center(value: str, width: int = 80) -> str: """Centers the value in a field of a given width.""" - return soft_str(value).center(width) + return markupsafe.soft_str(value).center(width) @pass_environment @@ -734,6 +785,9 @@ def do_urlize( ``env.policies["urlize.extra_schemes"]``, which defaults to no extra schemes. + .. versionchanged:: 3.1 + made function context aware to use context based escape filter + .. versionchanged:: 3.0 The ``extra_schemes`` parameter was added. 
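Note on the ``sync_do_join`` hunk above: when the delimiter has no ``__html__`` method, it is escaped only if at least one item carries ``__html__``; otherwise everything is joined as plain strings. The following is a minimal, stdlib-only sketch of that one branch. ``Safe`` and ``join_items`` are hypothetical names invented for illustration, and ``html.escape`` merely stands in for whatever escape function the eval context would supply; none of this is Jinja or MarkupSafe code.

```python
from html import escape as html_escape
from typing import Any, Iterable


class Safe(str):
    """Hypothetical stand-in for a Markup-like class (not MarkupSafe)."""

    def __html__(self) -> "Safe":
        # The presence of this method marks the string as already safe.
        return self

    @classmethod
    def escape(cls, value: Any) -> "Safe":
        # Already-safe values pass through; everything else is escaped once.
        if hasattr(value, "__html__"):
            return cls(value.__html__())
        return cls(html_escape(str(value)))

    def join(self, items: Iterable[Any]) -> "Safe":
        # Escape each item that is not already safe, then join.
        return self.__class__(str.join(self, (self.escape(i) for i in items)))


def join_items(items: Iterable[Any], delimiter: str = "") -> str:
    """Mirror the branch where the delimiter itself is a plain string."""
    items = list(items)
    perform_escape = False
    for idx, item in enumerate(items):
        if hasattr(item, "__html__"):
            perform_escape = True
        else:
            items[idx] = str(item)
    # Escape the delimiter only if some item was marked safe.
    joined_by = Safe.escape(delimiter) if perform_escape else str(delimiter)
    return joined_by.join(items)


plain = join_items(["a<b", "c"], " & ")
mixed = join_items([Safe("<b>x</b>"), "a<b"], " & ")
```

With only plain strings nothing is escaped; as soon as one safe item is present, the delimiter and the remaining plain items are escaped while the safe item is left untouched.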
@@ -773,16 +827,22 @@ def do_urlize( rel=rel, target=target, extra_schemes=extra_schemes, + do_escape=eval_ctx.get_escape_function(), ) if eval_ctx.autoescape: - rv = Markup(rv) + rv = eval_ctx.mark_safe(rv) return rv +@pass_eval_context def do_indent( - s: str, width: t.Union[int, str] = 4, first: bool = False, blank: bool = False + eval_ctx: "EvalContext", + s: t.Union[str, "HasHTML"], + width: t.Union[int, str] = 4, + first: bool = False, + blank: bool = False, ) -> str: """Return a copy of the string with each line indented by 4 spaces. The first line and blank lines are not indented by default. @@ -791,6 +851,9 @@ def do_indent( :param first: Don't skip indenting the first line. :param blank: Don't skip indenting empty lines. + .. versionchanged:: 3.1 + made function context aware to use context based escape filter + .. versionchanged:: 3.0 ``width`` can be a string. @@ -806,12 +869,17 @@ def do_indent( newline = "\n" - if isinstance(s, Markup): - indention = Markup(indention) - newline = Markup(newline) + if hasattr(s, "__html__"): + indention = eval_ctx.mark_safe(indention) + newline = eval_ctx.mark_safe(newline) + # Make sure the correct Markup class is used + s = t.cast("HasHTML", s) + s = eval_ctx.mark_safe(s.__html__()) s += newline # this quirk is necessary for splitlines method + # Markup is a subclass of string, so let's just assume it is a string + s = t.cast(str, s) if blank: rv = (newline + indention).join(s.splitlines()) else: @@ -938,7 +1006,7 @@ def do_wordwrap( def do_wordcount(s: str) -> int: """Count the words in that string.""" - return len(_word_re.findall(soft_str(s))) + return len(_word_re.findall(markupsafe.soft_str(s))) def do_int(value: t.Any, default: int = 0, base: int = 10) -> int: @@ -999,20 +1067,24 @@ def do_format(value: str, *args: t.Any, **kwargs: t.Any) -> str: "can't handle positional and keyword arguments at the same time" ) - return soft_str(value) % (kwargs or args) + return markupsafe.soft_str(value) % (kwargs or args) 
 def do_trim(value: str, chars: t.Optional[str] = None) -> str: """Strip leading and trailing characters, by default whitespace.""" - return soft_str(value).strip(chars) + return markupsafe.soft_str(value).strip(chars) -def do_striptags(value: "t.Union[str, HasHTML]") -> str: +@pass_eval_context +def do_striptags(eval_ctx: "EvalContext", value: "t.Union[str, HasHTML]") -> str: """Strip SGML/XML tags and replace adjacent whitespace by one space.""" if hasattr(value, "__html__"): value = t.cast("HasHTML", value).__html__() - return Markup(str(value)).striptags() + # Even though we never expect a custom non-HTML escape + # function to be used here, we use the eval_ctx mark_safe + # function for consistency + return eval_ctx.mark_safe(str(value)).striptags() def sync_do_slice( @@ -1329,11 +1401,12 @@ async def do_list(value: "t.Union[t.AsyncIterable[V], t.Iterable[V]]") -> "t.Lis return await auto_to_list(value) -def do_mark_safe(value: str) -> Markup: +@pass_eval_context +def do_mark_safe(eval_ctx: "EvalContext", value: str) -> markupsafe.Markup: """Mark the value as safe which means that in an environment with automatic escaping enabled this variable will not be escaped. """ - return Markup(value) + return eval_ctx.mark_safe(value) def do_mark_unsafe(value: str) -> str: @@ -1662,7 +1735,7 @@ async def do_rejectattr( @pass_eval_context def do_tojson( eval_ctx: "EvalContext", value: t.Any, indent: t.Optional[int] = None -) -> Markup: +) -> markupsafe.Markup: """Serialize an object to a string of JSON, and mark it safe to render in HTML. This filter is only for use in HTML documents. 
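Note on the ``do_mark_safe`` and ``do_striptags`` hunks above: they now go through ``eval_ctx.mark_safe`` instead of ``markupsafe.Markup``, because the class that marks a value as safe also decides how later, untrusted operands are escaped. The sketch below illustrates that idea with the standard library only. ``make_safe_class``, ``SafeText`` and ``tex_escape`` are hypothetical names, and the class is a deliberately tiny stand-in rather than the ``Markup`` subclass this patch builds.

```python
from typing import Any, Callable


def make_safe_class(custom_escape: Callable[[Any], str]) -> type:
    """Build a str subclass whose instances count as already escaped."""

    class SafeText(str):
        def __html__(self) -> "SafeText":
            # Marker method: its presence means "do not escape again".
            return self

        @classmethod
        def escape(cls, value: Any) -> "SafeText":
            # Safe values are re-wrapped, never escaped a second time.
            if hasattr(value, "__html__"):
                return cls(value.__html__())
            return cls(custom_escape(value))

        def __add__(self, other: Any) -> "SafeText":
            # Untrusted operands are escaped with the class's own function.
            return self.__class__(str.__add__(self, self.escape(other)))

    return SafeText


def tex_escape(value: Any) -> str:
    # Hypothetical escape function: handles only "$" and "%".
    return str(value).replace("$", r"\$").replace("%", r"\%")


TexSafe = make_safe_class(tex_escape)
trusted = TexSafe(r"\textbf{Total:} ")
combined = trusted + "100% of $5"   # plain operand gets TeX-escaped
again = trusted + combined          # safe operand is not escaped twice
```

Had ``trusted`` been created with an HTML-oriented class instead, the plain operand would have been HTML-escaped, which is the mismatch the context-based ``mark_safe`` is meant to avoid.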
@@ -1685,7 +1758,7 @@ def do_tojson( kwargs = kwargs.copy() kwargs["indent"] = indent - return htmlsafe_json_dumps(value, dumps=dumps, **kwargs) + return htmlsafe_json_dumps(value, eval_ctx.mark_safe, dumps=dumps, **kwargs) def prepare_map( @@ -1792,8 +1865,8 @@ async def async_select_or_reject( "d": do_default, "default": do_default, "dictsort": do_dictsort, - "e": escape, - "escape": escape, + "e": do_escape, + "escape": do_escape, "filesizeformat": do_filesizeformat, "first": do_first, "float": do_float, @@ -1823,7 +1896,7 @@ async def async_select_or_reject( "selectattr": do_selectattr, "slice": do_slice, "sort": do_sort, - "string": soft_str, + "string": markupsafe.soft_str, "striptags": do_striptags, "sum": do_sum, "title": do_title, diff --git a/src/jinja2/nodes.py b/src/jinja2/nodes.py index b2f88d9d9..9927fc2ce 100644 --- a/src/jinja2/nodes.py +++ b/src/jinja2/nodes.py @@ -69,18 +69,53 @@ def __new__(mcs, name, bases, d): # type: ignore class EvalContext: """Holds evaluation time information. Custom attributes can be attached to it in extensions. + + .. versionchanged:: 3.1 + + - Added ``get_escape_function`` and ``mark_safe`` functions + - allow autoescape to be not only boolean but also an + escape function """ def __init__( self, environment: "Environment", template_name: t.Optional[str] = None ) -> None: self.environment = environment + if callable(environment.autoescape): - self.autoescape = environment.autoescape(template_name) + self.autoescape = bool(environment.autoescape(template_name)) else: - self.autoescape = environment.autoescape + self.autoescape = bool(environment.autoescape) self.volatile = False + # We need to keep Markup Class if existing as autoescape can be + # overwritten by {% autoescape %} environment. + self._markup_class: t.Type["Markup"] = self.environment.get_markup_class( + template_name + ) + + def get_escape_function(self) -> t.Callable[[t.Any], "Markup"]: + """ + return the currently valid escape function + + .. 
versionadded:: 3.1 + + """ + return self._markup_class.escape + + def mark_safe(self, input: str) -> "Markup": + """ + Mark a string as safe by creating a Markup class + + use this function instead of direct calls to Markup + if possible so custom escape functions + are correctly handled by the Markup class. + + .. versionadded:: 3.1 + + """ + return self._markup_class(input) + def save(self) -> t.Mapping[str, t.Any]: return self.__dict__.copy() @@ -614,7 +649,7 @@ def as_const(self, eval_ctx: t.Optional[EvalContext] = None) -> str: if eval_ctx.volatile: raise Impossible() if eval_ctx.autoescape: - return Markup(self.data) + return eval_ctx.mark_safe(self.data) return self.data @@ -1086,7 +1121,7 @@ class MarkSafe(Expr): def as_const(self, eval_ctx: t.Optional[EvalContext] = None) -> Markup: eval_ctx = get_eval_context(self, eval_ctx) - return Markup(self.expr.as_const(eval_ctx)) + return eval_ctx.mark_safe(self.expr.as_const(eval_ctx)) class MarkSafeIfAutoescape(Expr): @@ -1107,7 +1142,7 @@ def as_const( raise Impossible() expr = self.expr.as_const(eval_ctx) if eval_ctx.autoescape: - return Markup(expr) + return eval_ctx.mark_safe(expr) return expr diff --git a/src/jinja2/runtime.py b/src/jinja2/runtime.py index 985842b28..89b5b3430 100644 --- a/src/jinja2/runtime.py +++ b/src/jinja2/runtime.py @@ -5,9 +5,7 @@ from collections import abc from itertools import chain -from markupsafe import escape # noqa: F401 -from markupsafe import Markup -from markupsafe import soft_str +import markupsafe from .async_utils import auto_aiter from .async_utils import auto_await # noqa: F401 @@ -17,6 +15,7 @@ from .nodes import EvalContext from .utils import _PassArg from .utils import concat +from .utils import EscapeFunc from .utils import internalcode from .utils import missing from .utils import Namespace # noqa: F401 @@ -41,15 +40,17 @@ def __call__( ... 
+html_escape = markupsafe.escape + # these variables are exported to the template runtime exported = [ "LoopContext", "TemplateReference", "Macro", - "Markup", "TemplateRuntimeError", "missing", - "escape", + "concat", + "html_escape", "markup_join", "str_join", "identity", @@ -72,19 +73,33 @@ def identity(x: V) -> V: return x -def markup_join(seq: t.Iterable[t.Any]) -> str: - """Concatenation that escapes if necessary and converts to string.""" +def markup_join( + seq: t.Iterable[t.Any], mark_safe: EscapeFunc = markupsafe.Markup +) -> t.Union[str, markupsafe.Markup]: + """ + Concatenation that escapes if necessary and converts to string. + + .. versionchanged:: 3.1 + added optional parameter ``mark_safe`` to make + use of the context based markup class + """ buf = [] - iterator = map(soft_str, seq) + iterator = map(markupsafe.soft_str, seq) for arg in iterator: buf.append(arg) if hasattr(arg, "__html__"): - return Markup("").join(chain(buf, iterator)) + return mark_safe("").join(chain(buf, iterator)) return concat(buf) -def str_join(seq: t.Iterable[t.Any]) -> str: - """Simple args to string conversion and concatenation.""" +def str_join(seq: t.Iterable[t.Any], mark_safe: EscapeFunc = markupsafe.Markup) -> str: + """ + Simple args to string conversion and concatenation. + + .. 
versionchanged:: 3.1 + added optional and currently ignored parameter + ``mark_safe`` to allow easier usage of ``markup_join`` + """ return concat(map(str, seq)) @@ -367,7 +382,7 @@ async def _async_call(self) -> str: ) if self._context.eval_ctx.autoescape: - return Markup(rv) + return self._context.eval_ctx.mark_safe(rv) return rv @@ -379,8 +394,7 @@ def __call__(self) -> str: rv = concat(self._stack[self._depth](self._context)) if self._context.eval_ctx.autoescape: - return Markup(rv) - + rv = self._context.eval_ctx.mark_safe(rv) return rv @@ -667,6 +681,7 @@ def __init__( default_autoescape: t.Optional[bool] = None, ): self._environment = environment + self._mark_safe: EscapeFunc = environment.get_markup_class() self._func = func self._argument_count = len(arguments) self.name = name @@ -703,8 +718,14 @@ def __call__(self, *args: t.Any, **kwargs: t.Any) -> str: # argument to callables otherwise anyway. Worst case here is # that if no eval context is passed we fall back to the compile # time autoescape flag. 
+ if args and isinstance(args[0], EvalContext): autoescape = args[0].autoescape + # If the eval context is available we use it to determine + # the correct mark safe method + # otherwise mark safe is already set in the __init__ + # function from environmental context + self._mark_safe = args[0].mark_safe args = args[1:] else: autoescape = self._default_autoescape @@ -766,7 +787,7 @@ async def _async_invoke(self, arguments: t.List[t.Any], autoescape: bool) -> str rv = await self._func(*arguments) # type: ignore if autoescape: - return Markup(rv) + return self._mark_safe(rv) return rv # type: ignore @@ -777,8 +798,7 @@ def _invoke(self, arguments: t.List[t.Any], autoescape: bool) -> str: rv = self._func(*arguments) if autoescape: - rv = Markup(rv) - + rv = self._mark_safe(rv) return rv def __repr__(self) -> str: diff --git a/src/jinja2/utils.py b/src/jinja2/utils.py index 9b5f5a50e..7240b7c43 100644 --- a/src/jinja2/utils.py +++ b/src/jinja2/utils.py @@ -5,10 +5,14 @@ import typing as t from collections import abc from collections import deque +from functools import lru_cache from random import choice from random import randrange from threading import Lock from types import CodeType +from typing import Any +from typing import Callable +from typing import Type from urllib.parse import quote_from_bytes import markupsafe @@ -17,6 +21,8 @@ import typing_extensions as te F = t.TypeVar("F", bound=t.Callable[..., t.Any]) +# Typing definition of the Escape function +EscapeFunc = t.Callable[[t.Any], markupsafe.Markup] # special singleton representing missing values for the runtime missing: t.Any = type("MissingType", (), {"__repr__": lambda x: "missing"})() @@ -224,6 +230,7 @@ def urlize( rel: t.Optional[str] = None, target: t.Optional[str] = None, extra_schemes: t.Optional[t.Iterable[str]] = None, + do_escape: t.Callable[[Any], markupsafe.Markup] = markupsafe.escape, ) -> str: """Convert URLs in text into clickable links. 
@@ -245,6 +252,9 @@ def urlize( :param extra_schemes: Recognize URLs that start with these schemes in addition to the default behavior. + .. versionchanged:: 3.1 + The ``do_escape`` parameter was added. + .. versionchanged:: 3.0 The ``extra_schemes`` parameter was added. @@ -269,9 +279,9 @@ def trim_url(x: str) -> str: def trim_url(x: str) -> str: return x - words = re.split(r"(\s+)", str(markupsafe.escape(text))) - rel_attr = f' rel="{markupsafe.escape(rel)}"' if rel else "" - target_attr = f' target="{markupsafe.escape(target)}"' if target else "" + words = re.split(r"(\s+)", str(do_escape(text))) + rel_attr = f' rel="{do_escape(rel)}"' if rel else "" + target_attr = f' target="{do_escape(target)}"' if target else "" for i, word in enumerate(words): head, middle, tail = "", word, "" @@ -340,9 +350,18 @@ def trim_url(x: str) -> str: def generate_lorem_ipsum( - n: int = 5, html: bool = True, min: int = 20, max: int = 100 -) -> str: - """Generate some lorem ipsum for the template.""" + n: int = 5, + html: bool = True, + min: int = 20, + max: int = 100, + mark_safe: t.Callable[[t.Any], markupsafe.Markup] = markupsafe.Markup, + do_escape: t.Callable[[t.Any], markupsafe.Markup] = markupsafe.escape, +) -> t.Union[markupsafe.Markup, str]: + """Generate some lorem ipsum for the template. + + .. versionchanged:: 3.1 + added mark_safe and do_escape parameter + """ from .constants import LOREM_IPSUM_WORDS words = LOREM_IPSUM_WORDS.split() @@ -389,9 +408,7 @@ def generate_lorem_ipsum( if not html: return "\n\n".join(result) - return markupsafe.Markup( - "\n".join(f"

<p>{markupsafe.escape(x)}</p>" for x in result) - ) + return mark_safe("\n".join(f"<p>{do_escape(x)}</p>" for x in result)) def url_quote(obj: t.Any, charset: str = "utf-8", for_qs: bool = False) -> str: @@ -570,9 +587,10 @@ def __reversed__(self) -> t.Iterator[t.Any]: def select_autoescape( enabled_extensions: t.Collection[str] = ("html", "htm", "xml"), disabled_extensions: t.Collection[str] = (), + special_extensions: t.Optional[t.Dict[str, EscapeFunc]] = None, default_for_string: bool = True, default: bool = False, -) -> t.Callable[[t.Optional[str]], bool]: +) -> t.Callable[[t.Optional[str]], t.Union[bool, EscapeFunc]]: """Intelligently sets the initial value of autoescaping based on the filename of the template. This is the recommended way to configure autoescaping if you do not want to write a custom function yourself. @@ -603,17 +621,54 @@ def select_autoescape( If nothing matches then the initial value of autoescaping is set to the value of `default`. + The `special_extensions` parameter is a dictionary whose keys are the + extensions to be considered and whose values are the escape functions + used to escape files of that kind. + + E.g. if you use the `latex package `_, + you can create an environment that escapes all LaTeX files + with the correct escaper but still handles HTML files correctly:: + + from jinja2 import Environment, select_autoescape + from latex import escape as latex_escape + env = Environment(autoescape=select_autoescape( + special_extensions={'tex': latex_escape} + )) + For security reasons this function operates case insensitive. .. versionadded:: 2.9 + created function + .. 
versionchanged:: 3.1 + parameter ``special_extensions`` was added """ - enabled_patterns = tuple(f".{x.lstrip('.').lower()}" for x in enabled_extensions) - disabled_patterns = tuple(f".{x.lstrip('.').lower()}" for x in disabled_extensions) - def autoescape(template_name: t.Optional[str]) -> bool: + def extension_str(x: str) -> str: + """return a lower case extension always starting with point""" + return f".{x.lstrip('.').lower()}" + + enabled_patterns = tuple(extension_str(x) for x in enabled_extensions) + disabled_patterns = tuple(extension_str(x) for x in disabled_extensions) + + if special_extensions is None: + special_extensions = {} + special_extensions = { + extension_str(key): func for key, func in special_extensions.items() + } + + def autoescape(template_name: t.Optional[str]) -> t.Union[bool, EscapeFunc]: if template_name is None: return default_for_string template_name = template_name.lower() + # Lookup autoescape function using the longest matching suffix + + for key, func in sorted( + special_extensions.items(), # type: ignore + key=lambda x: len(x[0]), + reverse=True, + ): + if template_name.endswith(key): + return t.cast(EscapeFunc, func) if template_name.endswith(enabled_patterns): return True if template_name.endswith(disabled_patterns): @@ -624,11 +679,14 @@ def autoescape(template_name: t.Optional[str]) -> bool: def htmlsafe_json_dumps( - obj: t.Any, dumps: t.Optional[t.Callable[..., str]] = None, **kwargs: t.Any + obj: t.Any, + mark_safe: t.Callable[[t.Any], markupsafe.Markup], + dumps: t.Optional[t.Callable[..., str]] = None, + **kwargs: t.Any, ) -> markupsafe.Markup: """Serialize an object to a string of JSON with :func:`json.dumps`, then replace HTML-unsafe characters with Unicode escapes and mark - the result safe with :class:`~markupsafe.Markup`. + the result safe with given mark_safe function. This is available in templates as the ``|tojson`` filter. @@ -640,21 +698,26 @@ def htmlsafe_json_dumps( filter. 
:param obj: The object to serialize to JSON. + :param mark_safe: Class/Function that marks a string as safe :param dumps: The ``dumps`` function to use. Defaults to ``env.policies["json.dumps_function"]``, which defaults to :func:`json.dumps`. :param kwargs: Extra arguments to pass to ``dumps``. Merged onto ``env.policies["json.dumps_kwargs"]``. + .. versionchanged:: 3.1 + Added required mark_safe parameter + .. versionchanged:: 3.0 The ``dumper`` parameter is renamed to ``dumps``. + .. versionadded:: 2.9 """ if dumps is None: dumps = json.dumps - return markupsafe.Markup( + return mark_safe( dumps(obj, **kwargs) .replace("<", "\\u003c") .replace(">", "\\u003e") @@ -663,8 +726,64 @@ def htmlsafe_json_dumps( ) +# We are comparing the resulting markup classes so have to make sure +# that the same custom_escape function returns always the very same +# Markup class. This is only possible using the cache +@lru_cache(500) +def get_wrapped_escape_class( + custom_escape: Callable[[Any], str] +) -> Type[markupsafe.Markup]: + """ + Use a simple escape function to generate a wrapped Markup class + + This class uses the given ``custom_escape`` function to escape + the value and at the same time makes sure that no already escaped + string is escaped again. + + The returned class is a subclass of :class:`markupsafe.Markup`, + so it represents a complete ``str`` subclass that is marked as + safe + + :param custom_escape: The function that escapes the objects to a str + + :return: a Markup class using this escape function + + .. 
versionadded:: 3.1 + """ + + class MarkupWrapper(markupsafe.Markup): + """ + Make sure that the custom escape function is used + """ + + @classmethod + def get_unwrapped_escape(cls) -> t.Callable[[Any], str]: + # Needed for tests + return custom_escape + + @classmethod + def escape(cls, s: Any) -> markupsafe.Markup: + """ + Make sure the custom escape function does not escape + already escaped strings. + Also make sure the escaped string is marked as escaped + with the correct class. + + If the object has an ``__html__`` method, it is called, + and the return value is assumed to already be safe for HTML, + or for whatever format is currently being escaped. + The ``__html__`` method shall be understood simply as + "it is a safe string" + """ + if hasattr(s, "__html__"): + return cls(s.__html__()) + return cls(custom_escape(s)) # noqa: B902 + + return MarkupWrapper + + class Cycler: - """Cycle through values by yield them one at a time, then restarting + """Cycle through values by yielding them one at a time, then restarting once the end is reached. Available as ``cycler`` in templates. 
Similar to ``loop.cycle``, but can be used outside loops or across diff --git a/tests/conftest.py b/tests/conftest.py index e225ab907..ccb5ea970 100644 --- a/tests/conftest.py +++ b/tests/conftest.py @@ -47,3 +47,29 @@ def choice_loader(dict_loader, package_loader): def prefix_loader(filesystem_loader, dict_loader): """returns a PrefixLoader""" return loaders.PrefixLoader({"a": filesystem_loader, "b": dict_loader}) + + +@pytest.fixture +def custom_escape_func(): + """Return a simple custom escape function""" + + def dollar_to_eur(s): + return str(s).replace("$", "€") + + return dollar_to_eur + + +@pytest.fixture +def return_custom_autoescape(custom_escape_func): + """Return an autoescape callable that always selects the custom escape function""" + + def do_return_autoescape(suffix): + return custom_escape_func + + return do_return_autoescape + + +@pytest.fixture +def env_custom_autoescape(return_custom_autoescape): + """Return an Environment that uses the custom autoescape callable""" + return Environment(autoescape=return_custom_autoescape) diff --git a/tests/test_api.py b/tests/test_api.py index 4db3b4a96..0face708e 100644 --- a/tests/test_api.py +++ b/tests/test_api.py @@ -70,6 +70,15 @@ def finalize(eval_ctx, value): t = e.from_string("{{ value }}") assert t.render(value="