This PR contains the following updates:
lxml ==4.2.1 -> ==4.9.1
By merging this PR, the issue #5 will be automatically resolved and closed:
Release Notes
lxml/lxml (lxml)
v4.9.1
Compare Source
==================
Bugs fixed
`iterwalk()` (or `canonicalize()`) after parsing certain incorrect input. Note that
`iterwalk()` can crash on valid input parsed with the same parser after failing to parse the
incorrect input.
v4.9.0
Compare Source
==================
Bugs fixed
`lxml.html` was corrected.
Patch by xmo-odoo.
Other changes
Built with Cython 0.29.30 to adapt to changes in Python 3.11 and 3.12.
Wheels include zlib 1.2.12, libxml2 2.9.14 and libxslt 1.1.35
(libxml2 2.9.12+ and libxslt 1.1.34 on Windows).
GH#343: Windows-AArch64 build support in Visual Studio.
Patch by Steve Dower.
v4.8.0
Compare Source
==================
Features added
GH#337: Path-like objects are now supported throughout the API instead of just strings.
Patch by Henning Janssen.
The `ElementMaker` now supports `QName` values as tags, which always override
the default namespace of the factory.
Bugs fixed
lower case, whereas XML Schema datatypes define them as "NaN" and "INF" respectively.
Patch by Tobias Deiminger.
Other changes
v4.7.1
Compare Source
==================
Features added
`parser.feed()` now encodes the input data to the native UTF-8 encoding directly,
instead of going through `Py_UNICODE` / `wchar_t` encoding first, which previously
required duplicate recoding in most cases.
Bugs fixed
The standard namespace prefixes were mishandled during "C14N2" serialisation on Python 3.
See https://mail.python.org/archives/list/lxml@python.org/thread/6ZFBHFOVHOS5GFDOAMPCT6HM5HZPWQ4Q/
`lxml.objectify` previously accepted non-XML numbers with underscores (like "1_000")
as integers or float values in Python 3.6 and later. It now adheres to the number
format of the XML spec again.
LP#1939031: Static wheels of lxml now contain the header files of zlib and libiconv
(in addition to the already provided headers of libxml2/libxslt/libexslt).
Other changes
v4.6.5
Compare Source
==================
Bugs fixed
A vulnerability (GHSL-2021-1038) in the HTML cleaner allowed sneaking script
content through SVG images (CVE-2021-43818).
A vulnerability (GHSL-2021-1037) in the HTML cleaner allowed sneaking script
content through CSS imports and other crafted constructs (CVE-2021-43818).
v4.6.4
Compare Source
==================
Features added
GH#317: A new property `system_url` was added to DTD entities.
Patch by Thirdegree.
GH#314: The `STATIC_*` variables in `setup.py` can now be passed via env vars.
Patch by Isaac Jurado.
v4.6.3
Compare Source
==================
Bugs fixed
which allowed JavaScript to pass through. The cleaner now removes the HTML5
`formaction` attribute.
v4.6.2
Compare Source
==================
Bugs fixed
which allowed JavaScript to pass through. The cleaner now removes more sneaky
"style" content.
v4.6.1
Compare Source
==================
Bugs fixed
JavaScript to pass through. The cleaner now removes more sneaky "style" content.
v4.6.0
Compare Source
==================
Features added
GH#310: `lxml.html.InputGetter` supports `__len__()` to count the number of input fields.
Patch by Aidan Woolley.
`lxml.html.InputGetter` has a new `.items()` method to ease processing all input fields.
`lxml.html.InputGetter.keys()` now returns the field names in document order.
GH-309: The API documentation is now generated using `sphinx-apidoc`.
Patch by Chris Mayo.
Bugs fixed
LP#1869455: C14N 2.0 serialisation failed for unprefixed attributes
when a default namespace was defined.
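For reviewers unfamiliar with the affected call, here is a minimal sketch of C14N 2.0 serialisation for a document that combines a default namespace with unprefixed attributes. It assumes lxml 4.6.0 or later; the `ImportError` fallback to the standard library and the `urn:example` namespace are assumptions for illustration only.

```python
try:
    from lxml import etree  # assumed: lxml 4.6.0 or later
except ImportError:
    # Assumption for portability only: the stdlib offers the same function.
    import xml.etree.ElementTree as etree

# A default namespace plus two unprefixed attributes in non-sorted order.
xml_data = '<root xmlns="urn:example" b="2" a="1"/>'

# canonicalize() returns the canonical form as a string when no output
# file is given; attributes are written in sorted order.
canonical = etree.canonicalize(xml_data)
print(canonical)
```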
`TreeBuilder.close()` raised `AssertionError` in some error cases where it
should have raised `XMLSyntaxError`. It now raises a combined exception to
keep up backwards compatibility, while switching to `XMLSyntaxError` as an
interface.
v4.5.2
Compare Source
==================
Bugs fixed
`Cleaner()` now validates that only known configuration options can be set.
LP#1882606: `Cleaner.clean_html()` discarded comments and PIs regardless of the
corresponding configuration option, if `remove_unknown_tags` was set.
LP#1880251: Instead of globally overwriting the document loader in libxml2, lxml now
sets it per parser run, which improves the interoperability with other users of libxml2
such as libxmlsec.
LP#1881960: Fix build in CPython 3.10 by using Cython 0.29.21.
The setup options "--with-xml2-config" and "--with-xslt-config" were accidentally renamed
to "--xml2-config" and "--xslt-config" in 4.5.1 and are now available again.
v4.5.1
Compare Source
==================
Bugs fixed
LP#1570388: Fix failures when serialising documents larger than 2GB in some cases.
LP#1865141, GH#298: `QName` values were not accepted by the `el.iter()` method.
Patch by xmo-odoo.
LP#1863413, GH#297: The build failed to detect libraries on Linux that are only
configured via pkg-config.
Patch by Hugh McMaster.
v4.5.0
Compare Source
==================
Features added
`indent()` was added to insert tail whitespace for pretty-printing
an XML tree.
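A short sketch of what `indent()` does to a tree, assuming lxml 4.5.0 or later. The `ImportError` fallback (which needs Python 3.9+) is an assumption added so the snippet runs without lxml; the element names are made up.

```python
try:
    from lxml import etree  # assumed: lxml 4.5.0 or later
except ImportError:
    # Assumption for portability only: stdlib indent() exists in Python 3.9+.
    import xml.etree.ElementTree as etree

root = etree.fromstring("<root><a><b/></a></root>")

# indent() modifies the tree in place by setting whitespace-only text and
# tail values; it does not return a new tree.
etree.indent(root, space="  ")

# Inspect the whitespace that was inserted around each element.
whitespace = {
    "root.text": root.text,
    "a.text": root[0].text,
    "b.tail": root[0][0].tail,
    "a.tail": root[0].tail,
}
print(whitespace)
```

Serialising the tree afterwards with `etree.tostring(root)` then yields one element per line.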
Bugs fixed
deletion disappeared silently instead of sticking with the node that was removed.
Other changes
MacOS builds are 64-bit-only by default.
Set CFLAGS and LDFLAGS explicitly to override it.
Linux/MacOS Binary wheels now use libxml2 2.9.10 and libxslt 1.1.34.
LP#1840234: The package version number is now available as
`lxml.__version__`.
v4.4.3
Compare Source
==================
Bugs fixed
`itertext()` was missing tail text of comments and PIs since 4.4.0.
v4.4.2
Compare Source
==================
Bugs fixed
`ElementInclude` incorrectly rejected repeated non-recursive
includes as recursive.
Patch by Rainer Hausdorf.
v4.4.1
Compare Source
==================
Bugs fixed
LP#1838252: The order of an OrderedDict was lost in 4.4.0 when passing it as
attrib mapping during element creation.
LP#1838521: The package metadata now lists the supported Python versions.
v4.4.0
Compare Source
==================
Features added
`Element.clear()` accepts a new keyword argument `keep_tail=True` to clear
everything but the tail text. This is helpful in some document-style use cases
and for clearing the current element in `iterparse()` and pull parsing.
When creating attributes or namespaces from a dict in Python 3.6+, lxml now
preserves the original insertion order of that dict, instead of always sorting
the items by name. A similar change was made for ElementTree in CPython 3.8.
See https://bugs.python.org/issue34160
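The ordering change is easy to observe. The sketch below assumes lxml 4.4.0 or later (or, through the `ImportError` fallback added for portability, ElementTree on CPython 3.8+); the attribute names are made up.

```python
try:
    from lxml import etree  # assumed: lxml 4.4.0 or later
except ImportError:
    # Assumption for portability only: ElementTree in CPython 3.8+ behaves alike.
    import xml.etree.ElementTree as etree

# The attributes are deliberately given in non-alphabetical order.
element = etree.Element("item", {"zeta": "1", "alpha": "2"})

# The attribute names come back in insertion order, not sorted by name.
attribute_order = list(element.keys())
print(attribute_order)
```

Code or tests that compare serialised output byte for byte against files written by 4.2.1 may see attribute order change after this update.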
Integer elements in `lxml.objectify` implement the `__index__()` special method.
GH#269: Read-only elements in XSLT were missing the `nsmap` property.
Original patch by Jan Pazdziora.
ElementInclude can now restrict the maximum inclusion depth via a `max_depth`
argument to prevent content explosion. It is limited to 6 by default.
The `target` object of the XMLParser can have `start_ns()` and `end_ns()`
callback methods to listen to namespace declarations.
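A minimal parser target that records namespace declarations could look like the sketch below. The class name `NamespaceRecorder` is hypothetical, lxml 4.4.0 or later is assumed, and the `ImportError` fallback (Python 3.8+) is an assumption added so the snippet runs without lxml.

```python
try:
    from lxml import etree  # assumed: lxml 4.4.0 or later
except ImportError:
    # Assumption for portability only: stdlib targets support the same callbacks.
    import xml.etree.ElementTree as etree


class NamespaceRecorder:
    """Hypothetical parser target that collects namespace events."""

    def __init__(self):
        self.declared = []  # (prefix, uri) pairs, in declaration order
        self.ended = []     # prefixes whose scope has ended

    def start_ns(self, prefix, uri):
        self.declared.append((prefix, uri))

    def end_ns(self, prefix):
        self.ended.append(prefix)

    def start(self, tag, attrib):
        pass  # element events are not needed for this sketch

    def end(self, tag):
        pass

    def close(self):
        # The return value of close() becomes the result of parser.close().
        return self.declared


parser = etree.XMLParser(target=NamespaceRecorder())
parser.feed('<root xmlns:a="urn:a"><a:child/></root>')
declared = parser.close()
print(declared)
```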
The `TreeBuilder` has new arguments `comment_factory` and `pi_factory` to
pass factories for creating comments and processing instructions, as well as
flag arguments `insert_comments` and `insert_pis` to discard them from the
tree when set to false.
A C14N 2.0 (https://www.w3.org/TR/xml-c14n2/) implementation was added as
`etree.canonicalize()`, a corresponding `C14NWriterTarget` class, and
a `c14n2` serialisation method.
Bugs fixed
When writing to file paths that contain the URL escape character '%', the file
path could wrongly be mangled by URL unescaping and thus write to a different
file or directory. Code that writes to file paths that are provided by untrusted
sources, but that must work with previous versions of lxml, should best either
reject paths that contain '%' characters, or otherwise make sure that the path
does not contain maliciously injected '%XX' URL hex escapes for paths like '../'.
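For code that still has to run against older lxml versions, the first option above (rejecting paths that contain '%') might be sketched as follows. The helper name `ensure_no_percent` is hypothetical and not part of lxml; it only checks the path and leaves the actual writing to the caller.

```python
import os


def ensure_no_percent(path):
    """Return the path unchanged, or raise if it contains a '%' character.

    Hypothetical guard for untrusted output paths, not an lxml function.
    """
    text = os.fspath(path)
    if isinstance(text, bytes):
        text = os.fsdecode(text)
    if "%" in text:
        raise ValueError(f"refusing to write to path containing '%': {text!r}")
    return path


# A harmless path passes through unchanged.
safe = ensure_no_percent("out/report.xml")

# A path carrying URL hex escapes is rejected before any file is written.
try:
    ensure_no_percent("out/%2e%2e/secret.xml")
    rejected = False
except ValueError:
    rejected = True

print(safe, rejected)
```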
Assigning to Element child slices with negative step could insert the slice at
the wrong position, starting too far on the left.
Assigning to Element child slices with overly large step size could take very
long, regardless of the length of the actual slice.
Assigning to Element child slices of the wrong size could sometimes fail to
raise a ValueError (like a list assignment would) and instead assign outside
of the original slice bounds or leave parts of it unreplaced.
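The list-like behaviour for wrongly sized slices can be checked with a small sketch. It assumes lxml 4.4.0 or later; the `ImportError` fallback to the standard library is an assumption added for portability, and the element names are made up.

```python
try:
    from lxml import etree  # assumed: lxml 4.4.0 or later
except ImportError:
    # Assumption for portability only: stdlib elements behave the same here.
    import xml.etree.ElementTree as etree

# Reference behaviour: a plain list rejects a wrongly sized extended slice.
items = ["a", "b", "c", "d"]
try:
    items[::2] = ["x"]  # the slice selects 2 items, but only 1 is supplied
    list_error = None
except ValueError as exc:
    list_error = str(exc)

# The same assignment on element children is expected to fail the same way.
root = etree.fromstring("<root><a/><b/><c/><d/></root>")
try:
    root[::2] = [etree.Element("x")]
    element_error = None
except ValueError as exc:
    element_error = str(exc)

print(list_error)
print(element_error)
```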
The `comment` and `pi` events in `iterwalk()` were never triggered, and
instead, comments and processing instructions in the tree were reported as
`start` elements. Also, when walking an ElementTree (as opposed to its root
element), comments and PIs outside of the root element are now reported.
LP#1827833: The RelaxNG compact syntax support was broken with recent versions
of `rnc2rng`.
LP#1758553: The HTML elements `source` and `track` were added to the list
of empty tags in `lxml.html.defs`.
Registering a prefix other than "xml" for the XML namespace is now rejected.
Failing to write XSLT output to a file could raise a misleading exception.
It now raises `IOError`.
Other changes
Support for Python 3.4 was removed.
When using `Element.find*()` with prefix-namespace mappings, the empty string
is now accepted to define a default namespace, in addition to the previously
supported `None` prefix. Empty strings are more convenient since they keep
all prefix keys in a namespace dict strings, which simplifies sorting etc.
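A small sketch of the empty-string prefix, assuming lxml 4.4.0 or later. The `ImportError` fallback (Python 3.8+) is an assumption added for portability, and the `urn:example` namespace is made up.

```python
try:
    from lxml import etree  # assumed: lxml 4.4.0 or later
except ImportError:
    # Assumption for portability only: stdlib find() accepts "" in Python 3.8+.
    import xml.etree.ElementTree as etree

root = etree.fromstring('<root xmlns="urn:example"><child>hi</child></root>')

# The empty string maps unprefixed names in the path to the default namespace.
namespaces = {"": "urn:example"}
child = root.find("child", namespaces=namespaces)
child_text = None if child is None else child.text

# Without the mapping, the unqualified name does not match the namespaced tag.
unqualified = root.find("child")

print(child_text, unqualified)
```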
The `ElementTree.write_c14n()` method has been deprecated in favour of the
long preferred `ElementTree.write(f, method="c14n")`. It will be removed
in a future release.
v4.3.5
Compare Source
==================
v4.3.4
Compare Source
==================
v4.3.3
Compare Source
==================
Bugs fixed
`_XSLTResultTree.write_output()`.
v4.3.2
Compare Source
==================
Bugs fixed
Other changes
v4.3.0
Compare Source
==================
Features added
The module `lxml.sax` is compiled using Cython in order to speed it up.
GH#267: `lxml.sax.ElementTreeProducer` now preserves the namespace prefixes.
If two prefixes point to the same URI, the first prefix in alphabetical order
is used. Patch by Lennart Regebro.
Updated ISO-Schematron implementation to 2013 version (now MIT licensed)
and the corresponding schema to the 2016 version (with optional "properties").
Other changes
GH#270, GH#271: Support for Python 2.6 and 3.3 was removed.
Patch by hugovk.
The minimum dependency versions were raised to libxml2 2.9.2 and libxslt 1.1.27,
which were released in 2014 and 2012 respectively.
Built with Cython 0.29.2.
v4.2.6
Compare Source
==================
Bugs fixed
LP#1799755: Fix a DeprecationWarning in Py3.7+.
Import warnings in Python 3.6+ were resolved.
v4.2.5
Compare Source
==================
Bugs fixed
Security problem found by Omar Eissa. (CVE-2018-19787)
v4.2.4
Compare Source
==================
Features added
`pkg-config` for build configuration.
Patch by Patrick Griffis.
Bugs fixed
`Element.insert()`.
Patch by Alexander Weggerle.
v4.2.3
Compare Source
==================
Bugs fixed
v4.2.2
Compare Source
==================
Bugs fixed
GH#266: Fix sporadic crash during GC when parse-time schema validation is used
and the parser participates in a reference cycle.
Original patch by Julien Greard.
GH#265: lxml no longer links against zlib as a shared library, only on static builds.
Patch by Nehal J Wani.