Release 1.0.0 #53

Merged on Jan 13, 2025 (1 commit)
16 changes: 11 additions & 5 deletions docs/changelog.rst
@@ -1,9 +1,15 @@
Changelog
=========

-v1.0.0 (upcoming)
+v1.0.0 (2025-01-13)
-----------------

+Ultimate Sitemap Parser is now maintained by the `GATE Team <https://gate.ac.uk/>`_ at the School of Computer Science, University of Sheffield. We'd like to thank Linas Valiukas and Hal Roberts for their work on this package, and Paige Gulley for coordinating the transfer of the library.
+
+**Breaking Changes**
+
+* Python v3.8 is now the lowest supported version of Python. Future releases will follow `Python's version support <https://devguide.python.org/versions/>`_.
+
**New Features**

* CLI tool to parse and list sitemaps on the command line (see :doc:`/reference/cli`)
@@ -13,9 +19,9 @@ v1.0.0 (upcoming)
* Added ``all_sitemaps()`` method to iterate over all descendant sitemaps

* Pickling page sitemaps now includes page data, which previously was not included as it was swapped to disk
-* Sitemaps and pages now implement ``to_dict()`` method to convert to dictionaries
+* Sitemaps and pages now implement ``to_dict()`` method to convert to dictionaries (requested in :issue:`18`)
* Added optional arguments to ``usp.tree.sitemap_tree_for_homepage()`` to disable robots.txt-based or known-path-based sitemap discovery. Default behaviour is still to use both.
-* Parse sitemaps from a string with :ref:`local parse`
+* Parse sitemaps from a string with :ref:`local parse` (requested in :issue:`26`)
* Support for the Google Image sitemap extension
* Add proxy support with ``RequestsWebClient.set_proxies()`` (:pr:`20` by :user:`tgrandje`)
* Add additional sitemap discovery paths for news sitemaps (:commit:`d3bdaae56be87c97ce2f3f845087f495f6439b44`)
@@ -34,10 +40,10 @@ Improvement of parse performance by approximately 90%:
* Invalid priorities will be set to the default (0.5) instead of crashing
* Moved ``__version__`` attribute into main class module
* Robots.txt index sitemaps now count for the max recursion depth (reported in :issue:`29`). The default maximum has been increased by 1 to compensate for this.
-* Remove log configuration so it can be specified at application level (:pr:`24` by :user:`dsoprea`)
+* Remove log configuration so it can be specified at application level (reported in :issue:`25`, :pr:`24` by :user:`dsoprea`/:user:`antonialoytorrens-ikaue`)
* Resolve warnings caused by :external+python:class:`http.HTTPStatus` usage (:commit:`3867b6e`)
* Don't add ``InvalidSitemap`` object if ``robots.txt`` is not found (:pr:`39` by :user:`gbenson`)
-* Fix incorrect lowercasing of URLS discovered in robots.txt (:pr:`35`)
+* Fix incorrect lowercasing of URLS discovered in robots.txt (reported in :issue:`40`, :pr:`35` by :user:`ArthurMelin`)
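
The invalid-priority entry in the list above can be illustrated with a small standalone sketch. This is not the library's code: ``parse_priority`` and ``DEFAULT_PRIORITY`` are hypothetical names, and treating values outside the sitemap protocol's 0.0 to 1.0 range as invalid is an assumption made here for illustration.

```python
from decimal import Decimal, InvalidOperation
from typing import Optional

# Default named in the changelog entry; the constant name itself is hypothetical.
DEFAULT_PRIORITY = Decimal("0.5")


def parse_priority(raw: Optional[str]) -> Decimal:
    """Parse a <priority> value, falling back to the default instead of raising."""
    if raw is None:
        return DEFAULT_PRIORITY
    try:
        value = Decimal(raw.strip())
    except InvalidOperation:
        # Not a number at all, e.g. "high" or an empty string.
        return DEFAULT_PRIORITY
    # Assumption: NaN, infinities, and out-of-range values also count as invalid.
    if not value.is_finite() or not (Decimal("0") <= value <= Decimal("1")):
        return DEFAULT_PRIORITY
    return value
```

With this sketch, ``parse_priority("0.8")`` returns ``Decimal("0.8")``, while ``parse_priority("high")`` and ``parse_priority("1.5")`` both return the default.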

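Because the library no longer configures logging itself, the application decides where log records go. The following is a minimal sketch using only the standard ``logging`` module. It assumes the library's loggers live under a ``usp`` namespace (for example ``usp.tree``), which is not confirmed by this diff; ``configure_app_logging`` is a hypothetical helper, and a stand-in logger is used in place of real library calls.

```python
import io
import logging


def configure_app_logging(stream, level=logging.INFO, namespace="usp"):
    """Attach one application-owned handler to the given logger namespace."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(name)s %(levelname)s %(message)s"))
    logger = logging.getLogger(namespace)
    logger.setLevel(level)
    logger.addHandler(handler)
    return handler


# Demonstration with a stand-in for a library logger (assumed name "usp.tree").
buffer = io.StringIO()
configure_app_logging(buffer, level=logging.WARNING)

library_logger = logging.getLogger("usp.tree")
library_logger.info("fetching sitemap")      # below WARNING, so filtered out
library_logger.warning("sitemap not found")  # reaches the application's handler

output = buffer.getvalue()
```

Only the warning is captured, since the child logger inherits the level set on the ``usp`` namespace by the application.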

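The robots.txt lowercasing fix concerns a common pitfall: the ``Sitemap`` directive name is matched case-insensitively, but the URL that follows can be case-sensitive and should be kept as written. A minimal sketch of that idea follows; ``sitemap_urls_from_robots_txt`` is a hypothetical helper and is not taken from the library's source.

```python
from typing import List


def sitemap_urls_from_robots_txt(robots_txt: str) -> List[str]:
    """Collect URLs from ``Sitemap:`` lines without altering the URLs' case."""
    urls = []
    for line in robots_txt.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and whole-line comments
        # Split on the first colon only, so the "https:" in the URL stays intact.
        name, sep, value = line.partition(":")
        # Lowercase only the directive name for comparison, never the value.
        if sep and name.strip().lower() == "sitemap":
            url = value.strip()
            if url:
                urls.append(url)
    return urls
```

Lowercasing the whole line before matching is the kind of shortcut that would corrupt a path such as ``/News/Sitemap.XML``; comparing only the directive name avoids it.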
Prior versions