Fix intersphinx cache handling #12087

Merged
Changes from 74 commits
Commits
81 commits
e50c662
fix intersphinx cache loading in incremental builds
picnixz Oct 4, 2023
ca545a3
fix lint0
picnixz Oct 4, 2023
cfcb4f5
Remove debug print
picnixz Oct 5, 2023
9d436b3
Merge branch 'sphinx-doc:master' into fix/11466-intersphinx-inventory…
picnixz Oct 5, 2023
92243a4
save
picnixz Oct 5, 2023
37250f9
Merge branch 'master' into fix/11466-intersphinx-inventory-consistency
picnixz Oct 9, 2023
4b1ac3f
Merge branch 'fix/11466-intersphinx-inventory-consistency' of github.…
picnixz Feb 3, 2024
4f8bb9e
Merge remote-tracking branch 'upstream/master' into fix/11466-intersp…
picnixz Feb 3, 2024
56f5533
update implementation and comments
picnixz Feb 3, 2024
5ef7919
Merge branch 'master' into fix/11466-intersphinx-inventory-consistency
picnixz Feb 3, 2024
8bbcd83
Merge branch 'master' into fix/11466-intersphinx-inventory-consistency
picnixz Feb 12, 2024
333886a
Merge branch 'master' into fix/11466-intersphinx-inventory-consistency
picnixz Feb 13, 2024
ac22d65
update logic and refactor
picnixz Feb 13, 2024
822aa88
remove CHANGELOG entry until 8.x
picnixz Feb 14, 2024
6d60665
implement intersphinx new format
picnixz Feb 14, 2024
1e2b875
Merge branch 'master' into fix/11466-intersphinx-inventory-consistency
picnixz Feb 14, 2024
760e4fc
cleanup
picnixz Feb 14, 2024
eea6a9d
cleanup 3.9
picnixz Feb 14, 2024
9cdd373
remove typing_extensions dependency
picnixz Feb 14, 2024
ae80634
cleanup comment
picnixz Feb 14, 2024
14836d4
Add typing
picnixz Mar 11, 2024
bb09f20
apply formatter
picnixz Mar 11, 2024
3466db0
deprecate intersphinx alpha format
picnixz Mar 11, 2024
7277f71
deprecate intersphinx alpha format
picnixz Mar 11, 2024
a6fdcec
upgrade tests
picnixz Mar 11, 2024
479a3fe
Update doc
picnixz Mar 11, 2024
a9dd038
Merge branch 'sphinx-doc:master' into core/deprecate-intersphinx-1-0
picnixz Mar 14, 2024
5e7c383
Merge branch 'sphinx-doc:master' into core/deprecate-intersphinx-1-0
picnixz Mar 14, 2024
f906314
fixup
picnixz Mar 14, 2024
ca22f10
Merge branch 'master' into core/deprecate-intersphinx-1-0
picnixz Mar 14, 2024
bdfb3e8
Merge branch 'sphinx-doc:master' into core/deprecate-intersphinx-1-0
picnixz Mar 14, 2024
d4b4227
Merge branch 'sphinx-doc:master' into fix/11466-intersphinx-inventory…
picnixz Mar 14, 2024
8c7894b
Update doc
picnixz Mar 11, 2024
128c324
fixup
picnixz Mar 14, 2024
1821d17
fixup
picnixz Mar 14, 2024
d95d046
Merge branch 'master' into fix/11466-intersphinx-inventory-consistency
picnixz Mar 17, 2024
f824dc2
deprecate intersphinx alpha format
picnixz Mar 11, 2024
55effd8
upgrade tests
picnixz Mar 11, 2024
472eeab
Update doc
picnixz Mar 11, 2024
f98b47a
Merge branch 'master' into core/deprecate-intersphinx-1-0
picnixz Apr 27, 2024
7d3749d
Merge branch 'master' into core/deprecate-intersphinx-1-0
picnixz May 1, 2024
50c3988
Merge branch 'master' into core/deprecate-intersphinx-1-0
picnixz May 21, 2024
0dd0262
Merge remote-tracking branch 'upstream/master' into core/deprecate-in…
picnixz Jul 20, 2024
902016d
Merge remote-tracking branch 'upstream/master' into fix/11466-intersp…
picnixz Jul 20, 2024
7655ba5
Merge remote-tracking branch 'upstream/master' into fix/11466-intersp…
picnixz Jul 20, 2024
47c3383
remove debug
picnixz Jul 20, 2024
ddee17e
Merge branch 'core/deprecate-intersphinx-1-0' into fix/11466-intersph…
picnixz Jul 20, 2024
ad86c96
fixups
picnixz Jul 20, 2024
ba53d83
Merge branch 'master' into core/deprecate-intersphinx-1-0
AA-Turner Jul 21, 2024
3912820
raise a ConfigError
AA-Turner Jul 21, 2024
a28674a
Update test_ext_inheritance_diagram.py
picnixz Jul 21, 2024
864713b
Update sphinx/ext/intersphinx/_load.py
picnixz Jul 21, 2024
cfe69ba
Update sphinx/ext/intersphinx/_load.py
picnixz Jul 21, 2024
bea91dc
Update tests/test_extensions/test_ext_inheritance_diagram.py
picnixz Jul 21, 2024
8ada5cd
Better error messages
AA-Turner Jul 21, 2024
9dd36d1
Formatting
AA-Turner Jul 22, 2024
c33db5c
Explicit copy
AA-Turner Jul 22, 2024
35144c8
update ENV_VERSION
picnixz Jul 22, 2024
487099c
update ENV_VERSION
picnixz Jul 22, 2024
398ad43
Merge branch 'core/deprecate-intersphinx-1-0' into fix/11466-intersph…
picnixz Jul 22, 2024
7b4b3c5
Merge remote-tracking branch 'upstream/master' into fix/11466-intersp…
picnixz Jul 22, 2024
b0b25de
fixup
picnixz Jul 22, 2024
3b1126d
remove duplicated test
picnixz Jul 22, 2024
e1e6a11
update tests
picnixz Jul 22, 2024
91f1aa7
Merge remote-tracking branch 'upstream/master' into fix/11466-intersp…
picnixz Jul 22, 2024
5b9693e
update typing
picnixz Jul 22, 2024
bd423ff
fix some error messages
picnixz Jul 22, 2024
9d1f331
address Jay's review
picnixz Jul 22, 2024
1a1bee6
Merge remote-tracking branch 'upstream/master' into fix/11466-intersp…
picnixz Jul 22, 2024
ea91173
.
picnixz Jul 22, 2024
fe0c63a
Merge branch 'master' into fix/11466-intersphinx-inventory-consistency
picnixz Jul 22, 2024
f5c075b
revert style changes
picnixz Jul 22, 2024
bfe45f5
revert style changes
picnixz Jul 22, 2024
a90be51
address review
picnixz Jul 22, 2024
af2e689
Merge branch 'master' into fix/11466-intersphinx-inventory-consistency
AA-Turner Jul 22, 2024
124f1c1
Updates
AA-Turner Jul 22, 2024
7f9b7b0
Update sphinx/ext/intersphinx/_load.py
AA-Turner Jul 22, 2024
d77eaad
fix CI/CD
picnixz Jul 22, 2024
2ed2497
split tests
picnixz Jul 22, 2024
124e6b3
Merge branch 'master' into fix/11466-intersphinx-inventory-consistency
AA-Turner Jul 22, 2024
fab0e89
Update tests
AA-Turner Jul 22, 2024
117 changes: 62 additions & 55 deletions sphinx/ext/intersphinx/_load.py
@@ -6,6 +6,7 @@
import functools
import posixpath
import time
from operator import itemgetter
from os import path
from typing import TYPE_CHECKING
from urllib.parse import urlsplit, urlunsplit
@@ -138,36 +139,38 @@ def load_mappings(app: Sphinx) -> None:
intersphinx_cache: dict[InventoryURI, InventoryCacheEntry] = inventories.cache
intersphinx_mapping: IntersphinxMapping = app.config.intersphinx_mapping

expected_uris = {uri for _name, (uri, _invs) in intersphinx_mapping.values()}

# If the current cache contains some (project, uri) pair
# say ("foo", "foo.com") and if the new intersphinx dict
# contains the pair ("foo", "bar.com"), we need to remove
# the ("foo", "foo.com") entry and use ("foo", "bar.com").
for uri in frozenset(intersphinx_cache):
if intersphinx_cache[uri][0] not in intersphinx_mapping or uri not in expected_uris:
# remove a cached inventory if it is no longer used by intersphinx
del intersphinx_cache[uri]

with concurrent.futures.ThreadPoolExecutor() as pool:
futures = []
for name, (uri, invs) in intersphinx_mapping.values():
futures.append(pool.submit(
fetch_inventory_group, name, uri, invs, intersphinx_cache, app, now,
))
futures = [
pool.submit(fetch_inventory_group, name, uri, invs, intersphinx_cache, app, now)
for name, (uri, invs) in app.config.intersphinx_mapping.values()
]
updated = [f.result() for f in concurrent.futures.as_completed(futures)]

if any(updated):
# clear the local inventories
inventories.clear()

# Duplicate values in different inventories will shadow each
# other; which one will override which can vary between builds
# since they are specified using an unordered dict. To make
# it more consistent, we sort the named inventories and then
# add the unnamed inventories last. This means that the
# unnamed inventories will shadow the named ones but the named
# ones can still be accessed when the name is specified.
named_vals = []
unnamed_vals = []
for name, _expiry, invdata in intersphinx_cache.values():
if name:
named_vals.append((name, invdata))
else:
unnamed_vals.append((name, invdata))
for name, invdata in sorted(named_vals) + unnamed_vals:
if name:
inventories.named_inventory[name] = invdata
for type, objects in invdata.items():
inventories.main_inventory.setdefault(type, {}).update(objects)
# other and which one will override which varies between builds.
#
# We can however order the cache by (NAME, EXPIRY) for reproducibility.
by_name_and_time = itemgetter(0, 1) # 0: name, 1: expiry
cache_values = sorted(intersphinx_cache.values(), key=by_name_and_time)
for name, _expiry, invdata in cache_values:
inventories.named_inventory[name] = invdata
for objtype, objects in invdata.items():
inventories.main_inventory.setdefault(objtype, {}).update(objects)
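
Reviewer's sketch (not part of the patch): the pruning and merge steps in the hunk above can be tried in isolation. This assumes plain dictionaries in place of Sphinx's environment objects. `prune_cache`, `merge_inventories` and the example URLs are hypothetical names introduced only for illustration. The tuple layouts mirror the ones used in `load_mappings`.

```python
from operator import itemgetter

# Assumed shapes, mirroring the hunk above:
#   cache:   URI  -> (name, expiry, inventory data)
#   mapping: name -> (name, (URI, locations))


def prune_cache(cache, mapping):
    """Drop cached inventories whose project or URI is no longer configured."""
    expected_uris = {uri for _name, (uri, _invs) in mapping.values()}
    # Iterate over a snapshot of the keys because entries are deleted in the loop.
    for uri in frozenset(cache):
        if cache[uri][0] not in mapping or uri not in expected_uris:
            del cache[uri]


def merge_inventories(cache):
    """Merge cached inventories in (name, expiry) order for reproducibility."""
    named, main = {}, {}
    for name, _expiry, invdata in sorted(cache.values(), key=itemgetter(0, 1)):
        named[name] = invdata
        for objtype, objects in invdata.items():
            main.setdefault(objtype, {}).update(objects)
    return named, main


cache = {
    'https://foo.example/old': ('foo', 100, {'py:module': {'bar': 'old'}}),
    'https://baz.example': ('baz', 50, {'py:module': {'bar': 'baz', 'qux': 'baz'}}),
}
mapping = {
    'foo': ('foo', ('https://foo.example/new', (None,))),
    'baz': ('baz', ('https://baz.example', (None,))),
}

prune_cache(cache, mapping)  # the stale 'old' entry for 'foo' is removed
cache['https://foo.example/new'] = ('foo', 200, {'py:module': {'bar': 'new'}})
named, main = merge_inventories(cache)
```

In this sketch 'baz' sorts before 'foo', so the 'foo' inventory is merged last and its `bar` entry shadows the one from 'baz' on every build, while each project stays reachable through `named`.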


def fetch_inventory_group(
@@ -179,39 +182,43 @@ def fetch_inventory_group(
now: int,
) -> bool:
cache_time = now - app.config.intersphinx_cache_limit * 86400

updated = False
failures = []
try:
for inv in invs:
if not inv:
inv = posixpath.join(uri, INVENTORY_FILENAME)
# decide whether the inventory must be read: always read local
# files; remote ones only if the cache time is expired
if '://' not in inv or uri not in cache or cache[uri][1] < cache_time:
safe_inv_url = _get_safe_url(inv)
inv_descriptor = name or 'main_inventory'
LOGGER.info(__("loading intersphinx inventory '%s' from %s..."),
inv_descriptor, safe_inv_url)
try:
invdata = fetch_inventory(app, uri, inv)
except Exception as err:
failures.append(err.args)
continue
if invdata:
cache[uri] = name, now, invdata
return True
return False
finally:
if not failures:
pass
elif len(failures) < len(invs):
LOGGER.info(__('encountered some issues with some of the inventories,'
' but they had working alternatives:'))
for fail in failures:
LOGGER.info(*fail)
else:
issues = '\n'.join(f[0] % f[1:] for f in failures)
LOGGER.warning(__('failed to reach any of the inventories '
'with the following issues:') + '\n' + issues)

for location in invs:
inv: str = location or posixpath.join(uri, INVENTORY_FILENAME)
# decide whether the inventory must be read: always read local
# files; remote ones only if the cache time is expired
if '://' not in inv or uri not in cache or cache[uri][1] < cache_time:
safe_inv_url = _get_safe_url(inv)
inv_descriptor = name or 'main_inventory'
LOGGER.info(__("loading intersphinx inventory '%s' from %s..."),
inv_descriptor, safe_inv_url)

try:
invdata = fetch_inventory(app, uri, inv)
except Exception as err:
failures.append(err.args)
continue

if invdata:
cache[uri] = name, now, invdata
updated = True
break

if not failures:
pass
elif len(failures) < len(invs):
LOGGER.info(__("encountered some issues with some of the inventories,"
" but they had working alternatives:"))
for fail in failures:
LOGGER.info(*fail)
else:
issues = '\n'.join(f[0] % f[1:] for f in failures)
LOGGER.warning(__("failed to reach any of the inventories "
"with the following issues:") + "\n" + issues)
return updated
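
Reviewer's sketch (not part of the patch): the freshness check and the fallback over several locations can be separated from logging and from Sphinx itself. `must_fetch` and `first_working_inventory` are hypothetical helpers. The `fetch` callable and the example locations are likewise assumptions made for the illustration.

```python
def must_fetch(inv, uri, cache, now, cache_limit_days):
    """Return True if the inventory at *inv* should be (re)loaded.

    Local files are always read; remote ones only when the URI is not
    cached yet or the cached entry is older than the cache limit.
    """
    cache_time = now - cache_limit_days * 86400
    is_local = '://' not in inv
    return is_local or uri not in cache or cache[uri][1] < cache_time


def first_working_inventory(locations, fetch):
    """Try each location in order; return (data, failures)."""
    failures = []
    for location in locations:
        try:
            data = fetch(location)
        except Exception as err:
            failures.append(str(err))
            continue
        if data:
            return data, failures
    return None, failures


def fake_fetch(location):
    # Stand-in for a real download: locations starting with 'bad' fail.
    if location.startswith('bad'):
        raise OSError(f'cannot reach {location}')
    return {'py:module': {}}


now = 1_000_000
cache = {'https://a.example': ('a', now - 10 * 86400, {})}
data, failures = first_working_inventory(
    ['bad/objects.inv', 'good/objects.inv'], fake_fetch,
)
```

As in the patched function, a failure is recorded and the loop moves on, so the caller can tell "some alternatives failed" apart from "nothing could be reached".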


def fetch_inventory(app: Sphinx, uri: InventoryURI, inv: str) -> Inventory:
210 changes: 209 additions & 1 deletion tests/test_extensions/test_ext_intersphinx.py
@@ -3,6 +3,10 @@
from __future__ import annotations

import http.server
import posixpath
import re
import zlib
from io import BytesIO
from typing import TYPE_CHECKING
from unittest import mock

@@ -31,7 +35,125 @@
from tests.utils import http_server

if TYPE_CHECKING:
from typing import NoReturn
from collections.abc import Iterable
from typing import BinaryIO, NoReturn

from sphinx.util.typing import InventoryItem


class InventoryEntry:
__slots__ = (
'name', 'display_name', 'domain_name', 'object_type', 'uri', 'anchor', 'priority',
)

def __init__(
self,
name: str = 'this',
*,
display_name: str | None = None,
domain_name: str = 'py',
object_type: str = 'obj',
uri: str = 'index.html',
anchor: str = '',
priority: int = 0,
):
if anchor.endswith(name):
anchor = anchor[:-len(name)] + '$'

if anchor:
uri += '#' + anchor

if display_name is None or display_name == name:
display_name = '-'

self.name = name
self.display_name = display_name
self.domain_name = domain_name
self.object_type = object_type
self.uri = uri
self.anchor = anchor
self.priority = priority

def format(self) -> str:
return (f'{self.name} {self.domain_name}:{self.object_type} '
f'{self.priority} {self.uri} {self.display_name}\n')

def __repr__(self) -> str:
fields = (f'{attr}={getattr(self, attr)!r}' for attr in self.__slots__)
return f"{self.__class__.__name__}({', '.join(fields)})"


class IntersphinxProject:
def __init__(
self,
name: str = 'foo',
version: str | int = 1,
baseurl: str = '',
baseuri: str = '',
file: str | None = None,
) -> None:
#: The project name.
self.name = name
#: The escaped project name.
self.safe_name = re.sub(r'\s+', ' ', name)

#: The project version as a string.
self.version = version = str(version)
#: The escaped project version.
self.safe_version = re.sub(r'\s+', ' ', version)

#: The project base URL (e.g., http://localhost:7777).
self.baseurl = baseurl
#: The project base URI, relative to *baseurl* (e.g., 'foo').
self.uri = baseuri
#: The project URL, as specified in :confval:`intersphinx_mapping`.
self.url = posixpath.join(baseurl, baseuri)
#: The project local file, if any.
self.file = file

@property
def record(self) -> dict[str, tuple[str | None, str | None]]:
"""The :confval:`intersphinx_mapping` record for this project."""
return {self.name: (self.url, self.file)}

def normalize(self, entry: InventoryEntry) -> tuple[str, InventoryItem]:
url = posixpath.join(self.url, entry.uri)
return entry.name, (self.safe_name, self.safe_version, url, entry.display_name)


class FakeInventory:
protocol_version: int

def __init__(self, project: IntersphinxProject | None = None) -> None:
self.project = project or IntersphinxProject()

def serialize(self, entries: Iterable[InventoryEntry] | None = None) -> bytes:
buffer = BytesIO()
self._write_headers(buffer)
entries = entries or [InventoryEntry()]
self._write_body(buffer, (item.format().encode() for item in entries))
return buffer.getvalue()

def _write_headers(self, buffer: BinaryIO) -> None:
buffer.write((f'# Sphinx inventory version {self.protocol_version}\n'
f'# Project: {self.project.safe_name}\n'
f'# Version: {self.project.safe_version}\n').encode())

def _write_body(self, buffer: BinaryIO, lines: Iterable[bytes]) -> None:
raise NotImplementedError


class FakeInventoryV2(FakeInventory):
protocol_version = 2

def _write_headers(self, buffer: BinaryIO) -> None:
super()._write_headers(buffer)
buffer.write(b'# The remainder of this file is compressed using zlib.\n')

def _write_body(self, buffer: BinaryIO, lines: Iterable[bytes]) -> None:
compressor = zlib.compressobj(9)
buffer.writelines(map(compressor.compress, lines))
buffer.write(compressor.flush())


class FakeList(list):
@@ -502,6 +624,92 @@ def test_load_mappings_fallback(tmp_path, app, status, warning):
assert isinstance(rn, nodes.reference)


@pytest.mark.sphinx('dummy', testroot='basic')
def test_load_mappings_cache_update(make_app, app_params):
ITEM_NAME = 'bar'
DOMAIN_NAME = 'py'
OBJECT_TYPE = 'module'
REFTYPE = f'{DOMAIN_NAME}:{OBJECT_TYPE}'
PORT = 7777  # needed since otherwise the port is assigned automatically

PROJECT_NAME, PROJECT_BASEURL = 'foo', f'http://localhost:{PORT}'
old_project = IntersphinxProject(PROJECT_NAME, 1337, PROJECT_BASEURL, 'old')
assert old_project.name == PROJECT_NAME
assert old_project.url == f'http://localhost:{PORT}/old'

new_project = IntersphinxProject(PROJECT_NAME, 1701, PROJECT_BASEURL, 'new')
assert new_project.name == PROJECT_NAME
assert new_project.url == f'http://localhost:{PORT}/new'

def make_entry(project: IntersphinxProject) -> InventoryEntry:
name = f'{ITEM_NAME}_{project.version}'
return InventoryEntry(name, domain_name=DOMAIN_NAME, object_type=OBJECT_TYPE)

def make_invdata(project: IntersphinxProject) -> bytes:
return FakeInventoryV2(project).serialize([make_entry(project)])

class InventoryHandler(http.server.BaseHTTPRequestHandler):
def do_GET(self):
self.send_response(200, 'OK')

data = b''
for project in (old_project, new_project):
if self.path.startswith(f'/{project.uri}/'):
data = make_invdata(project)
break

self.send_header('Content-Length', str(len(data)))
self.end_headers()
self.wfile.write(data)

def log_message(*args, **kwargs):
pass

# clean build
args, kwargs = app_params
_ = make_app(*args, freshenv=True, **kwargs)
_.build()

baseconfig = {'extensions': ['sphinx.ext.intersphinx']}

with http_server(InventoryHandler, port=PORT):
confoverrides1 = baseconfig | {'intersphinx_mapping': old_project.record}
app1 = make_app(*args, confoverrides=confoverrides1, **kwargs)
app1.build()

# the inventory when querying the 'old' URL
old_entry = make_entry(old_project)
old_item = dict([old_project.normalize(old_entry)])

assert list(app1.env.intersphinx_cache) == [old_project.url]
assert app1.env.intersphinx_cache[old_project.url][0] == old_project.name
assert app1.env.intersphinx_cache[old_project.url][2] == {REFTYPE: old_item}
assert app1.env.intersphinx_named_inventory == {old_project.name: {REFTYPE: old_item}}

# switch to the new URL and assert that the old URL is no longer stored
confoverrides2 = baseconfig | {'intersphinx_mapping': new_project.record}
app2 = make_app(*args, confoverrides=confoverrides2, **kwargs)
app2.build()

new_entry = make_entry(new_project)
new_item = dict([new_project.normalize(new_entry)])

assert list(app2.env.intersphinx_cache) == [new_project.url]
assert app2.env.intersphinx_cache[new_project.url][0] == PROJECT_NAME
assert app2.env.intersphinx_cache[new_project.url][2] == {REFTYPE: new_item}
assert app2.env.intersphinx_named_inventory == {PROJECT_NAME: {REFTYPE: new_item}}

# switch back to old url (re-use 'old_item')
confoverrides3 = baseconfig | {'intersphinx_mapping': old_project.record}
app3 = make_app(*args, confoverrides=confoverrides3, **kwargs)
app3.build()

assert list(app3.env.intersphinx_cache) == [old_project.url]
assert app3.env.intersphinx_cache[old_project.url][0] == PROJECT_NAME
assert app3.env.intersphinx_cache[old_project.url][2] == {REFTYPE: old_item}
assert app3.env.intersphinx_named_inventory == {PROJECT_NAME: {REFTYPE: old_item}}


class TestStripBasicAuth:
"""Tests for sphinx.ext.intersphinx._strip_basic_auth()"""
