Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HTML search: sanitise 'searchindex.js' contents #13099

Closed
Show file tree
Hide file tree
Changes from 10 commits
Commits
Show all changes
29 commits
Select commit Hold shift + click to select a range
37e92ab
HTML search: sanitise `searchindex.js` contents
jayaddison Nov 3, 2024
58ca700
Add CHANGES.rst entry
jayaddison Nov 3, 2024
1554551
Fixup: code-indentation correction
jayaddison Nov 4, 2024
056c85b
Perf: refactor `sanitiseRecursive`
jayaddison Nov 4, 2024
6a38587
Perf: additional `sanitiseRecursive` refactoring
jayaddison Nov 4, 2024
015ecb9
Prototype (unbenchmarked): remove nested-arrays from `searchindex.js`
jayaddison Nov 4, 2024
a3a1a31
Fixup: add missing counter-increment
jayaddison Nov 5, 2024
db3aaf5
Tests: regenerate HTML search test fixtures
jayaddison Nov 5, 2024
a72b1ff
Fixup: ignore entry-number keys when re-opening search index from file
jayaddison Nov 5, 2024
13581d1
Tests: update expectations
jayaddison Nov 5, 2024
94857f7
Typing: add parameterized typehints to `Counter` variables
jayaddison Nov 5, 2024
ca7490d
Linting: resolve `ruff` PERF102 suggestions
jayaddison Nov 5, 2024
6df348f
Revert "Linting: resolve `ruff` PERF102 suggestions"
jayaddison Nov 5, 2024
dab072f
Revert "Typing: add parameterized typehints to `Counter` variables"
jayaddison Nov 5, 2024
dbafb5f
Revert "Tests: update expectations"
jayaddison Nov 5, 2024
5ab5f1d
Revert "Fixup: ignore entry-number keys when re-opening search index …
jayaddison Nov 5, 2024
7a75647
Revert "Tests: regenerate HTML search test fixtures"
jayaddison Nov 5, 2024
8c2f7e2
Revert "Fixup: add missing counter-increment"
jayaddison Nov 5, 2024
26ec589
Revert "Prototype (unbenchmarked): remove nested-arrays from `searchi…
jayaddison Nov 5, 2024
c0af102
Tests: add test coverage to demonstrate mutating a nested array element
jayaddison Nov 5, 2024
0248ccc
sanitiseRecursive: implement recursive array sanitisation
jayaddison Nov 5, 2024
6542c8b
Refactor: extract common `evaluate` function path
jayaddison Nov 5, 2024
2ada40e
Refactor: avoid possibiliy of repeat `evaluate` construction
jayaddison Nov 5, 2024
e6ecded
Refactor: `sanitiseRecursive` brevity/performance
jayaddison Nov 5, 2024
6b4b675
Refactor: translate from `sanitiseRecursive` to an iterative approach
jayaddison Nov 5, 2024
ab841c6
Performance: avoid `Object.values` call for `Array` instances
jayaddison Nov 6, 2024
0e61a3c
Performance: omit non-`Object` items from the stack
jayaddison Nov 6, 2024
b2de18a
Refactor: rectify variable name
jayaddison Nov 6, 2024
6aa4f7f
Refactor / performance: consolidate non-emptiness check with stack po…
jayaddison Nov 6, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 5 additions & 0 deletions CHANGES.rst
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,11 @@ Deprecated
Features added
--------------

* #13098: HTML Search: the contents of the search index are sanitised
(reassigned to frozen null-prototype JavaScript objects), to reduce
the risk of unintended access or modification.
Patch by James Addison

Bugs fixed
----------

Expand Down
30 changes: 20 additions & 10 deletions sphinx/search/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,7 @@
import os
import pickle
import re
from collections import Counter
from importlib import import_module
from pathlib import Path
from typing import IO, TYPE_CHECKING, Any
Expand Down Expand Up @@ -334,7 +335,7 @@
for docname in self._titles:
self._all_titles[docname] = []
for title, doc_tuples in frozen['alltitles'].items():
for doc, titleid in doc_tuples:
for _, (doc, titleid) in doc_tuples.items():

Check failure on line 338 in sphinx/search/__init__.py

View workflow job for this annotation

GitHub Actions / ruff

Ruff (PERF102)

sphinx/search/__init__.py:338:38: PERF102 When using only the values of a dict use the `values()` method
self._all_titles[index2fn[doc]].append((title, titleid))

def load_terms(mapping: dict[str, Any]) -> dict[str, set[str]]:
Expand All @@ -358,10 +359,11 @@

def get_objects(
self, fn2index: dict[str, int]
) -> dict[str, list[tuple[int, int, int, str, str]]]:
rv: dict[str, list[tuple[int, int, int, str, str]]] = {}
) -> dict[str, dict[int, tuple[int, int, int, str, str]]]:
rv: dict[str, dict[int, tuple[int, int, int, str, str]]] = {}
otypes = self._objtypes
onames = self._objnames
prefix_count = Counter()
for domain in self.env.domains.sorted():
sorted_objects = sorted(domain.get_objects())
for fullname, dispname, type, docname, anchor, prio in sorted_objects:
Expand All @@ -372,7 +374,8 @@
fullname = html.escape(fullname)
dispname = html.escape(dispname)
prefix, _, name = dispname.rpartition('.')
plist = rv.setdefault(prefix, [])
index = prefix_count[prefix]
plist = rv.setdefault(prefix, {})
try:
typeindex = otypes[domain.name, type]
except KeyError:
Expand All @@ -394,7 +397,8 @@
shortanchor = '-'
else:
shortanchor = anchor
plist.append((fn2index[docname], typeindex, prio, shortanchor, name))
plist[index] = (fn2index[docname], typeindex, prio, shortanchor, name)
prefix_count[prefix] += 1
return rv

def get_terms(
Expand Down Expand Up @@ -429,19 +433,25 @@
objtypes = {v: k[0] + ':' + k[1] for (k, v) in self._objtypes.items()}
objnames = self._objnames

alltitles: dict[str, list[tuple[int, str | None]]] = {}
alltitles_count = Counter()
alltitles: dict[str, dict[int, tuple[int, str | None]]] = {}
for docname, titlelist in sorted(self._all_titles.items()):
for title, titleid in titlelist:
alltitles.setdefault(title, []).append((fn2index[docname], titleid))
count = alltitles_count[title]
alltitles.setdefault(title, {})[count] = (fn2index[docname], titleid)
alltitles_count[title] += 1

index_entries: dict[str, list[tuple[int, str, bool]]] = {}
index_entry_count = Counter()
index_entries: dict[str, dict[int, tuple[int, str, bool]]] = {}
for docname, entries in self._index_entries.items():
for entry, entry_id, main_entry in entries:
index_entries.setdefault(entry.lower(), []).append((
count = index_entry_count[entry]
index_entries.setdefault(entry.lower(), {})[count] = (
fn2index[docname],
entry_id,
main_entry == 'main',
))
)
index_entry_count[entry] += 1

return {
'docnames': docnames,
Expand Down
23 changes: 16 additions & 7 deletions sphinx/themes/basic/static/searchtools.js
Original file line number Diff line number Diff line change
Expand Up @@ -219,7 +219,16 @@ const Search = {
(document.body.appendChild(document.createElement("script")).src = url),

setIndex: (index) => {
Search._index = index;
const sanitiseRecursive = function(obj) {
const result = Object.create(null);
for (const [k, v] of Object.entries(obj)) {
if (!(v instanceof Object)) result[k] = v;
else if (Array.isArray(v)) result[k] = Object.freeze(v);
jayaddison marked this conversation as resolved.
Show resolved Hide resolved
else result[k] = sanitiseRecursive(v);
}
return Object.freeze(result);
}
Search._index = sanitiseRecursive(index);
jayaddison marked this conversation as resolved.
Show resolved Hide resolved
if (Search._queued_query !== null) {
const query = Search._queued_query;
Search._queued_query = null;
Expand Down Expand Up @@ -335,7 +344,7 @@ const Search = {
const queryLower = query.toLowerCase().trim();
for (const [title, foundTitles] of Object.entries(allTitles)) {
if (title.toLowerCase().trim().includes(queryLower) && (queryLower.length >= title.length/2)) {
for (const [file, id] of foundTitles) {
for (const [file, id] of Object.values(foundTitles)) {
const score = Math.round(Scorer.title * queryLower.length / title.length);
const boost = titles[file] === title ? 1 : 0; // add a boost for document titles
normalResults.push([
Expand All @@ -354,7 +363,7 @@ const Search = {
// search for explicit entries in index directives
for (const [entry, foundEntries] of Object.entries(indexEntries)) {
if (entry.includes(queryLower) && (queryLower.length >= entry.length/2)) {
for (const [file, id, isMain] of foundEntries) {
for (const [file, id, isMain] of Object.values(foundEntries)) {
const score = Math.round(100 * queryLower.length / entry.length);
const result = [
docNames[file],
Expand Down Expand Up @@ -474,7 +483,7 @@ const Search = {
const descr = objName + _(", in ") + title;

// add custom score for some objects according to scorer
if (Scorer.objPrio.hasOwnProperty(match[2]))
if (match[2] in Scorer.objPrio)
score += Scorer.objPrio[match[2]];
else score += Scorer.objPrioDefault;

Expand All @@ -489,7 +498,7 @@ const Search = {
]);
};
Object.keys(objects).forEach((prefix) =>
objects[prefix].forEach((array) =>
Object.values(objects[prefix]).forEach((array) =>
objectSearchCallback(prefix, array)
)
);
Expand Down Expand Up @@ -520,13 +529,13 @@ const Search = {
// add support for partial matches
if (word.length > 2) {
const escapedWord = _escapeRegExp(word);
if (!terms.hasOwnProperty(word)) {
if (!(word in terms)) {
Object.keys(terms).forEach((term) => {
if (term.match(escapedWord))
arr.push({ files: terms[term], score: Scorer.partialTerm });
});
}
if (!titleTerms.hasOwnProperty(word)) {
if (!(word in titleTerms)) {
Object.keys(titleTerms).forEach((term) => {
if (term.match(escapedWord))
arr.push({ files: titleTerms[term], score: Scorer.partialTitle });
Expand Down
2 changes: 1 addition & 1 deletion tests/js/fixtures/cpp/searchindex.js

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion tests/js/fixtures/multiterm/searchindex.js

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion tests/js/fixtures/partial/searchindex.js

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion tests/js/fixtures/titles/searchindex.js

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

29 changes: 29 additions & 0 deletions tests/js/searchtools.spec.js
Original file line number Diff line number Diff line change
Expand Up @@ -25,6 +25,35 @@ describe('Basic html theme search', function() {
expect(nextExpected).toEqual(undefined);
}

describe('index immutability', function() {

it('should not be possible to modify index contents', function() {
eval(loadFixture("cpp/searchindex.js"));

// record some initial state
const initialTitle = Search._index.titles[0];
const initialTitlesProto = Search._index.titles.__proto__;
const initialTerms = Search._index.terms;
const initialDocNames = [...Search._index.docnames];

// attempt to mutate the index state
try { Search._index.docnames.pop(); } catch {};
try { Search._index.docnames.push('extra'); } catch {};
Search._index.titles[0] += 'modified';
Search._index.titles.__proto__ = 'anotherProto';
Search._index.terms = {'fake': [1, 2, 3]};
Search._index.__proto__ = 'otherProto';

// ensure that none of the modifications were applied
expect(Search._index.__proto__).toBe(undefined);
expect(Search._index.terms).toEqual(initialTerms);
expect(Search._index.titles.__proto__).toEqual(initialTitlesProto);
expect(Search._index.titles[0]).toEqual(initialTitle);
expect(Search._index.docnames).toEqual(initialDocNames);
});

});

describe('terms search', function() {

it('should find "C++" when in index', function() {
Expand Down
Loading
Loading