Commit

update docs
waylan committed Mar 7, 2024
1 parent 9d5d813 commit c4a139f
Showing 2 changed files with 30 additions and 0 deletions.
19 changes: 19 additions & 0 deletions docs/changelog.md
@@ -10,6 +10,25 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## [unreleased]

### Changed

#### Refactor TOC Sanitation

* All postprocessors are run on heading content.
* Footnote references are stripped from heading content. Fixes #660.
* A more robust `striptags` is provided to convert headings to plain text.
Unlike markupsafe's implementation, HTML entities are not unescaped.
* The plain text `name`, rich `html` and unescaped raw `data-toc-label` are
saved to `toc_tokens`, allowing users to access the full rich text content of
the headings directly from `toc_tokens`.
* `data-toc-label` is sanitized separately from heading content.
* An `html.unescape` call is made just prior to calling `slugify` so that
`slugify` only operates on Unicode characters. Note that `html.unescape` is
not run on the `name` or `html`.
* The `get_name` and `stashedHTML2text` functions defined in the `toc` extension
are both **deprecated**. Instead, use some combination of `run_postprocessors`,
`render_inner_html` and `striptags`.
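
The reason for unescaping just before `slugify` can be sketched with the standard library alone. In the example below, `simple_slugify` is a hypothetical stand-in written for illustration; it is not the function shipped with the `toc` extension and may differ from it in detail. It only shows how an escaped entity such as `&amp;` would otherwise leak into the generated ID.

```python
import html
import re
import unicodedata


def simple_slugify(value: str, separator: str = "-") -> str:
    """Hypothetical, simplified slugify used only for illustration."""
    # Decompose accented characters and drop anything that is not ASCII.
    value = unicodedata.normalize("NFKD", value)
    value = value.encode("ascii", "ignore").decode("ascii")
    # Remove characters that are not word characters, whitespace, or hyphens.
    value = re.sub(r"[^\w\s-]", "", value).strip().lower()
    # Collapse runs of whitespace and hyphens into a single separator.
    return re.sub(r"[\s-]+", separator, value)


# Heading text as it might look while HTML entities are still escaped.
name = "Fish &amp; Chips"

# Slugifying the escaped text keeps the letters of the entity name.
slug_without_unescape = simple_slugify(name)

# Unescaping first means the slugifier sees the real "&" character.
slug_with_unescape = simple_slugify(html.unescape(name))

print(slug_without_unescape)  # fish-amp-chips
print(slug_with_unescape)     # fish-chips
```

Note that only the value passed to the slugifier is unescaped here; `name` itself is left untouched, which mirrors the note above that `html.unescape` is not run on `name` or `html`.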

### Fixed

* Include `scripts/*.py` in the generated source tarballs (#1430).
11 changes: 11 additions & 0 deletions docs/extensions/toc.md
@@ -80,6 +80,8 @@ the following object at `md.toc_tokens`:
'level': 1,
'id': 'header-1',
'name': 'Header 1',
'html': 'Header 1',
'data-toc-label': '',
'children': [
{'level': 2, 'id': 'header-2', 'name': 'Header 2', 'children':[]}
]
@@ -91,6 +93,11 @@ Note that the `level` refers to the `hn` level. In other words, `<h1>` is level
`1` and `<h2>` is level `2`, etc. Be aware that improperly nested levels in the
input may result in odd nesting of the output.

`name` is the sanitized value which would also be used as a label for the HTML
version of the Table of Contents. `html` contains the fully rendered HTML
content of the heading and has not been sanitized in any way. This may be used
with your own custom sanitation to create a custom table of contents.
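
As a rough sketch of what such custom sanitation might look like, the example below walks a `toc_tokens`-shaped list and renders a nested list from the `html` value. Everything here other than the token keys is an assumption made for illustration: `ALLOWED_TAGS`, `AllowlistSanitizer`, `sanitize` and `render_toc` are hypothetical names, the sample data is hand-written rather than produced by the extension, and the allowlist filter is not a security-reviewed sanitizer.

```python
from html import escape
from html.parser import HTMLParser

# Assumption: the only inline tags we want to keep in TOC labels.
ALLOWED_TAGS = {"code", "em", "strong"}


class AllowlistSanitizer(HTMLParser):
    """Keep allowlisted tags (without attributes) and all text content."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ALLOWED_TAGS:
            self.parts.append(f"<{tag}>")  # attributes are dropped

    def handle_endtag(self, tag):
        if tag in ALLOWED_TAGS:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        # Text arrives unescaped, so escape it again on the way out.
        self.parts.append(escape(data, quote=False))


def sanitize(fragment):
    parser = AllowlistSanitizer()
    parser.feed(fragment)
    parser.close()
    return "".join(parser.parts)


def render_toc(tokens):
    """Render tokens as nested <ul> lists, preferring `html` over `name`."""
    if not tokens:
        return ""
    items = []
    for token in tokens:
        # Fall back to `name` when a token has no `html` value.
        label = sanitize(token.get("html") or token["name"])
        children = render_toc(token["children"])
        items.append(f'<li><a href="#{token["id"]}">{label}</a>{children}</li>')
    return "<ul>" + "".join(items) + "</ul>"


# Hand-written sample data in the shape documented above.
toc_tokens = [
    {
        "level": 1,
        "id": "header-1",
        "name": "Header 1",
        "html": 'Header <code>1</code> <a href="#x">link</a>',
        "data-toc-label": "",
        "children": [
            {"level": 2, "id": "header-2", "name": "Header 2", "children": []},
        ],
    }
]

toc_html = render_toc(toc_tokens)
print(toc_html)
```

In this sketch the `<code>` element survives while the nested link is reduced to its text, which avoids placing an anchor inside the anchor that the table of contents itself generates.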

### Custom Labels

In most cases, the text label in the Table of Contents should match the text of
@@ -131,6 +138,10 @@ attribute list to provide a cleaner URL when linking to the header. If the ID is
not manually defined, it is always derived from the text of the header, never
from the `data-toc-label` attribute.

The value of the `data-toc-label` attribute is sanitized and stripped of any HTML
tags. However, `toc_tokens` will contain the raw content under
`data-toc-label`.
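
If you would rather apply your own cleanup to the raw value, one possible approach is sketched below. The `toc_label` helper and the sample tokens are hypothetical, and the regular expression is a deliberately naive tag stripper used only to keep the example short; it is not how the extension sanitizes labels.

```python
import re

# Naive pattern that matches anything that looks like an HTML tag.
_TAG_RE = re.compile(r"<[^>]*>")


def toc_label(token):
    """Return a plain-text label, preferring the raw `data-toc-label`."""
    raw = token.get("data-toc-label", "")
    if raw:
        # Strip tags, then normalize any leftover runs of whitespace.
        return " ".join(_TAG_RE.sub("", raw).split())
    # No custom label was set, so fall back to the heading's `name`.
    return token["name"]


# Hand-written tokens in the documented shape.
with_label = {
    "level": 1,
    "id": "intro",
    "name": "Introduction",
    "html": "Introduction",
    "data-toc-label": "Intro to <em>TOC</em>",
    "children": [],
}
without_label = {
    "level": 2,
    "id": "usage",
    "name": "Usage",
    "html": "Usage",
    "data-toc-label": "",
    "children": [],
}

print(toc_label(with_label))     # Intro to TOC
print(toc_label(without_label))  # Usage
```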

Usage
-----

