Commit 07bf207

Order footnotes by reference
* Alter footnote ordering so footnote definitions are listed in the order in which their references appear in the document.
* Add config option, USE_DEFINITION_ORDER, to support previous behavior.
* Add comprehensive tests for extension.
* Change the behavior of inlinepatterns by ensuring that inlinepatterns iterate through elements in document order. Previously, in some specific cases, elements with nested children had their inline content parsed in reverse order.

Resolves #1367.
1 parent 23c301d commit 07bf207

5 files changed: +452 -27 lines changed


docs/changelog.md

Lines changed: 8 additions & 0 deletions
@@ -12,8 +12,16 @@ See the [Contributing Guide](contributing.md) for details.

 ## [Unreleased]

+### Changed
+
+* Footnotes are now ordered by the occurrence of their references in the
+  document. A new configuration option for the footnotes extension,
+  `USE_DEFINITION_ORDER`, has been added to support restoring the previous
+  behavior of ordering footnotes by the occurrence of definitions.
+
 ### Fixed

+* Ensure inline processing iterates through elements in document order.
 * Fix handling of incomplete HTML tags in code spans in Python 3.14.

 ## [3.8.2] - 2025-06-19

docs/extensions/footnotes.md

Lines changed: 34 additions & 18 deletions
@@ -24,26 +24,33 @@ the output.
 Example:

 ```md
-Footnotes[^1] have a label[^@#$%] and the footnote's content.
+Footnotes have a name, a reference[^1], and a definition[^word].

-[^1]: This is a footnote content.
-[^@#$%]: A footnote on the label: "@#$%".
+[^1]: This is a footnote definition.
+[^word]: A footnote with the name "word".
 ```

-A footnote label must start with a caret `^` and may contain any inline text
-(including spaces) between a set of square brackets `[]`. Only the first
-caret has any special meaning.
-
-A footnote content must start with the label followed by a colon and at least
-one space. The label used to define the content must exactly match the label used
-in the body (including capitalization and white space). The content would then
-follow the label either on the same line or on the next line. The content may
-contain multiple lines, paragraphs, code blocks, blockquotes and most any other
-markdown syntax. The additional lines must be indented one level (four spaces or
-one tab).
-
-When working with multiple blocks, it may be helpful to start the content on a
-separate line from the label which defines the content. This way the entire block
+A **footnote name** is a string that uniquely identifies a footnote within the
+document. It may contain any character which is valid for an HTML id attribute
+(including spaces). Examples: `1` in `[^1]`, `word` in `[^word]`,
+and `@#$%` in `[^@#$%]`.
+
+A **footnote reference** is a link within the text body to a footnote definition.
+A footnote reference contains the footnote name prefixed by a caret `^` and enclosed
+in square brackets `[]`. Examples: `[^1]` and `[^@#$%]`. In the output, footnote
+references are replaced by a superscript number that links to the footnote definition.
+
+A **footnote definition** must start with the corresponding footnote reference
+followed by a colon and at least one space. The reference must exactly match
+the reference used in the body (including capitalization and white space).
+The content of the definition would then follow either on the same line
+(`[^1]: This is a footnote definition.`) or on the next line.
+Footnote definitions may contain multiple lines, paragraphs, code blocks,
+blockquotes and most any other markdown syntax. The additional lines must be
+indented one level (four spaces or one tab).
+
+When working with multiple blocks, it may be helpful to start the definition on a
+separate line from the reference which defines the content. This way the entire block
 is indented consistently and any errors are more easily discernible by the author.

 ```md
@@ -98,6 +105,15 @@ The following options are provided to configure the output:
 * **`SEPARATOR`**:
     The text string used to set the footnote separator. Defaults to `:`.

+* **`USE_DEFINITION_ORDER`**:
+    Whether to order footnotes by the occurrence of footnote definitions
+    in the document. Defaults to `False`.
+
+    Introduced in version 3.9.0, this option allows footnotes to be ordered
+    by the occurrence of their definitions in the document, rather than by the
+    order of their references in the text. This was the behavior of
+    previous versions of the extension.
+
 A trivial example:

 ```python
@@ -109,7 +125,7 @@ Resetting Instance State

 Footnote definitions are stored within the `markdown.Markdown` class instance between
 multiple runs of the class. This allows footnotes from all runs to be included in
-output, with links and references that are unique, even though the class has been
+output, with links and references that are unique, even though the class has been
 called multiple times.

 However, if needed, the definitions can be cleared between runs by calling `reset`.
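
To make the difference between the two orderings concrete, here is a minimal standalone sketch. It does not call Python-Markdown; it only reuses the two regular expressions that appear in `markdown/extensions/footnotes.py` in this commit to collect names from a sample text, then numbers the footnotes both ways. The helper `footnote_numbers` and the sample text are assumptions made for illustration, not part of the extension.

```python
import re

# Patterns copied from markdown/extensions/footnotes.py as shown in this commit.
RE_REFERENCE = re.compile(r'(?<!!)\[\^([^\]]*)\](?!\s*:)')
RE_DEFINITION = re.compile(r'^[ ]{0,3}\[\^([^\]]*)\]:[ ]*(.*)$', re.MULTILINE)


def footnote_numbers(text: str, use_definition_order: bool = False) -> dict[str, int]:
    """Hypothetical helper: map each footnote name to its display number."""
    definitions = [m.group(1) for m in RE_DEFINITION.finditer(text)]
    # Keep the first occurrence of each defined name, in reference order
    # (the same idea as `addFootnoteRef` in the diff).
    references: list[str] = []
    for name in RE_REFERENCE.findall(text):
        if name in definitions and name not in references:
            references.append(name)
    order = definitions if use_definition_order else references
    return {name: number for number, name in enumerate(order, start=1)}


SAMPLE = """First[^b], then[^a], and again[^b].

[^a]: Defined first.
[^b]: Defined second.
"""

by_reference = footnote_numbers(SAMPLE)
by_definition = footnote_numbers(SAMPLE, use_definition_order=True)
print(by_reference)   # {'b': 1, 'a': 2}
print(by_definition)  # {'a': 1, 'b': 2}
```

In this sketch, definitions that are never referenced are simply left out of the reference-ordered result; the commit instead sorts them after the referenced footnotes.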

markdown/extensions/footnotes.py

Lines changed: 70 additions & 8 deletions
@@ -33,6 +33,7 @@
 FN_BACKLINK_TEXT = util.STX + "zz1337820767766393qq" + util.ETX
 NBSP_PLACEHOLDER = util.STX + "qq3936677670287331zz" + util.ETX
 RE_REF_ID = re.compile(r'(fnref)(\d+)')
+RE_REFERENCE = re.compile(r'(?<!!)\[\^([^\]]*)\](?!\s*:)')


 class FootnoteExtension(Extension):
@@ -61,6 +62,9 @@ def __init__(self, **kwargs):
             ],
             'SEPARATOR': [
                 ':', 'Footnote separator.'
+            ],
+            'USE_DEFINITION_ORDER': [
+                False, 'Whether to order footnotes by footnote content rather than by footnote label.'
             ]
         }
         """ Default configuration options. """
@@ -71,6 +75,9 @@ def __init__(self, **kwargs):
         self.found_refs: dict[str, int] = {}
         self.used_refs: set[str] = set()

+        # Backward compatibility with old '%d' placeholder
+        self.setConfig('BACKLINK_TITLE', self.getConfig("BACKLINK_TITLE").replace("%d", "{}"))
+
         self.reset()

     def extendMarkdown(self, md):
@@ -89,6 +96,12 @@ def extendMarkdown(self, md):
         # `codehilite`) so they can run on the the contents of the div.
         md.treeprocessors.register(FootnoteTreeprocessor(self), 'footnote', 50)

+        # Insert a tree-processor to reorder the footnotes if necessary. This must be after
+        # `inline` tree-processor so it can access the footnote reference order
+        # (`self.footnote_order`) that gets populated by the `FootnoteInlineProcessor`.
+        if not self.getConfig("USE_DEFINITION_ORDER"):
+            md.treeprocessors.register(FootnoteReorderingProcessor(self), 'footnote-reorder', 19)
+
         # Insert a tree-processor that will run after inline is done.
         # In this tree-processor we want to check our duplicate footnote tracker
         # And add additional `backrefs` to the footnote pointing back to the
@@ -100,6 +113,7 @@ def extendMarkdown(self, md):

     def reset(self) -> None:
         """ Clear footnotes on reset, and prepare for distinct document. """
+        self.footnote_order: list[str] = []
         self.footnotes: OrderedDict[str, str] = OrderedDict()
         self.unique_prefix += 1
         self.found_refs = {}
@@ -150,6 +164,11 @@ def setFootnote(self, id: str, text: str) -> None:
         """ Store a footnote for later retrieval. """
         self.footnotes[id] = text

+    def addFootnoteRef(self, id: str) -> None:
+        """ Store a footnote reference id in order of appearance. """
+        if id not in self.footnote_order:
+            self.footnote_order.append(id)
+
     def get_separator(self) -> str:
         """ Get the footnote separator. """
         return self.getConfig("SEPARATOR")
@@ -180,9 +199,6 @@ def makeFootnotesDiv(self, root: etree.Element) -> etree.Element | None:
         ol = etree.SubElement(div, "ol")
         surrogate_parent = etree.Element("div")

-        # Backward compatibility with old '%d' placeholder
-        backlink_title = self.getConfig("BACKLINK_TITLE").replace("%d", "{}")
-
         for index, id in enumerate(self.footnotes.keys(), start=1):
             li = etree.SubElement(ol, "li")
             li.set("id", self.makeFootnoteId(id))
@@ -198,7 +214,7 @@ def makeFootnotesDiv(self, root: etree.Element) -> etree.Element | None:
             backlink.set("class", "footnote-backref")
             backlink.set(
                 "title",
-                backlink_title.format(index)
+                self.getConfig('BACKLINK_TITLE').format(index)
             )
             backlink.text = FN_BACKLINK_TEXT

@@ -214,7 +230,7 @@ def makeFootnotesDiv(self, root: etree.Element) -> etree.Element | None:


 class FootnoteBlockProcessor(BlockProcessor):
-    """ Find all footnote references and store for later use. """
+    """ Find footnote definitions and store for later use. """

     RE = re.compile(r'^[ ]{0,3}\[\^([^\]]*)\]:[ ]*(.*)$', re.MULTILINE)

@@ -228,6 +244,7 @@ def test(self, parent: etree.Element, block: str) -> bool:
     def run(self, parent: etree.Element, blocks: list[str]) -> bool:
         """ Find, set, and remove footnote definitions. """
         block = blocks.pop(0)
+
         m = self.RE.search(block)
         if m:
             id = m.group(1)
@@ -312,14 +329,21 @@ def __init__(self, pattern: str, footnotes: FootnoteExtension):
     def handleMatch(self, m: re.Match[str], data: str) -> tuple[etree.Element | None, int | None, int | None]:
         id = m.group(1)
         if id in self.footnotes.footnotes.keys():
+            self.footnotes.addFootnoteRef(id)
+
+            if not self.footnotes.getConfig("USE_DEFINITION_ORDER"):
+                # Order by reference
+                footnote_num = self.footnotes.footnote_order.index(id) + 1
+            else:
+                # Order by definition
+                footnote_num = list(self.footnotes.footnotes.keys()).index(id) + 1
+
             sup = etree.Element("sup")
             a = etree.SubElement(sup, "a")
             sup.set('id', self.footnotes.makeFootnoteRefId(id, found=True))
             a.set('href', '#' + self.footnotes.makeFootnoteId(id))
             a.set('class', 'footnote-ref')
-            a.text = self.footnotes.getConfig("SUPERSCRIPT_TEXT").format(
-                list(self.footnotes.footnotes.keys()).index(id) + 1
-            )
+            a.text = self.footnotes.getConfig("SUPERSCRIPT_TEXT").format(footnote_num)
             return sup, m.start(0), m.end(0)
         else:
             return None, None, None
@@ -401,6 +425,44 @@ def run(self, root: etree.Element) -> None:
                 root.append(footnotesDiv)


+class FootnoteReorderingProcessor(Treeprocessor):
+    """ Reorder list items in the footnotes div. """
+
+    def __init__(self, footnotes: FootnoteExtension):
+        self.footnotes = footnotes
+
+    def run(self, root: etree.Element) -> None:
+        if not self.footnotes.footnotes:
+            return
+        if self.footnotes.footnote_order != list(self.footnotes.footnotes.keys()):
+            for div in root.iter('div'):
+                if div.attrib.get('class', '') == 'footnote':
+                    self.reorder_footnotes(div)
+                    break
+
+    def reorder_footnotes(self, parent: etree.Element) -> None:
+        old_list = parent.find('ol')
+        parent.remove(old_list)
+        items = old_list.findall('li')
+
+        def order_by_id(li) -> int:
+            id = li.attrib.get('id', '').split(self.footnotes.get_separator(), 1)[-1]
+            return (
+                self.footnotes.footnote_order.index(id)
+                if id in self.footnotes.footnote_order
+                else len(self.footnotes.footnotes)
+            )
+
+        items = sorted(items, key=order_by_id)
+
+        new_list = etree.SubElement(parent, 'ol')
+
+        for index, item in enumerate(items, start=1):
+            backlink = item.find('.//a[@class="footnote-backref"]')
+            backlink.set("title", self.footnotes.getConfig("BACKLINK_TITLE").format(index))
+            new_list.append(item)
+
+
 class FootnotePostprocessor(Postprocessor):
     """ Replace placeholders with html entities. """
     def __init__(self, footnotes: FootnoteExtension):
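
The sort key used by `reorder_footnotes` can be tried out in isolation. The sketch below is a simplified stand-in that works on plain id strings rather than `li` elements, and it assumes ids of the form `fn:name` with the default `:` separator. The function `reorder_ids` and the sample ids are hypothetical.

```python
def reorder_ids(li_ids: list[str], footnote_order: list[str], separator: str = ":") -> list[str]:
    """Hypothetical stand-in for `reorder_footnotes`, operating on id strings."""

    def order_by_id(li_id: str) -> int:
        # Strip the prefix before the separator, e.g. "fn:word" -> "word".
        name = li_id.split(separator, 1)[-1]
        # Referenced footnotes sort by first reference; the rest share one
        # key larger than any index, so they end up last.
        return footnote_order.index(name) if name in footnote_order else len(li_ids)

    # `sorted` is stable, so unreferenced footnotes keep their definition order.
    return sorted(li_ids, key=order_by_id)


ids_in_definition_order = ["fn:a", "fn:b", "fn:c", "fn:d"]
reference_order = ["c", "a"]  # "b" and "d" are never referenced
reordered = reorder_ids(ids_in_definition_order, reference_order)
print(reordered)  # ['fn:c', 'fn:a', 'fn:b', 'fn:d']
```

Because the fallback key is the same for every unreferenced footnote, the stable sort is what keeps those entries in their original definition order at the end of the list.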

markdown/treeprocessors.py

Lines changed: 1 addition & 1 deletion
@@ -368,7 +368,7 @@ def run(self, tree: etree.Element, ancestors: list[str] | None = None) -> etree.
         stack = [(tree, tree_parents)]

         while stack:
-            currElement, parents = stack.pop()
+            currElement, parents = stack.pop(0)

             self.ancestors = parents
             self.__build_ancestors(currElement, self.ancestors)
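
The effect of switching from `stack.pop()` to `stack.pop(0)` can be seen in a small model of the walk. This is a simplified sketch, not the actual `InlineProcessor.run` loop: it assumes that each popped element handles its direct children and queues those that have children of their own. The function `visit_order` and the tag names are made up for illustration.

```python
import xml.etree.ElementTree as etree


def visit_order(root: etree.Element, fifo: bool) -> list[str]:
    """Hypothetical model of the walk: record the order children are handled in."""
    order: list[str] = []
    stack = [root]
    while stack:
        # `pop()` takes the most recently queued element, `pop(0)` the oldest.
        current = stack.pop(0) if fifo else stack.pop()
        for child in current:
            order.append(child.tag)  # stand-in for inline processing of `child`
            if len(child):
                stack.append(child)  # revisit later to handle its own children
    return order


tree = etree.fromstring("<div><p1><em1/></p1><p2><em2/></p2></div>")
before = visit_order(tree, fifo=False)
after = visit_order(tree, fifo=True)
print(before)  # ['p1', 'p2', 'em2', 'em1']
print(after)   # ['p1', 'p2', 'em1', 'em2']
```

With `pop()`, the nested content of the second sibling is handled before that of the first, which matches the reverse-order case described in the commit message. With `pop(0)`, nested content is handled in the same order as its parents appear.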
