rST parser: new title-style hierarchy for nested parsing with detached base node.

Use a separate section title style hierarchy for nested parsing
if the base node is not attached to the document.

With a detached base, the parser cannot move up in the section hierarchy, so it
does not make sense to enforce the document-wide title style hierarchy.

Separate title styles also fit better with directives that fetch their content
from separate sources.
Docutils does not switch on section support in nested parsing.
Sphinx defines and uses a "_fresh_title_style_context" in nested parsing.
Some contributed Sphinx extensions reported regressions with the new
section parsing algorithm because the old algorithm was forgiving in case
of inconsistent title styles in nested parsing (cf. [bugs:#508]).
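
The attachment test described above can be sketched without Docutils. This is only an illustration: `Node`, `is_attached()` and `nested_title_styles()` are hypothetical names, not part of the Docutils API, and the sketch compares by identity where the patch compares with `!=`.

```python
from __future__ import annotations


class Node:
    """Hypothetical stand-in for a doctree node (not `docutils.nodes.Node`)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.parent: Node | None = None

    def append(self, child: Node) -> None:
        """Attach `child` below this node."""
        child.parent = self


def is_attached(node: Node, document: Node) -> bool:
    """Walk up the parent links; attached means the walk ends at `document`."""
    root = node
    while root.parent is not None:
        root = root.parent
    return root is document


def nested_title_styles(node: Node, document: Node,
                        document_styles: list[str]) -> list[str]:
    """Return the title style list a nested parse based at `node` would use."""
    if is_attached(node, document):
        return document_styles  # shared, document-wide hierarchy
    return []                   # detached base: fresh hierarchy


document = Node('document')
section = Node('section')
document.append(section)
detached = Node('paragraph')    # never appended, so it has no parent

styles = ['=', '-']
attached_styles = nested_title_styles(section, document, styles)
detached_styles = nested_title_styles(detached, document, styles)
```

With these assumptions, the attached base keeps sharing the document-wide list, while the detached base gets an empty list of its own.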

git-svn-id: http://svn.code.sf.net/p/docutils/code/trunk@10204 929543f6-e4f2-0310-98a6-ba3bd3dd1d04
milde
2025-08-15 22:28:45 +00:00
parent c3a99f65a7
commit 242c50ae60
6 changed files with 95 additions and 10 deletions

@@ -31,6 +31,8 @@ Release 0.23b0 (unpublished)
- Relax "section title" system message from SEVERE to ERROR.
- Ensure new "current node" is valid when switching section level
(cf. bugs #508 and #509).
- `NestedStateMachine.run()` uses a separate title style hierarchy
if the base node is not attached to the document (cf. bug #508).
Release 0.22 (2025-07-29)

@@ -260,6 +260,10 @@ Misc
Release 0.23b0 (unpublished)
============================
reStructuredText parser:
Nested parsing uses a separate title style hierarchy
if the base node is not attached to the document.
Bugfixes and improvements (see HISTORY_).

@@ -1646,6 +1646,9 @@ The "include" directive recognizes the following options:
Parse the included content with the specified parser.
See the `"parser" configuration setting`_ for available parsers.
Starts a new "`section hierarchy`_" (all sections in the included
content become subsections of the current section).
.. Caution::
There is no check whether the inserted elements are valid at the
point of insertion. It is recommended to validate_ the document.
@@ -2310,10 +2313,11 @@ Common Option Value Types
 .. _hyperlink references: restructuredtext.html#hyperlink-references
 .. _hyperlink targets:
 .. _hyperlink target: restructuredtext.html#hyperlink-targets
-.. _supported length units: restructuredtext.html#length-units
 .. _reference name:
 .. _reference names: restructuredtext.html#reference-names
+.. _section hierarchy: restructuredtext.html#sections
 .. _simple table: restructuredtext.html#simple-tables
+.. _supported length units: restructuredtext.html#length-units
 .. _reStructuredText Interpreted Text Roles:
 .. _interpreted text role: roles.html

@@ -599,7 +599,7 @@ subsection, etc.).
 All section title styles need not be used, nor need any specific
 section title style be used. However, a document must be consistent
 in its use of section titles: once a hierarchy of title styles is
-established, sections must use that hierarchy.
+established, sections must use that hierarchy. [#]_
 
 Each section title automatically generates a hyperlink target pointing
 to the section. The text of the hyperlink target (the "reference
@@ -609,6 +609,13 @@ Hyperlink Targets`_ for a complete description.
 Sections may contain `body elements`_, transitions_, and nested
 sections.
 
+.. [#] Directives_ may establish a separate hierarchy of title styles
+   for their content. This is handy for directives that include
+   content from separate sources, e.g., the directives provided by
+   the `"autodoc" Sphinx extension`__.
+
+   __ https://www.sphinx-doc.org/en/master/usage/extensions/autodoc.html
+
 Transitions
 -----------
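
To illustrate the footnote added above, here is a rough sketch of how a list of title styles maps to section levels and why a separate list changes the result. `section_level()` is a hypothetical helper, not a Docutils function, and it leaves out the consistency checks the real parser performs.

```python
from __future__ import annotations


def section_level(style: str, title_styles: list[str]) -> int:
    """Return the 1-based level of `style`, registering styles not seen yet.

    Simplification: the consistency checks of the real parser are omitted.
    """
    if style not in title_styles:
        title_styles.append(style)
    return title_styles.index(style) + 1


# Document-wide hierarchy: "-" is level 1, "~" is level 2.
document_styles = ['-', '~']
level_in_document = section_level('~', document_styles)

# Separate hierarchy for a directive's content: the first style
# encountered becomes level 1, whatever it means in the document.
fresh_styles: list[str] = []
level_first = section_level('~', fresh_styles)
level_second = section_level('*', fresh_styles)
```

Under these assumptions the same "~" underline is a second-level title in the document but a top-level title within the separately parsed content.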

@@ -104,6 +104,7 @@ from __future__ import annotations
 __docformat__ = 'reStructuredText'
 
+import copy
 import re
 from types import FunctionType, MethodType
 from types import SimpleNamespace as Struct
@@ -185,15 +186,26 @@ class NestedStateMachine(StateMachineWS):
         """
         Parse `input_lines` and populate `node`.
 
+        Use a separate "title style hierarchy" if `node` is not
+        attached to the document (changed in Docutils 0.23).
+
         Extend `StateMachineWS.run()`: set up document-wide data.
         """
         self.match_titles = match_titles
-        self.memo = memo
+        self.memo = copy.copy(memo)
         self.document = memo.document
         self.attach_observer(self.document.note_source)
         self.language = memo.language
         self.reporter = self.document.reporter
         self.node = node
+        if match_titles:
+            # Start a new title style hierarchy if `node` is not
+            # a descendant of the `document`:
+            _root = node
+            while _root.parent is not None:
+                _root = _root.parent
+            if _root != self.document:
+                self.memo.title_styles = []
         results = StateMachineWS.run(self, input_lines, input_offset)
         assert results == [], ('NestedStateMachine.run() results should be '
                                'empty!')
@@ -270,13 +282,16 @@ class RSTState(StateWS):
         :input_offset:
             Line number at start of the block.
         :node:
-            Root node. Generated nodes will be appended to this node
-            (unless a new section with lower level is encountered).
+            Base node. Generated nodes will be appended to this node
+            (unless a new section with lower level is encountered, see below).
         :match_titles:
             Allow section titles?
-            If True, `node` should be attached to the document
-            so that section levels can be computed correctly
-            and moving up in the section hierarchy works.
+            If the base `node` is attached to the document, new sections will
+            be appended according to their level in the section hierarchy
+            (moving up the tree).
+            If the base `node` is *not* attached to the document,
+            a separate section title style hierarchy is used for the nested
+            parsing (all sections are subsections of the current section).
         :state_machine_class:
             Default: `NestedStateMachine`.
         :state_machine_kwargs:
@@ -326,9 +341,15 @@ class RSTState(StateWS):
                           state_machine_class=None,
                           state_machine_kwargs=None):
         """
-        Create a new StateMachine rooted at `node` and run it over the input
-        `block`. Also keep track of optional intermediate blank lines and the
+        Parse the input `block` with a nested state-machine rooted at `node`.
+
+        Create a new StateMachine rooted at `node` and run it over the
+        input `block` (see also `nested_parse()`).
+        Also keep track of optional intermediate blank lines and the
         required final one.
+
+        Return new offset and a boolean indicating whether there was a
+        blank final line.
         """
         if state_machine_class is None:
             state_machine_class = self.nested_sm
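
The change from `self.memo = memo` to `self.memo = copy.copy(memo)` in `NestedStateMachine.run()` matters because the title style list is replaced on the nested machine only. The following sketch shows the effect with a plain `SimpleNamespace`; the attribute values are made up for illustration and the real memo carries more fields.

```python
import copy
from types import SimpleNamespace

# Stand-in for the parser memo (illustrative values only).
memo = SimpleNamespace(title_styles=['-', '~'], section_level=0)

# A shallow copy is a new object whose attributes still point
# to the same values as the original.
nested_memo = copy.copy(memo)
shared_before_reset = nested_memo.title_styles is memo.title_styles

# Rebinding the attribute on the copy leaves the original untouched.
nested_memo.title_styles = []
nested_memo.title_styles.append('*')
```

Without the copy, the assignment of an empty list would also reset the document-wide hierarchy; with it, the original `memo.title_styles` stays `['-', '~']`, which is what the new unit test checks.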

@@ -151,6 +151,53 @@ class RSTStateTests(unittest.TestCase):
' sub 2\n',
section.pformat())
+    def test_nested_parse_with_sections_detached(self):
+        # The base `node` does not need to be attached to the document.
+
+        # "global" title style hierarchy (ignored with detached base node)
+        self.machine.memo.title_styles = ['-', '~']
+        # base `node` is a <paragraph> without parents
+        base = nodes.paragraph('')
+        base.document = self.document  # this is not "attaching"
+        # level-2 title style
+        title = self.title_markup('sub', '~')
+        # new hierarchy -> attach <section> to base `node`
+        self.state.nested_parse(title, 0, node=base, match_titles=True)
+        self.assertEqual('<paragraph>\n'
+                         '    <section ids="sub" names="sub">\n'
+                         '        <title>\n'
+                         '            sub\n',
+                         base.pformat())
+        # It is the user's responsibility to ensure that the base node
+        # may contain a <section> (or move the section after parsing).
+        # You may check with `validate()`:
+        with self.assertRaises(nodes.ValidationError):
+            base.validate()
+
+        # a new hierarchy is used in every call of nested_parse()
+        # parse 2 section titles
+        title = self.title_markup('new', '*') + self.title_markup('top', '-')
+        # new hierarchy -> attach section and sub-section to base node
+        self.state.nested_parse(title, 0, node=base, match_titles=True)
+        self.assertEqual(
+            '<paragraph>\n'
+            '    <section ids="sub" names="sub">\n'
+            '        <title>\n'
+            '            sub\n'
+            '    <section ids="new" names="new">\n'
+            '        <title>\n'
+            '            new\n'
+            '        <section ids="top" names="top">\n'
+            '            <title>\n'
+            '                top\n',
+            base.pformat())
+
+        # document-wide style hierarchy unchanged:
+        self.assertEqual(['-', '~'], self.machine.memo.title_styles)
+        # print(self.document.pformat())
+        # print(base.pformat())
+
 
 if __name__ == '__main__':
     unittest.main()