rST parser: new title-style hierarchy for nested parsing with detached base node.

Use a separate section title style hierarchy for nested parsing if the base node is not attached to the document.

With a detached base, the parser cannot move up in the section hierarchy, so it does not make sense to enforce the document-wide title style hierarchy. Separate title styles also fit better with directives that fetch their content from separate sources.

Docutils does not switch on section support in nested parsing. Sphinx defines and uses a "_fresh_title_style_context" in nested parsing. Some contributed Sphinx extensions reported regressions with the new section parsing algorithm because the old algorithm was forgiving in case of inconsistent title styles in nested parsing (cf. [bugs:#508]).

git-svn-id: http://svn.code.sf.net/p/docutils/code/trunk@10204 929543f6-e4f2-0310-98a6-ba3bd3dd1d04
2025-10-06 00:32:41 +02:00 · 2025-08-15 22:28:45 +00:00
parent c3a99f65a7
commit 242c50ae60
6 changed files with 95 additions and 10 deletions
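
The detached-base check described in the commit message can be sketched without Docutils. This is a minimal illustration, not the parser's code: `Node`, `is_attached()` and `nested_memo()` are hypothetical stand-ins, and the sketch ignores the `match_titles` guard that the real `NestedStateMachine.run()` applies before the check.

```python
import copy
from types import SimpleNamespace


class Node:
    """Hypothetical stand-in for a doctree node: it only tracks its parent."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent


def is_attached(node, document):
    """Return True if walking up the parent chain from `node` ends at `document`."""
    root = node
    while root.parent is not None:
        root = root.parent
    return root is document


def nested_memo(memo, node):
    """Return the memo a nested parse would use for the base `node`."""
    # Shallow copy: rebinding an attribute on the copy leaves `memo` untouched,
    # but attributes that are not rebound still refer to the same objects.
    nested = copy.copy(memo)
    if not is_attached(node, memo.document):
        # Detached base: start a fresh title style hierarchy.
        nested.title_styles = []
    return nested


document = Node('document')
memo = SimpleNamespace(document=document, title_styles=['-', '~'])

attached = Node('section', parent=document)
detached = Node('paragraph')  # no parent, so not part of the document tree

shared = nested_memo(memo, attached)
fresh = nested_memo(memo, detached)
```

With an attached base the copy still shares the document-wide list, so title styles registered during nested parsing remain visible to the document. With a detached base only the copy's `title_styles` is rebound, which is why the document-wide hierarchy stays unchanged.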
--- a/docutils/HISTORY.rst
+++ b/docutils/HISTORY.rst
@@ -31,6 +31,8 @@ Release 0.23b0 (unpublished)
  - Relax "section title" system message from SEVERE to ERROR.
  - Ensure new "current node" is valid when switching section level
    (cf. bugs #508 and #509).
+  - `NestedStateMachine.run()` uses a separate title style hierarchy
+    if the base node is not attached to the document (cf. bug #508).


 Release 0.22 (2025-07-29)
--- a/docutils/RELEASE-NOTES.rst
+++ b/docutils/RELEASE-NOTES.rst
@@ -260,6 +260,10 @@ Misc
 Release 0.23b0 (unpublished)
 ============================

+reStructuredText parser:
+  Nested parsing uses a separate title style hierarchy
+  if the base node is not attached to the document.
+
 Bugfixes and improvements (see HISTORY_).


--- a/docutils/docs/ref/rst/directives.rst
+++ b/docutils/docs/ref/rst/directives.rst
@@ -1646,6 +1646,9 @@ The "include" directive recognizes the following options:
    Parse the included content with the specified parser.
    See the `"parser" configuration setting`_ for available parsers.

+    Starts a new "`section hierarchy`_" (all sections in the included
+    content become subsections of the current section).
+
    .. Caution::
      There is no check whether the inserted elements are valid at the
       point of insertion. It is recommended to validate_ the document.
@@ -2310,10 +2313,11 @@ Common Option Value Types
 .. _hyperlink references: restructuredtext.html#hyperlink-references
 .. _hyperlink targets:
 .. _hyperlink target: restructuredtext.html#hyperlink-targets
-.. _supported length units: restructuredtext.html#length-units
 .. _reference name:
 .. _reference names: restructuredtext.html#reference-names
+.. _section hierarchy: restructuredtext.html#sections
 .. _simple table: restructuredtext.html#simple-tables
+.. _supported length units: restructuredtext.html#length-units

 .. _reStructuredText Interpreted Text Roles:
 .. _interpreted text role: roles.html
--- a/docutils/docs/ref/rst/restructuredtext.rst
+++ b/docutils/docs/ref/rst/restructuredtext.rst
@@ -599,7 +599,7 @@ subsection, etc.).
 All section title styles need not be used, nor need any specific
 section title style be used.  However, a document must be consistent
 in its use of section titles: once a hierarchy of title styles is
-established, sections must use that hierarchy.
+established, sections must use that hierarchy. [#]_

 Each section title automatically generates a hyperlink target pointing
 to the section.  The text of the hyperlink target (the "reference
@@ -609,6 +609,13 @@ Hyperlink Targets`_ for a complete description.
 Sections may contain `body elements`_, transitions_, and nested
 sections.

+.. [#] Directives_ may establish a separate hierarchy of title styles
+   for their content.  This is handy for directives that include
+   content from separate sources, e.g., the directives provided by
+   the `"autodoc" Sphinx extension`__.
+
+   __ https://www.sphinx-doc.org/en/master/usage/extensions/autodoc.html
+

 Transitions
 -----------
--- a/docutils/docutils/parsers/rst/states.py
+++ b/docutils/docutils/parsers/rst/states.py
@@ -104,6 +104,7 @@ from __future__ import annotations

 __docformat__ = 'reStructuredText'

+import copy
 import re
 from types import FunctionType, MethodType
 from types import SimpleNamespace as Struct
@@ -185,15 +186,26 @@ class NestedStateMachine(StateMachineWS):
        """
        Parse `input_lines` and populate `node`.

+        Use a separate "title style hierarchy" if `node` is not
+        attached to the document (changed in Docutils 0.23).
+
        Extend `StateMachineWS.run()`: set up document-wide data.
        """
        self.match_titles = match_titles
-        self.memo = memo
+        self.memo = copy.copy(memo)
        self.document = memo.document
        self.attach_observer(self.document.note_source)
        self.language = memo.language
        self.reporter = self.document.reporter
        self.node = node
+        if match_titles:
+            # Start a new title style hierarchy if `node` is not
+            # a descendant of the `document`:
+            _root = node
+            while _root.parent is not None:
+                _root = _root.parent
+            if _root != self.document:
+                self.memo.title_styles = []
        results = StateMachineWS.run(self, input_lines, input_offset)
        assert results == [], ('NestedStateMachine.run() results should be '
                               'empty!')
@@ -270,13 +282,16 @@ class RSTState(StateWS):
        :input_offset:
            Line number at start of the block.
        :node:
-            Root node. Generated nodes will be appended to this node
-            (unless a new section with lower level is encountered).
+            Base node. Generated nodes will be appended to this node
+            (unless a new section with lower level is encountered, see below).
        :match_titles:
            Allow section titles?
-            If True, `node` should be attached to the document
-            so that section levels can be computed correctly
-            and moving up in the section hierarchy works.
+            If the base `node` is attached to the document, new sections will
+            be appended according to their level in the section hierarchy
+            (moving up the tree).
+            If the base `node` is *not* attached to the document,
+            a separate section title style hierarchy is used for the nested
+            parsing (all sections are subsections of the current section).
        :state_machine_class:
            Default: `NestedStateMachine`.
        :state_machine_kwargs:
@@ -326,9 +341,15 @@ class RSTState(StateWS):
                          state_machine_class=None,
                          state_machine_kwargs=None):
        """
-        Create a new StateMachine rooted at `node` and run it over the input
-        `block`. Also keep track of optional intermediate blank lines and the
+        Parse the input `block` with a nested state-machine rooted at `node`.
+
+        Create a new StateMachine rooted at `node` and run it over the
+        input `block` (see also `nested_parse()`).
+        Also keep track of optional intermediate blank lines and the
        required final one.
+
+        Return new offset and a boolean indicating whether there was a
+        blank final line.
        """
        if state_machine_class is None:
            state_machine_class = self.nested_sm
--- a/docutils/test/test_parsers/test_rst/test_misc.py
+++ b/docutils/test/test_parsers/test_rst/test_misc.py
@@ -151,6 +151,53 @@ class RSTStateTests(unittest.TestCase):
                         '            sub 2\n',
                         section.pformat())

+    def test_nested_parse_with_sections_detached(self):
+        # The base `node` does not need to be attached to the document.
+        # "global" title style hierarchy (ignored with detached base node)
+        self.machine.memo.title_styles = ['-', '~']
+
+        # base `node` is a <paragraph> without parents
+        base = nodes.paragraph('')
+        base.document = self.document  # this is not "attaching"
+        # level-2 title style
+        title = self.title_markup('sub', '~')
+        # new hierarchy -> attach <section> to base `node`
+        self.state.nested_parse(title, 0, node=base, match_titles=True)
+        self.assertEqual('<paragraph>\n'
+                         '    <section ids="sub" names="sub">\n'
+                         '        <title>\n'
+                         '            sub\n',
+                         base.pformat())
+        # It is the user's responsibility to ensure that the base node
+        # may contain a <section> (or move the section after parsing).
+        # You may check with `validate()`:
+        with self.assertRaises(nodes.ValidationError):
+            base.validate()
+
+        # a new hierarchy is used in every call of nested_parse()
+        # parse 2 section titles
+        title = self.title_markup('new', '*') + self.title_markup('top', '-')
+        # new hierarchy -> attach section and sub-section to base node
+        self.state.nested_parse(title, 0, node=base, match_titles=True)
+        self.assertEqual(
+                         '<paragraph>\n'
+                         '    <section ids="sub" names="sub">\n'
+                         '        <title>\n'
+                         '            sub\n'
+                         '    <section ids="new" names="new">\n'
+                         '        <title>\n'
+                         '            new\n'
+                         '        <section ids="top" names="top">\n'
+                         '            <title>\n'
+                         '                top\n',
+                         base.pformat())
+
+        # document-wide style hierarchy unchanged:
+        self.assertEqual(['-', '~'], self.machine.memo.title_styles)
+
+        # print(self.document.pformat())
+        # print(base.pformat())
+

 if __name__ == '__main__':
    unittest.main()
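
The section levels expected by `test_nested_parse_with_sections_detached` follow from how a title style list maps adornment styles to levels. The sketch below is a simplified model, not the Docutils implementation: `section_level()` is a hypothetical helper that always registers an unseen style as the next deeper level, whereas the real parser also reports an error when a new style appears above the deepest level.

```python
def section_level(style, title_styles):
    """Return the 1-based level of `style`, registering unseen styles.

    Simplification (assumption): an unseen style is always appended,
    so the order of first appearance defines the hierarchy.
    """
    if style not in title_styles:
        title_styles.append(style)
    return title_styles.index(style) + 1


# Document-wide hierarchy as set up in the test: '-' is level 1, '~' level 2.
document_styles = ['-', '~']
level_in_document = section_level('~', document_styles)

# First nested parse with a detached base: fresh hierarchy, '~' comes first.
first_call_styles = []
first_call_levels = [section_level(s, first_call_styles) for s in ('~',)]

# Second nested parse: again a fresh hierarchy, '*' followed by '-'.
second_call_styles = []
second_call_levels = [section_level(s, second_call_styles) for s in ('*', '-')]
```

In this model the `~` title is a level-2 style in the document but becomes a top-level section of the detached base, and `*` followed by `-` yields a section with one subsection, matching the `pformat()` output asserted in the test.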