Apache-2.0 on its own doesn't list the copyright holder since you are
not meant to edit the license text (unlike the MIT for example). So the
license file alone isn't enough, we also need the notice (recently
pushed upstream to tree-sitter-erlang). Also see the NOTICE in
tree-sitter-elixir which is also Apache-2.0 licensed.
The number of parent highlights can decrease after entering an injection
of a child layer - if those highlights end - so when looking up the
highlights of the current layer we need to be careful to clamp the
parent layer's `parent_highlights` index to the current length of
`active_highlights`.
Co-authored-by: Peter Retzlaff <pe.retzlaff@gmail.com>
In the test cases the edoc layer (the backtick and single quote within
the comments) is recognized by `QueryIter` as being finished after it
highlights the inline code block - its only capture. The highlighter
needs to "deactivate" the layer rather than finishing (and removing it)
however, because the code block highlight extends to the next comment.
So when exiting an injection we check that the `QueryIter` considers a
layer to be finished and also that all highlights in the layer are done
as well.
This fixes a case that pops up with some specific CSS highlights which
causes the `active_highlights` stack to become out of order. The odd
highlight is this:
"#" @punctuation
((color_value) "#") @string.special
(color_value) @string.special
The `#` node is fought over by the first two patterns. Requiring that
the pattern matches `#` in the second pattern causes tree-sitter to
finish its capture after the first pattern, so we capture the child node
`{Node # 9..10}` first and then `{Node color_value 9..13}` - reversed of
the normal ordering.
In this case we need to maintain the ordering of `active_highlights` by
`Vec::insert`ing into the correct position.
The errors were collected into the string but `errors` was never read.
Instead of collecting errors I think it's better for the sake of
debugging to panic eagerly.
Because of the way that cursors are cached, we need to set or reset
the byte range and match limit when re-using from the cache. In most
cases callers should set a byte range and match limit, so rather than
resetting these we require them in `new`.
This fixes an issue where an edit to a document which removes an
injection layer could eventually cause a panic that `parent_ranges` is
empty in `intersect_ranges`: `Option::unwrap` on a `None` value.
The test case edits a markdown block where a codefence is created
between two HTML comments. Before the edit, the document has a combined
injection for the two HTML comments. After the edit, the second HTML
comment is reinterpreted by the markdown parser as the content of the
codefence - not an HTML comment anymore.
In this case the HTML layer must be re-parsed because its included
ranges have been changed the edit: before it was a range for each
comment and after it is only the first comment. Previously we did not
mark the HTML layer as modified since it was not directly edited. But
this meant that, when running the HTML injection query over this layer
after the edit, we got query matches for the second HTML comment since
the node still existed according to the layer's tree. This second
injection capture did not intersect with any parent ranges since the
layer's ranges no longer included the second HTML comment, causing
`parent_ranges` to be empty and an unwrap of `None`.
The fix is to mark any layers which are not reused as modified. If a
layer is a combined injection and not all layers are reused then the
layer must be re-parsed because its included ranges have changed. All
modified layers are re-parsed. To do this we look at the old injections
for each layer and any that are skipped or discarded rather than reused
have their layer marked as modified.
Note that this does not cause layers which have been removed from the
tree to stick around. The layers are only marked as modified (meaning
that they need to be re-parsed) and not marked as touched (meaning that
they are still valid).
Also note that this is not an issue for adding injection ranges to a
combined injection since we always consider a layer to be modified if
we cannot reuse an injection.
A practical reproduction of this (and the source of the test) is
mentioned in helix-editor/helix#13544.
Previously the `QueryIter` dropped layers which had no remaining
captures but the layer could potentially have more injections. In this
case the `QueryIter` would drop the layer and then recreate it later
when the parent layer entered another injection for this layer. This
could be potentially wasteful if the `QueryIter`'s `LayerState` type has
an expensive `Default` implementation.
We should prefer to emit `QueryIterEvent::ExitInjection` with `Some`
state (meaning that the layer is done and the state can be dropped) only
when the layer will not be entered again.
This fixes a bug where a layer could be removed from `QueryIter`'s
`active_layers` because it had no more captures and then reinitialized
later, restarting the query cursor and producing duplicate captures.
The new test case exhibits this behavior. Consider this markdown.
```rust
/// `Something`
/// Anything
```
This consists of four layers:
* The root layer, the entire markdown document
* The Rust injection in the codefence
* Two markdown injections within the Rust doc comments. These injections
are tagged with `injection.combined` so they form one markdown layer.
* Two markdown.inline injections within those markdown injections. Again
these injections are combined so they form one layer.
Layers are removed from `QueryIter` when they have no more captures. In
this example code the "`Something`" was the only capture in the two
injections of the markdown.inline layer, so after the first injection
was finished, the layer would be removed. Once the `QueryIter` reached
the "Anything" region, though, it would create a new `ActiveLayer` for
the markdown.inline layer and recreate its query cursor. Then the query
cursor would emit the same highlights as it did previously for
"`Something`".
This breaks an invariant that the `Highlighter::next_event_offset` must
be monotonically increasing. The added test case failed with:
thread 'tests::codefence_rust_doc_comments' panicked at /.../ropey-1.6.1/src/slice.rs:703:23:
byte_slice(): Invalid byte range 27..12: start must be <= end
Also see helix-editor/helix#13569.
The solution is to "tombstone" any layers which are finished in a new
`HashSet<Layer>` and avoid recreating query cursors for any finished
layers. Outside of this bug, this also prevents us from needlessly
creating query cursors which we know will yield nothing. Query cursors
are reused in a thread-local cache, so by not creating a query cursor we
allow another caller to use the cached cursor.
Cloning the properties can be avoided and it is relatively expensive in
terms of temporary allocations. The nice part of cloning was being able
to use `unwrap_or_default` to get default injection properties for
injections which don't use any `#set! injection.X` properties but we can
handle this where the properties are accessed instead.
Closes#12