mirror of
https://github.com/helix-editor/spellbook.git
synced 2025-10-06 00:02:48 +02:00
compare doc: Copy edits
@@ -59,15 +59,15 @@ Writes: 2,105,758 bytes
Mostly I am familiar with Nuspell so I'll be talking about Spellbook vs. Nuspell in this section.
-The `check` code is basically a rewrite so they should perform very similarly. One major difference that might affect lookup time is the main lookup table. It's meant to be a hash multi-map, like a `HashMap<String, FlagSet>` but allowing duplicate keys. Nuspell rolls its own hash table type for this while Spellbook uses `hashbrown::HashTable` which is highly optimized. Spellbook also uses `ahash` by default which is quite fast while Nuspell uses `std::hash` (implementation-specific). This sometimes happens with Rust rewrites: it's a pain to take a dependency in C/C++ so C/C++ libraries/tools often leave performance on the table by not taking advantage of available high-performance dependencies. To confirm or deny this suspicion one could replace Nuspell's `Word_List` type with an adaptation from Google's `SwissTable` library (on which `hashbrown` is based).
+The `check` code is basically a rewrite so they should perform very similarly. One major difference that might affect lookup time is the main lookup table. It's meant to be a hash multi-map, like a `HashMap<String, FlagSet>` but allowing duplicate keys. Nuspell rolls its own hash table type for this while Spellbook uses `hashbrown::HashTable` which has SIMD optimizations for searching. Spellbook also uses `ahash` by default which is quite fast while Nuspell uses `std::hash` (implementation-specific). This sometimes happens with Rust rewrites: it's a pain to take a dependency in C/C++ so C/C++ libraries/tools might leave performance on the table by not taking advantage of available high-performance dependencies. To confirm or deny this suspicion one could replace Nuspell's `Word_List` type with an adaptation from Google's `SwissTable` library (on which `hashbrown` is based).
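To make the "hash multi-map" idea concrete, here is a minimal, std-only sketch. It is not Spellbook's or Nuspell's actual table: `WordList`, `FlagSet` and the method names are hypothetical, and a plain `HashMap` with a `Vec` of values stands in for the `hashbrown::HashTable` mentioned above. It only illustrates the behaviour the text describes, namely that a stem may be inserted more than once with different flag sets.

```rust
use std::collections::HashMap;

/// Hypothetical flag set: just a list of numeric flags.
type FlagSet = Vec<u16>;

/// Std-only stand-in for a hash multi-map: each stem maps to every
/// flag set it was inserted with, so duplicate stems are preserved.
#[derive(Default)]
struct WordList {
    table: HashMap<String, Vec<FlagSet>>,
}

impl WordList {
    /// Adds one (stem, flags) entry without overwriting earlier entries.
    fn insert(&mut self, stem: &str, flags: FlagSet) {
        self.table.entry(stem.to_owned()).or_default().push(flags);
    }

    /// Returns every flag set stored for `stem` (empty if the stem is absent).
    fn get(&self, stem: &str) -> &[FlagSet] {
        self.table.get(stem).map(Vec::as_slice).unwrap_or_default()
    }
}
```

Storing a `Vec` per key costs an extra allocation per stem, which is one reason a real implementation might prefer a raw table that stores duplicate keys directly.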
Otherwise I suspect that Rust's standard library has better optimizations for string searching and equality, as I know it uses `memchr` and SIMD operations when available.
-When it comes to memory, Spellbook is optimized to save memory by cutting out unnecessary bytes from the common string type used in the loop table, as well as small-string and small-slice optimizations for the stem and flagsets. The [internals] document has more details.
+When it comes to memory, Spellbook is optimized to save memory by cutting out unnecessary bytes from the common string type used in the lookup table, as well as small-string and small-slice optimizations for the stem and flagsets. The [internals] document has more details.
## ZSpell
-[`pluots/zspell`](https://github.com/pluots/zspell) is an interesting alternative to the Hunspell-like spellcheckers mentioned above. At time of writing ZSpell doesn't support suggestions. The interesting part of ZSpell is how it checks words instead.
+[`pluots/zspell`](https://github.com/pluots/zspell) is an interesting alternative to the Hunspell-like spellcheckers mentioned above. ZSpell also takes the `.dic` and `.aff` Hunspell-style dictionary files. At time of writing ZSpell doesn't support suggestions. The interesting part of ZSpell is how it checks words instead.
ZSpell expands affixes during instantiation of a dictionary. (See the `README.md` doc in this repository for a basic intro on affixes.) The "classic" spellcheckers mentioned above contain a subset of the possible dictionary words in a main lookup table. For example Spellbook's table includes "adventure" but not some of its conjugations made possible by prefixes/suffixes like "adventurer" or "adventured". In contrast, ZSpell expands each stem so that its tables include "adventure", "adventures", "adventurer", "adventured", "adventuring" and more. When checking a word, ZSpell performs a lookup into (up to) a handful of hash maps.
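The two strategies can be contrasted with a small sketch. This is an illustration only, not code from ZSpell or Spellbook: `Suffix`, `expand` and `check_by_stripping` are hypothetical names, and the rule format is heavily simplified (real Hunspell-style rules also carry conditions, flags and prefixes, all omitted here).

```rust
use std::collections::HashSet;

/// Hypothetical, simplified suffix rule: remove `strip` from the end of
/// the stem, then append `add`. Rule conditions are deliberately omitted.
struct Suffix {
    strip: &'static str,
    add: &'static str,
}

/// Applies one rule to a stem, if the stem ends with the part to strip.
fn apply(stem: &str, rule: &Suffix) -> Option<String> {
    stem.strip_suffix(rule.strip)
        .map(|base| format!("{base}{}", rule.add))
}

/// Expand-at-load: store the stem and every derived form up front,
/// so checking a word later is a plain set lookup.
fn expand(stems: &[&str], rules: &[Suffix]) -> HashSet<String> {
    let mut words = HashSet::new();
    for &stem in stems {
        words.insert(stem.to_owned());
        words.extend(rules.iter().filter_map(|rule| apply(stem, rule)));
    }
    words
}

/// Strip-at-check: store only stems and undo each rule while checking.
fn check_by_stripping(word: &str, stems: &HashSet<&str>, rules: &[Suffix]) -> bool {
    stems.contains(word)
        || rules.iter().any(|rule| {
            word.strip_suffix(rule.add)
                .map(|base| format!("{base}{}", rule.strip))
                .is_some_and(|stem| stems.contains(stem.as_str()))
        })
}
```

Under these assumptions, expanding trades memory and load time for a cheaper check, while stripping keeps the table small and does more work per checked word.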