spellbook

mirror of https://github.com/helix-editor/spellbook.git synced 2025-10-06 00:02:48 +02:00

Author	SHA1	Message	Date
Michael Davis	55b4d302fb	HashBag: Remove unused `get` helper	2024-08-22 18:30:27 -04:00
Michael Davis	c92f1e33b3	aff parser: Improve coverage	2024-08-22 15:11:27 -04:00
Michael Davis	83b32fa298	Move shell definition into flake, use flake-compat	2024-08-22 13:53:39 -04:00
Michael Davis	66608bec2b	Add a blank impl of `check_compound_with_pattern_replacements` This function only does something when CompoundResult has a replacement field on it. In my searching I didn't see a dictionary that actually uses that so we might want to hold off on implementing that function (which is really very complex in the Nuspell codebase, with lots of `goto`) until we find a language that uses this.	2024-08-22 10:55:24 -04:00
Michael Davis	77d77d1fa9	checker: Fix lifetimes on CompoundResult returns These have to live as long as the `word` and `&self` references. We can elide the lifetimes because of that: the compiler will infer that the reference needs to live as long as the intersection of the references in the domain.	2024-08-22 10:55:18 -04:00
Michael Davis	68bfc779cf	minor: Fix typo	2024-08-22 10:51:48 -04:00
Michael Davis	73af8e8965	Raise local version of Rust to 1.74.0 Ideally we would keep this at the MSRV but rust-analyzer now doesn't work with anything older than this version. We had to jump up to 1.70 to get `black_box` to run the benchmarks anyways.	2024-08-22 09:54:49 -04:00
Michael Davis	d8f0ce0b81	Parse CompoundPatterns	2024-08-22 09:54:42 -04:00
Michael Davis	19ff354485	Update flake inputs	2024-08-22 09:47:02 -04:00
Michael Davis	ee2acd87f0	Use Box<str> for StringPair=>StrPair type	2024-08-22 09:25:52 -04:00
Michael Davis	c767cc7d99	Add a benchmark for a number word	2024-08-22 09:25:11 -04:00
Michael Davis	7e7678cb4f	Perform case conversion based on locale A few Turkic locales have special conversion rules for 'i' and 'I' which we need to handle. Nuspell covers this with ICU but we don't want to pull in ICU4X - it's a very large dependency and adds weight even if we just pull in the case mapping parts. Luckily the code to do this ourselves (like Hunspell does too) is very small.	2024-08-21 17:19:20 -04:00
Michael Davis	fe902cec15	Rename checker::Casing to Capitalization To avoid confusion with the Casing enum we'll introduce in the child commit.	2024-08-21 17:19:07 -04:00
Michael Davis	5097053fab	checker: Refactor classic compounding to be done like compound rules	2024-08-21 16:27:59 -04:00
Michael Davis	02fef71635	Rename HashMultiMap type to HashBag	2024-08-21 15:33:11 -04:00
Michael Davis	53fadffd9c	style: Use a Lazy static for test en_US dictionary	2024-08-21 14:16:56 -04:00
Michael Davis	c1407d65fc	checker: Improve sharp casing unit test	2024-08-21 12:37:09 -04:00
Michael Davis	3ef6b252ad	Add a few minor unit tests for coverage	2024-08-21 12:14:56 -04:00
Michael Davis	1b15077f3f	minor: Clean up imports	2024-08-21 11:27:34 -04:00
Michael Davis	73d83fcf7c	shell: Add llvm-tools-preview and cargo-llvm-cov	2024-08-21 10:32:32 -04:00
Michael Davis	7cc1f04aad	style: Replace BreakTable's From impl on a Vec with new on a slice	2024-08-21 10:32:31 -04:00
Michael Davis	325b2d8ff3	Implement check for COMPOUNDRULE compounds This fixes check for compounds in en_US like "10th" and "202nd". These are considered valid because of the COMPOUNDRULE rule. From en_US.aff: # compound rules: # 1. [0-9]*1[0-9]th (10th, 11th, 12th, 56714th, etc.) # 2. [0-9]*[02-9](1st|2nd|3rd|[4-9]th) (21st, 22nd, 123rd, 1234th, etc.) COMPOUNDRULE 2 COMPOUNDRULE n*1t COMPOUNDRULE n*mp For example with dictionary entries "1/n1" and "0th/pt", we split up the word "10th" into parts "1" and "0th" which we look up in the dictionary. We then check those words' flagsets against the patterns. "10th" would be represented as: `&[&flagset!['n', '1'], &flagset!['p', 't']]` That matches the pattern `n*1t`: zero or more 'n' flags and then a '1' flag (used for "1" and other digits) and a 't' flag ("0th"). The 'n' flag matches zero times for "10th" but allows other numbers in front. For example this pattern would also match "110th". These kinds of compounds are fairly straightforward to check compared to the other kind of compounding. This is the only compounding used by en_US and a few other dictionaries. This completes support for en_US for the checker - nothing else in the aff affects the checker.	2024-08-21 10:29:41 -04:00
Michael Davis	0ae891d3e4	aff: Use boxed slices for CompoundRule types	2024-08-20 09:02:17 -04:00
Michael Davis	114ede1fa1	Fix some typos Now that spellbook runs in Helix :D	2024-08-19 12:01:03 -04:00
Michael Davis	883792fcd0	minor: Use slice reference for parser list With the reference we don't need to update the number of entries when we add a parser. Tip taken from nucleo's `case_fold` module via Pascal.	2024-08-18 21:50:32 -04:00
Michael Davis	70c9af5631	Support ICONV and at least parse OCONV (OCONV is used in the suggest API.)	2024-08-18 21:42:17 -04:00
Michael Davis	6e5ee0492d	Add other casings to the benchmarks	2024-08-18 10:09:45 -04:00
Michael Davis	c38fa46853	Add special case for upper words with apostrophes	2024-08-18 09:56:45 -04:00
Michael Davis	4482c259b4	minor: Style improvements for spell_break	2024-08-18 09:56:45 -04:00
Michael Davis	6826683d5c	minor: Update docs and comments	2024-08-18 09:24:52 -04:00
Michael Davis	7fabe38b00	minor: Derive default on enums instead of manual implementations	2024-08-18 09:17:34 -04:00
Michael Davis	fcdfe7d8ae	Use boxed slices for BreakTable representation	2024-08-18 09:16:42 -04:00
Michael Davis	1cdaee8a3a	minor: Resolve clippy lint about type bounds	2024-08-17 14:35:15 -04:00
Michael Davis	0ba85188ef	minor: Improve panic docs	2024-08-17 14:27:50 -04:00
Michael Davis	8845e5df5d	minor: Style and Debug derives	2024-08-17 14:27:29 -04:00
Michael Davis	2479466cfe	Add special case for uppercase words with sharps in CHECKSHARPS	2024-08-17 14:27:11 -04:00
Michael Davis	2eceeda0be	checker: Basic support for uppercase word checking	2024-08-17 13:21:16 -04:00
Michael Davis	7e4c1ac63d	aff: Add COMPOUNDROOT	2024-08-17 12:11:45 -04:00
Michael Davis	f936ea1d98	Use Option<NonZeroU16>s for aff count/length options	2024-08-17 12:11:45 -04:00
Michael Davis	d2231b837c	checker: Add a simple route for titlecase words This is not fully correct because we don't do case conversion completely accurately.	2024-08-17 12:11:45 -04:00
Michael Davis	5553b5b7eb	checker: Fix nonsensical lifetimes	2024-08-17 12:09:24 -04:00
Michael Davis	303b92075b	Add a very naive example for checking prose	2024-08-17 12:09:24 -04:00
Michael Davis	4a4ae680d6	Store the word list stem on CompoundingResult & AffixForm This approximately matches Nuspell's storage of the wordlist pointer. To make the lifetimes play nice we return the key from the map as well as the value. This value is equal to the `stem2`/`stem3`s in the affixing functions but we need to return the data from the map to use that lifetime rather than the Cows produced by stripping and adding with affixes. As another happy consequence of this, we can drop the `'aff` lifetime on the `word: &str` parameters in the affixing stripping functions. The input word should have a distinct lifetime and not having this lifetime will cause problems for the compounding functions introduced in the child commits.	2024-08-12 12:01:52 -04:00
Michael Davis	ea520a7407	Port the word lookup function for compounds	2024-08-12 12:01:52 -04:00
Michael Davis	7551efe438	Port are_three_code_points_equal from Nuspell	2024-08-12 12:01:51 -04:00
Michael Davis	c27a8249a2	Use const generics for affixing mode This mirrors Nuspell's use of C++ template parameters.	2024-08-12 12:01:51 -04:00
Michael Davis	1376b8f913	Use str::char_indices instead of counting char bytes	2024-03-28 16:43:36 -04:00
Michael Davis	b5201915f8	Add basic shells for intro compounding functions	2024-03-28 16:43:28 -04:00
Michael Davis	bf0abd026a	checker: Check COMPLEXPREFIXES prefixing rules	2024-03-25 18:35:32 -04:00
Michael Davis	5d3e8c8a3f	checker: Strip a suffix, then prefix, then suffix	2024-03-24 15:44:48 -04:00
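
The COMPOUNDRULE check described in commit 325b2d8ff3 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not spellbook's actual code: the names `Modifier`, `Element`, `parse_rule` and `rule_matches` are hypothetical, flags are modeled as plain `char`s, and each part of the word is assumed to have been split off and looked up already, so only its flag set remains.

```rust
/// How often a flag in a compound rule may repeat (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Modifier {
    One,        // plain flag: exactly one word part must carry it
    ZeroOrOne,  // flag followed by `?`
    ZeroOrMore, // flag followed by `*`
}

/// One element of a compound rule such as `n*1t` (hypothetical type).
#[derive(Debug, Clone, Copy)]
struct Element {
    flag: char,
    modifier: Modifier,
}

/// Parse a rule written with single-character flags, e.g. "n*1t".
fn parse_rule(rule: &str) -> Vec<Element> {
    let mut elements: Vec<Element> = Vec::new();
    for ch in rule.chars() {
        match ch {
            // A wildcard modifies the flag that came right before it.
            '*' => {
                if let Some(last) = elements.last_mut() {
                    last.modifier = Modifier::ZeroOrMore;
                }
            }
            '?' => {
                if let Some(last) = elements.last_mut() {
                    last.modifier = Modifier::ZeroOrOne;
                }
            }
            flag => elements.push(Element { flag, modifier: Modifier::One }),
        }
    }
    elements
}

/// Check whether a sequence of per-part flag sets matches the rule.
/// `words[i]` holds the flags of the i-th part of the compound.
fn rule_matches(pattern: &[Element], words: &[&[char]]) -> bool {
    let Some((element, rest)) = pattern.split_first() else {
        // The rule is used up: it matches only if every part was consumed.
        return words.is_empty();
    };
    // Does the next part exist and carry the flag this element asks for?
    let head_has_flag = words
        .first()
        .map_or(false, |flags| flags.contains(&element.flag));
    match element.modifier {
        Modifier::One => head_has_flag && rule_matches(rest, &words[1..]),
        Modifier::ZeroOrOne => {
            rule_matches(rest, words)
                || (head_has_flag && rule_matches(rest, &words[1..]))
        }
        Modifier::ZeroOrMore => {
            // Either skip this element, or consume one part and stay on it.
            rule_matches(rest, words)
                || (head_has_flag && rule_matches(pattern, &words[1..]))
        }
    }
}
```

With the dictionary entries "1/n1" and "0th/pt" from the commit message, the flag sets for "10th" are `['n', '1']` followed by `['p', 't']`. Passing those to `rule_matches` together with `parse_rule("n*1t")` should accept the word, as should the three-part split for "110th", while "0th" on its own is rejected. The backtracking recursion keeps the sketch short; a production checker would likely avoid it.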


321 Commits
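
Commit 7e7678cb4f mentions handling the Turkic 'i'/'I' rules without ICU. A minimal sketch of that idea, assuming a two-variant `Casing` enum and free functions `lower` and `upper` (all hypothetical here and not taken from the spellbook source), might look like this:

```rust
/// Which case-mapping rules to apply (hypothetical definition).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Casing {
    /// Locale-independent Unicode case mapping from the standard library.
    Default,
    /// Turkic rules: dotless 'ı' pairs with 'I', dotted 'İ' pairs with 'i'.
    Turkic,
}

/// Lowercase `word` according to `casing`.
fn lower(word: &str, casing: Casing) -> String {
    match casing {
        Casing::Default => word.to_lowercase(),
        Casing::Turkic => {
            let mut out = String::with_capacity(word.len());
            for ch in word.chars() {
                match ch {
                    'I' => out.push('ı'), // U+0049 -> U+0131
                    'İ' => out.push('i'), // U+0130 -> U+0069
                    // Everything else uses the default per-character mapping.
                    _ => out.extend(ch.to_lowercase()),
                }
            }
            out
        }
    }
}

/// Uppercase `word` according to `casing`.
fn upper(word: &str, casing: Casing) -> String {
    match casing {
        Casing::Default => word.to_uppercase(),
        Casing::Turkic => {
            let mut out = String::with_capacity(word.len());
            for ch in word.chars() {
                match ch {
                    'i' => out.push('İ'), // U+0069 -> U+0130
                    'ı' => out.push('I'), // U+0131 -> U+0049
                    _ => out.extend(ch.to_uppercase()),
                }
            }
            out
        }
    }
}
```

Under `Casing::Turkic`, `lower("ISIK", ...)` gives "ısık" and `upper("istanbul", ...)` gives "İSTANBUL", whereas `Casing::Default` maps "ISIK" to "isik". How a dictionary's language tag selects the casing is left out of the sketch on purpose.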