Viktor Lofgren
708caa8791
(index) Update verbatim match handling to account for matches that span multiple tags
2025-09-24 15:43:00 +02:00
Viktor Lofgren
32394f42b9
(index) Update verbatim match handling to account for matches that span multiple tags
2025-09-24 15:41:53 +02:00
Viktor Lofgren
b8e3445ce0
(index) Update verbatim match handling to account for matches that span multiple tags
2025-09-24 15:22:50 +02:00
Viktor Lofgren
17a78a7b7e
(query) Remove obsolete code
2025-09-24 15:03:08 +02:00
Viktor Lofgren
5a75dd8093
(index) Update james cook test
2025-09-24 15:02:13 +02:00
Viktor Lofgren
a9713347a0
(query) Submit all segmentations as optional matching groups
2025-09-24 15:01:59 +02:00
Viktor Lofgren
4694d36ed2
(index) Tweak ranking bonuses for partial matches
2025-09-24 15:01:29 +02:00
Viktor Lofgren
70bdd1f51e
(index) Add test case for 'captain james cook'
2025-09-24 13:27:07 +02:00
Viktor Lofgren
187b4828e6
(index) Sort doc ids passed to re-ranking
2025-09-24 13:26:53 +02:00
Viktor Lofgren
93fc14dc94
(index) Add sanity assertions to SkipListReader
2025-09-24 13:26:31 +02:00
Viktor Lofgren
fbfea8539b
(refac) Merge IndexResultScoreCalculator into IndexResultRankingService
2025-09-24 11:51:16 +02:00
Viktor Lofgren
0929d77247
(chore) Remove vestigial Serializable annotation from a few core models
Java serialization was briefly considered a long while ago, but it's a silly and ancient API and not something we want to use.
2025-09-24 10:42:10 +02:00
Viktor Lofgren
db8f8c1f55
(index) Fix bitmask handling in HtmlFeature
2025-09-23 10:15:01 +02:00
Viktor Lofgren
dcb2723386
(index) Fix broken test case in the "slow" collection
2025-09-23 10:13:51 +02:00
Viktor Lofgren
00c1f495f6
(index) Fix incorrect document flag bitmask handling
2025-09-23 10:12:14 +02:00
Viktor Lofgren
73a923983a
(language) Fix outdated test assertion
2025-09-22 10:30:06 +02:00
Viktor Lofgren
e9ed0c5669
(language) Fix keyword pattern matching unicode handling
2025-09-22 10:27:46 +02:00
Viktor Lofgren
5b2bec6144
(search) Fix broken tests
2025-09-22 10:17:38 +02:00
Viktor Lofgren
f26bb8e2b1
(loader) Clean up the code
Loader code is still kinda needlessly convoluted for what it does, but this commit makes an effort toward making it a bit easier to follow along.
2025-09-22 10:14:54 +02:00
Viktor Lofgren
4455495dc6
(system) Fix file loggers in the json config
2025-09-21 19:02:18 +02:00
Viktor Lofgren
b84d17aa51
(system) Fix file loggers in the prod config
2025-09-21 14:02:41 +02:00
Viktor Lofgren
9d008390ae
(language) Fix unicode issues in keyword extraction
2025-09-21 13:54:01 +02:00
Viktor Lofgren
a40c2a8146
(index) Partition index journal by language to speed up index construction
2025-09-21 13:53:43 +02:00
Viktor Lofgren
a3416bf48e
(query) Fix timeout settings to use ms and not s
2025-09-19 22:45:22 +02:00
Viktor Lofgren
ee2461d9fc
(query) Fix timeout settings to use ms and not us
2025-09-19 22:19:31 +02:00
Viktor Lofgren
54c91a84e3
(query) Make the query client give up if the request exceeds its configured timeout by 50%
2025-09-19 18:59:35 +02:00
Viktor Lofgren
a6371fc54c
(query) Add a timeout to the query API
2025-09-19 18:52:44 +02:00
Viktor Lofgren
8faa9a572d
(live-capture) Fix random puppeteer API churn
2025-09-19 11:15:38 +02:00
Viktor Lofgren
fdce940263
(search) Fix redundant spam in <title>
2025-09-19 10:20:14 +02:00
Viktor Lofgren
af8a13a7fb
(index) Correct file name compatibility with previous versions
2025-09-19 09:40:43 +02:00
Viktor
9e332de6b4
Merge pull request #223 from MarginaliaSearch/multilingual
Add support for indexing multiple languages
2025-09-19 09:12:54 +02:00
Viktor Lofgren
d457bb5d44
(index) Fix index actor initialization
2025-09-18 16:06:40 +02:00
Viktor Lofgren
c661ebb619
(refac) Move language-processing into functions
It's long surpassed the single-responsibility library it once was, and is as such out of place in its original location, and fits better among the function-type modules.
2025-09-18 10:30:40 +02:00
Viktor Lofgren
53e744398a
Update gitignore to exclude eclipse-generated stuff
2025-09-17 17:14:02 +02:00
Viktor Lofgren
1d71baf3e5
(search) Display search query first in title
2025-09-16 13:16:18 +02:00
Viktor Lofgren
bb5fc0f348
(language) Fix sketchy unicode handling in UnicodeNormalization
2025-09-16 12:15:09 +02:00
Viktor Lofgren
c8f112d040
(lang+search) Clean up LanguageConfiguration initialization and LangCommand
2025-09-16 11:49:46 +02:00
Viktor Lofgren
ae31bc8498
(lang+search) Clean up LanguageConfiguration initialization and LangCommand
2025-09-16 11:47:15 +02:00
Viktor Lofgren
da5046c3bf
(lang) Remove language redirects for languages that are not configured
Passing an invalid &lang= to the query service leads to a harmless but ugly stacktrace. This change prevents such a request from being formed.
2025-09-16 11:05:31 +02:00
Viktor Lofgren
f67257baf2
(lang) Remove lang:... keyword during LangCommand
2025-09-16 11:01:11 +02:00
Viktor Lofgren
924fb05661
(config) Fix language config pickup
2025-09-16 10:43:27 +02:00
Viktor Lofgren
c231a82062
(search) Lang redirection works better if it's hooked in
2025-09-16 10:40:24 +02:00
Viktor Lofgren
2c1082d7f0
(search) Add notice about the current language selection to the UI
2025-09-16 10:32:13 +02:00
Viktor Lofgren
06947bd026
(search) Add redirect based on lang:-keyword in search query
The change also suppresses the term in the query parser so that it isn't delegated to the index as a keyword.
2025-09-16 10:00:20 +02:00
Viktor Lofgren
519aebd7c6
(process) Make the use of zookeeper based domain coordination optional
The zookeeper based domain coordinator has been a bit unstable and led to rare deadlocks. As running multiple instances of the crawler is an unusual configuration, the default behavior that makes the most sense is to disable cross-process coordination and use only local coordination.
2025-09-15 19:13:57 +02:00
Viktor Lofgren
42cc27586e
(process) Reduce connection pool stats log spam
2025-09-15 18:51:43 +02:00
Viktor Lofgren
360881fafd
(setup) Pull POS tags from control svc on first boot
This commit also removes the old retrieval from setup.sh
2025-09-15 10:05:17 +02:00
Viktor Lofgren
4c6fdf6ebe
(language) Make language configuration configurable
2025-09-15 09:54:57 +02:00
Viktor Lofgren
554de21f68
(converter) Disable language keyword
2025-09-15 09:49:04 +02:00
Viktor Lofgren
00194acbfe
(search) Add language chooser to UI, clean up search service code
2025-09-13 12:40:42 +02:00