mirror of https://github.com/MarginaliaSearch/MarginaliaSearch.git synced 2025-10-05 21:22:39 +02:00

3505 Commits

Author SHA1 Message Date
Viktor Lofgren
acdf7b4785 (build) Add test-logger plugin to get better feedback during test execution 2025-08-29 10:41:35 +02:00
Viktor Lofgren
b5d27c1406 (search) Improve unicode support in displayTitle and displaySummary 2025-08-23 13:59:41 +02:00
Viktor Lofgren
55eb7dc116 (search) Improve unicode support in displayTitle and displaySummary 2025-08-23 13:57:51 +02:00
Viktor Lofgren
f0e8bc8baf (search) Improve unicode support in displayTitle and displaySummary 2025-08-23 13:56:19 +02:00
Viktor Lofgren
91a6ad2337 (search) Improve unicode support in displayTitle and displaySummary 2025-08-23 13:54:48 +02:00
Viktor Lofgren
9a182b9ddb (search) Use ADVERTISEMENT flag instead of TRACKING_ADVERTISEMENT when choosing to flag a result as having ads 2025-08-21 13:08:25 +02:00
Viktor Lofgren
fefbcf15ce (site) Make discord link point to chat.marginalia.nu and let nginx deal with figuring out which discord link to redirect to 2025-08-21 12:46:37 +02:00
Viktor Lofgren
9a789bf62d (array) Fix broken test 2025-08-18 09:10:58 +02:00
Viktor Lofgren
0525303b68 (index) Add upper limit to span lengths
Apparently outliers exist that are larger than SHORT_MAX.  This is probably not interesting, so we'll truncate at 8192 for now.

Adding logging statement to get more information about which spans these are so we can address the root cause down the line.
2025-08-17 08:44:57 +02:00
Viktor Lofgren
6953d65de5 (native) Register fixed fd:s for a nice io_uring speed boost 2025-08-16 13:48:11 +02:00
Viktor Lofgren
a7a18ced2e (native) Register fixed fd:s for a nice io_uring speed boost 2025-08-16 13:46:39 +02:00
Viktor Lofgren
7c94c941b2 (build) Correct rare scenario where root blocks could be generated with a negative size 2025-08-16 11:27:36 +02:00
Viktor Lofgren
ea99b62356 (build) Fix missing junit engine version 2025-08-16 11:01:32 +02:00
Viktor Lofgren
3dc21d34d8 (skiplist) Fix stability of getData fuzz test 2025-08-15 09:17:48 +02:00
Viktor Lofgren
51912e0176 (index) Tweak default values for IndexQueryExecution 2025-08-15 08:07:00 +02:00
Viktor Lofgren
de1b4d5372 (index) Make metrics make more sense by normalizing them by query budget 2025-08-15 03:16:22 +02:00
Viktor Lofgren
50ac926060 (index) Make metrics make more sense by normalizing them by query budget 2025-08-15 03:11:57 +02:00
Viktor Lofgren
d711ee75b5 (index) Add performance metrics 2025-08-15 00:48:52 +02:00
Viktor Lofgren
291ff0c4de (deps) Upgrade crawler commons to fix robots.txt-parser bug 2025-08-15 00:13:15 +02:00
Viktor
2fd2710355 Merge pull request #218 from MarginaliaSearch/o_direct_index
Replace document index btrees with a block based skiplist, get rid of mmap use O_DIRECT pread instead, use io_uring for positions reads
2025-08-14 23:57:09 +02:00
Viktor Lofgren
e3b957063d (native) Add fallbacks and configuration options for building on systems lacking liburing 2025-08-14 23:36:13 +02:00
Viktor Lofgren
aee262e5f6 (index) Safeguard against arena-leaks during exceptions
The GC would catch these eventually, but it's nice to clean up ourselves in a timely manner.
2025-08-14 19:28:31 +02:00
Viktor Lofgren
4a98a3c711 (skiplist) Move to a separate directory instead of in the btree module 2025-08-14 01:09:46 +02:00
Viktor Lofgren
68f52ca350 (test) Fix tests that works on my machine (TM) 2025-08-14 00:59:58 +02:00
Viktor Lofgren
2a2d951c2f (index) Fix unhinged default values for index.preparationThreads 2025-08-14 00:54:35 +02:00
Viktor Lofgren
379a1be074 (index) Add better timeout handling in UringQueue, fix slow memory leak on timeout exception 2025-08-14 00:52:50 +02:00
Viktor Lofgren
827aadafcd (uring) Reintroduce auto-slicing of excessively long read batches 2025-08-13 14:33:35 +02:00
Viktor Lofgren
aa7679d6ce (pool) Fix bug in exceptionally rare edge case leading to incorrect reads 2025-08-13 14:28:50 +02:00
Viktor Lofgren
6fe6de766d (pool) Fix SegmentMemoryPage storage 2025-08-13 13:17:14 +02:00
Viktor Lofgren
4245ac4c07 (doc) Update docs to reflect that we now need io_uring 2025-08-12 15:12:54 +02:00
Viktor Lofgren
1c49a0f5ad (index) Add system properties for toggling O_DIRECT mode for positions and spans 2025-08-12 15:11:13 +02:00
Viktor Lofgren
9a6e5f646d (docker) Add security_opt: seccomp:unconfined to docker-compose files
This is needed to access io_uring via docker.
2025-08-12 15:10:26 +02:00
Viktor Lofgren
fa92994a31 (uring) Fall back to simple I/O planning behavior when buffered mode is selected in UringFileReader 2025-08-11 23:44:38 +02:00
Viktor Lofgren
bc49406881 (build) Compatibility hack debian server 2025-08-11 23:26:53 +02:00
Viktor Lofgren
90325be447 (minor) Fix comments 2025-08-11 23:19:53 +02:00
Viktor Lofgren
dc89587af3 (index) Improve disk locality of the positions data 2025-08-11 21:17:12 +02:00
Viktor Lofgren
7b552afd6b (index) Improve disk locality of the positions data 2025-08-11 20:59:11 +02:00
Viktor Lofgren
73557edc67 (index) Improve disk locality of the positions data 2025-08-11 20:57:32 +02:00
Viktor Lofgren
83919e448a (index) Use O_DIRECT buffered reads for spans 2025-08-11 18:04:25 +02:00
Viktor Lofgren
6f5b75b84d (cleanup) Remove accidentally committed print stmt 2025-08-11 18:04:25 +02:00
Viktor Lofgren
db315e2813 (index) Use O_DIRECT position reads 2025-08-11 18:04:25 +02:00
Viktor Lofgren
e9977e08b7 (index) Block-align positions data
This will make reads more efficient, and possibly pave way for O_DIRECT reads of this data
2025-08-11 14:36:45 +02:00
Viktor Lofgren
1df3757e5f (native) Clean up io_uring code and check in execution queue, currently unused but nifty 2025-08-11 13:54:05 +02:00
Viktor Lofgren
ca283f9684 (native) Clean up native helpers and break them into their own library 2025-08-10 20:55:34 +02:00
Viktor Lofgren
85360e61b2 (index) Grow span writer buffer size
Apparently outlier spans can grow considerably large.
2025-08-10 17:20:38 +02:00
Viktor Lofgren
e2ccff21bc (index) Wait until ranking is finished in query execution 2025-08-09 23:40:30 +02:00
Viktor Lofgren
c5b5b0c699 (index) Permit fast termination of rejection filter execution 2025-08-09 23:36:59 +02:00
Viktor Lofgren
9a65946e22 (uring) Reduce queue size to 2048 to avoid ENOMEM on systems with default ulimits 2025-08-09 20:41:24 +02:00
Viktor Lofgren
1d2ab21e27 (index) Aggregate termdata reads into a single io_uring operation instead of one for each term 2025-08-09 17:43:18 +02:00
Viktor Lofgren
0610cc19ad (index) Fix double close errors 2025-08-09 17:05:38 +02:00