mirror of https://github.com/MarginaliaSearch/MarginaliaSearch.git synced 2025-10-05 21:22:39 +02:00

Commit Graph

  • acb9ec7b15 (refac) Consistently use 'languageIsoCode' for the language field Viktor Lofgren 2025-09-03 12:38:42 +02:00
  • 47079e05db (index) Store language information in the index journal Viktor Lofgren 2025-09-03 12:33:24 +02:00
  • c93056e77f (refac) Clean up index code Viktor Lofgren 2025-09-03 09:51:57 +02:00
  • 6f7530e807 (refac) Clean up index code Viktor Lofgren 2025-09-02 18:52:31 +02:00
  • 87ce4a1b52 (refac) Clean up index code Viktor Lofgren 2025-09-02 17:52:38 +02:00
  • 52194cbe7a (refac) Clean up index code Viktor Lofgren 2025-09-02 17:44:42 +02:00
  • fd1ac03c78 (refac) Clean up index code Viktor Lofgren 2025-09-02 17:30:19 +02:00
  • 5e5b86efb4 (refac) Clean up index code Viktor Lofgren 2025-09-02 17:24:30 +02:00
  • f332ec6191 (refac) Clean up index code Viktor Lofgren 2025-09-02 13:13:10 +02:00
  • c25c1af437 (refac) Clean up index code Viktor Lofgren 2025-09-02 13:04:05 +02:00
  • eb0c911b45 (refac) Clean up index code Viktor Lofgren 2025-09-02 12:50:07 +02:00
  • 1979870ce4 (refac) Merge index-forward, index-reverse, index/query into index Viktor Lofgren 2025-09-02 12:30:42 +02:00
  • 0ba2ea38e1 (index) Move reverse index into a distinct package Viktor Lofgren 2025-09-02 11:58:57 +02:00
  • d6cfbceeea (index) Use a configurable hasher in the index Viktor Lofgren 2025-09-01 13:44:28 +02:00
  • e369d200cc (refac) Simplify index data model by merging SearchParameters, SearchTerms and ResultRankingContext into a new object called SearchContext Viktor Lofgren 2025-09-01 13:11:31 +02:00
  • 946d64c8da (index) Make hash algorithm selection configurable, writer-side Viktor Lofgren 2025-09-01 12:03:01 +02:00
  • 42f043a60f (API) Add language parameter to the APIs Viktor Lofgren 2025-09-01 09:33:39 +02:00
  • b46f2e1407 (sideload) Remove upper limit on XML entities Viktor Lofgren 2025-08-31 14:14:09 +02:00
  • 18aa1b9764 (zim) Fix parsing of modern wikipedia zim files Viktor Lofgren 2025-08-31 12:52:44 +02:00
  • 2f3950e0d5 (language) Roll KeywordExtractor into LanguageDefinition Viktor Lofgren 2025-08-29 10:30:31 +02:00
  • 61d803869e (language) Add support for languages with no POS-tagging Viktor Lofgren 2025-08-29 10:18:07 +02:00
  • df6434d177 (language) Add support for languages with no POS-tagging Viktor Lofgren 2025-08-29 09:48:52 +02:00
  • 59519ed7c4 (language) Adjust languages.xml Viktor Lofgren 2025-08-27 12:56:10 +02:00
  • 874fc2d250 (language) Remove debug logging junk Viktor Lofgren 2025-08-27 12:44:48 +02:00
  • 69e8ec0eef (language) Fix subject keywords matcher with better rules and correct logic Viktor Lofgren 2025-08-27 12:39:37 +02:00
  • a7eb5f54e6 (language) Clean up PosPattern, add tests Viktor Lofgren 2025-08-27 10:37:46 +02:00
  • b29ba3e228 (language) Integrate new configurable POS patterns with keyword matchers Viktor Lofgren 2025-08-26 11:31:49 +02:00
  • 5fa5029c60 (language) Clean up UI Viktor Lofgren 2025-08-25 09:54:27 +02:00
  • 4257f60f00 (keywords) Fix logic error causing misidentification of some keywords Viktor Lofgren 2025-08-25 09:50:28 +02:00
  • ce221d3a0e (language) Integrate old keyword extraction logic with new test tool Viktor Lofgren 2025-08-25 09:50:05 +02:00
  • f0741142a3 (refac) Move keyword extraction into language processing Viktor Lofgren 2025-08-25 09:05:31 +02:00
  • 0899e4d895 (language) First version of the language processing debug tool Viktor Lofgren 2025-08-24 16:36:16 +02:00
  • bbf7c5a1cb (language) Fix RDRPosTagger back to working order and integrate with SentenceExtractor Viktor Lofgren 2025-08-24 16:35:58 +02:00
  • 686a40e69b (language) Update modelling Viktor Lofgren 2025-08-24 10:55:42 +02:00
  • 8af254f44f (language) Parse PosPattern tags Viktor Lofgren 2025-08-22 09:57:34 +02:00
  • 2c21bd9287 (language) Add logging for unknown POS tags in PosPattern Viktor Lofgren 2025-08-21 13:38:31 +02:00
  • f9645e2f00 (language) Enhance PosPattern to support wildcard variants in pattern matching Viktor Lofgren 2025-08-21 13:33:32 +02:00
  • 81e311b558 (language) POS-patterns WIP Viktor Lofgren 2025-08-21 12:45:14 +02:00
  • 507c09146a (language) Add support for downloadable resources, parsing POS tag configuration tags Viktor Lofgren 2025-08-20 12:09:27 +02:00
  • f682425594 (language) Basic test for LanguageConfiguration Viktor Lofgren 2025-08-19 15:55:54 +02:00
  • de67006c4f (language) Initial integration of new language configuration utility Viktor Lofgren 2025-08-19 15:41:42 +02:00
  • eea32bb7b4 (language) Very basic language.xml loading off classpath Viktor Lofgren 2025-08-19 09:55:50 +02:00
  • e976940a4e (config) Move slf4j config files to common:config Viktor Lofgren 2025-08-19 09:55:32 +02:00
  • b564b33028 (language) Initial embryo for language configuration Viktor Lofgren 2025-08-19 09:29:09 +02:00
  • 1cca16a58e (language) Simplify language filters Viktor Lofgren 2025-08-19 09:03:57 +02:00
  • 70b4ed6d81 (ldb) Pipe language information into LDB database Viktor Lofgren 2025-08-18 12:21:55 +02:00
  • 45dc6412c1 (converter) Add language column to slop tables Viktor Lofgren 2025-08-17 10:22:14 +02:00
  • b3b95edcb5 (converter) Bypass some of the grammar processing in the keyword extraction depending on language selection Viktor Lofgren 2025-08-17 09:55:29 +02:00
  • 338d300e1a (converter) Clean up spans-handling Viktor Lofgren 2025-08-17 09:41:53 +02:00
  • fa685bf1f4 (converter) Add Language field to ProcessedDocumentDetails Viktor Lofgren 2025-08-17 09:23:32 +02:00
  • d79a3e2b2a (converter) Tag documents by language in the index as a keyword Viktor Lofgren 2025-08-16 16:38:11 +02:00
  • 854382b2be (language-filter) Experimentally permit Swedish results to pass through the language filter Viktor Lofgren 2025-08-16 16:26:30 +02:00
  • 8710adbc2a (build) Reduce log noise during tests Viktor Lofgren 2025-08-29 10:55:32 +02:00
  • acdf7b4785 (build) Add test-logger plugin to get better feedback during test execution Viktor Lofgren 2025-08-29 10:41:35 +02:00
  • b5d27c1406 (search) Improve unicode support in displayTitle and displaySummary Viktor Lofgren 2025-08-23 13:59:41 +02:00
  • 55eb7dc116 (search) Improve unicode support in displayTitle and displaySummary Viktor Lofgren 2025-08-23 13:57:51 +02:00
  • f0e8bc8baf (search) Improve unicode support in displayTitle and displaySummary Viktor Lofgren 2025-08-23 13:56:19 +02:00
  • 91a6ad2337 (search) Improve unicode support in displayTitle and displaySummary Viktor Lofgren 2025-08-23 13:52:19 +02:00
  • 9a182b9ddb (search) Use ADVERTISEMENT flag instead of TRACKING_ADVERTISEMENT when choosing to flag a result as having ads Viktor Lofgren 2025-08-21 13:08:25 +02:00
  • fefbcf15ce (site) Make discord link point to chat.marginalia.nu and let nginx deal with figuring out which discord link to redirect to Viktor Lofgren 2025-08-21 12:46:37 +02:00
  • 9a789bf62d (array) Fix broken test Viktor Lofgren 2025-08-18 09:10:58 +02:00
  • 0525303b68 (index) Add upper limit to span lengths Viktor Lofgren 2025-08-17 08:44:57 +02:00
  • 6953d65de5 (native) Register fixed fd:s for a nice io_uring speed boost Viktor Lofgren 2025-08-16 13:48:11 +02:00
  • a7a18ced2e (native) Register fixed fd:s for a nice io_uring speed boost Viktor Lofgren 2025-08-16 13:46:39 +02:00
  • 7c94c941b2 (build) Correct rare scenario where root blocks could be generated with a negative size Viktor Lofgren 2025-08-16 11:27:15 +02:00
  • ea99b62356 (build) Fix missing junit engine version Viktor Lofgren 2025-08-16 11:01:32 +02:00
  • 3dc21d34d8 (skiplist) Fix stability of getData fuzz test Viktor Lofgren 2025-08-15 09:17:48 +02:00
  • 51912e0176 (index) Tweak default values for IndexQueryExecution Viktor Lofgren 2025-08-15 08:07:00 +02:00
  • de1b4d5372 (index) Make metrics make more sense by normalizing them by query budget Viktor Lofgren 2025-08-15 03:16:22 +02:00
  • 50ac926060 (index) Make metrics make more sense by normalizing them by query budget Viktor Lofgren 2025-08-15 03:11:57 +02:00
  • d711ee75b5 (index) Add performance metrics Viktor Lofgren 2025-08-15 00:48:52 +02:00
  • 291ff0c4de (deps) Upgrade crawler commons to fix robots.txt-parser bug Viktor Lofgren 2025-08-15 00:11:44 +02:00
  • 2fd2710355 Merge pull request #218 from MarginaliaSearch/o_direct_index Viktor 2025-08-14 23:57:09 +02:00
  • e3b957063d (native) Add fallbacks and configuration options for building on systems lacking liburing Viktor Lofgren 2025-08-14 23:36:13 +02:00
  • aee262e5f6 (index) Safeguard against arena-leaks during exceptions Viktor Lofgren 2025-08-14 19:28:31 +02:00
  • 4a98a3c711 (skiplist) Move to a separate directory instead of in the btree module Viktor Lofgren 2025-08-14 01:09:46 +02:00
  • 68f52ca350 (test) Fix tests that works on my machine (TM) Viktor Lofgren 2025-08-14 00:59:58 +02:00
  • 2a2d951c2f (index) Fix unhinged default values for index.preparationThreads Viktor Lofgren 2025-08-14 00:54:35 +02:00
  • 379a1be074 (index) Add better timeout handling in UringQueue, fix slow memory leak on timeout exception Viktor Lofgren 2025-08-14 00:48:44 +02:00
  • 827aadafcd (uring) Reintroduce auto-slicing of excessively long read batches Viktor Lofgren 2025-08-13 14:33:35 +02:00
  • aa7679d6ce (pool) Fix bug in exceptionally rare edge case leading to incorrect reads Viktor Lofgren 2025-08-13 14:28:50 +02:00
  • 6fe6de766d (pool) Fix SegmentMemoryPage storage Viktor Lofgren 2025-08-13 13:17:14 +02:00
  • 4245ac4c07 (doc) Update docs to reflect that we now need io_uring Viktor Lofgren 2025-08-12 15:12:54 +02:00
  • 1c49a0f5ad (index) Add system properties for toggling O_DIRECT mode for positions and spans Viktor Lofgren 2025-08-12 15:11:13 +02:00
  • 9a6e5f646d (docker) Add security_opt: seccomp:unconfined to docker-compose files Viktor Lofgren 2025-08-12 15:10:26 +02:00
  • fa92994a31 (uring) Fall back to simple I/O planning behavior when buffered mode is selected in UringFileReader Viktor Lofgren 2025-08-11 23:44:38 +02:00
  • bc49406881 (build) Compatibility hack debian server Viktor Lofgren 2025-08-11 23:26:53 +02:00
  • 90325be447 (minor) Fix comments Viktor Lofgren 2025-08-11 23:19:53 +02:00
  • dc89587af3 (index) Improve disk locality of the positions data Viktor Lofgren 2025-08-11 21:17:12 +02:00
  • 7b552afd6b (index) Improve disk locality of the positions data Viktor Lofgren 2025-08-11 20:59:11 +02:00
  • 73557edc67 (index) Improve disk locality of the positions data Viktor Lofgren 2025-08-11 20:57:32 +02:00
  • 83919e448a (index) Use O_DIRECT buffered reads for spans Viktor Lofgren 2025-08-11 17:56:42 +02:00
  • 6f5b75b84d (cleanup) Remove accidentally committed print stmt Viktor Lofgren 2025-08-11 17:44:09 +02:00
  • db315e2813 (index) Use O_DIRECT position reads Viktor Lofgren 2025-08-11 16:20:47 +02:00
  • e9977e08b7 (index) Block-align positions data Viktor Lofgren 2025-08-11 14:14:07 +02:00
  • 1df3757e5f (native) Clean up io_uring code and check in execution queue, currently unused but nifty Viktor Lofgren 2025-08-11 13:54:05 +02:00
  • ca283f9684 (native) Clean up native helpers and break them into their own library Viktor Lofgren 2025-08-10 20:55:27 +02:00
  • 85360e61b2 (index) Grow span writer buffer size Viktor Lofgren 2025-08-10 17:20:38 +02:00
  • e2ccff21bc (index) Wait until ranking is finished in query execution Viktor Lofgren 2025-08-09 23:37:16 +02:00
  • c5b5b0c699 (index) Permit fast termination of rejection filter execution Viktor Lofgren 2025-08-09 23:36:59 +02:00