mirror of git://anongit.freedesktop.org/libreoffice/dictionaries synced 2025-10-05 16:13:05 +02:00

38 Commits

Author SHA1 Message Date
Stanislav Horacek
b3a1c0be50 Czech Hunspell: add several word forms
Change-Id: I227d4fe75539691c75323ffcc822545081ced9ae
Reviewed-on: https://gerrit.libreoffice.org/c/dictionaries/+/119637
Tested-by: Stanislav Horáček <stanislav.horacek@gmail.com>
Reviewed-by: Stanislav Horáček <stanislav.horacek@gmail.com>
2021-07-28 19:02:59 +02:00
Stanislav Horacek
01bb1d2c47 Czech Hunspell: fix declension of some nouns ending with "-ec"
changes in all three recent commits made by Miroslav Pošta

Change-Id: I29ff4b6147cf8ae000c95fd1072e472bb89edec5
Reviewed-on: https://gerrit.libreoffice.org/c/dictionaries/+/118245
Tested-by: Stanislav Horáček <stanislav.horacek@gmail.com>
Reviewed-by: Stanislav Horáček <stanislav.horacek@gmail.com>
2021-07-01 22:48:15 +02:00
Stanislav Horacek
c608cc40ec Czech Hunspell: fix several verbs of pattern "tisknout"
Change-Id: I6d31f9658e2797ff678e36b887f1e2f03689f1c4
Reviewed-on: https://gerrit.libreoffice.org/c/dictionaries/+/118239
Tested-by: Stanislav Horáček <stanislav.horacek@gmail.com>
Reviewed-by: Stanislav Horáček <stanislav.horacek@gmail.com>
2021-07-01 19:25:44 +02:00
Stanislav Horacek
bcf9aa0b95 Czech Hunspell: fix several nouns and adjectives
Change-Id: Ifda579bcc6aec9786355837b42a74ea9b2edc1c4
Reviewed-on: https://gerrit.libreoffice.org/c/dictionaries/+/118235
Tested-by: Stanislav Horáček <stanislav.horacek@gmail.com>
Reviewed-by: Stanislav Horáček <stanislav.horacek@gmail.com>
2021-07-01 19:12:48 +02:00
Stanislav Horacek
8cd38fb513 Czech Hunspell: fix word "třídička"
Change-Id: I36236d34249e9a8c5e8bfa48b30a26cf595d309b
Reviewed-on: https://gerrit.libreoffice.org/c/dictionaries/+/117343
Tested-by: Stanislav Horáček <stanislav.horacek@gmail.com>
Reviewed-by: Stanislav Horáček <stanislav.horacek@gmail.com>
2021-06-16 19:41:00 +02:00
Stanislav Horacek
29eb293030 Czech Hunspell: remove confusing surname
Change-Id: I5ac78be61cacdf05dd804240355ea828de722f24
Reviewed-on: https://gerrit.libreoffice.org/c/dictionaries/+/116626
Tested-by: Stanislav Horáček <stanislav.horacek@gmail.com>
Reviewed-by: Stanislav Horáček <stanislav.horacek@gmail.com>
2021-06-02 22:31:54 +02:00
Stanislav Horacek
412f5dd223 Czech Hunspell: remove rule based on circumfix and make other updates
the rule does not produce correctly capitalized results
minor updates of word list
all changes suggested by Miroslav Pošta

Change-Id: I1ecfd7ea6daaf155228bd2405708fb097fa07997
Reviewed-on: https://gerrit.libreoffice.org/c/dictionaries/+/116281
Tested-by: Stanislav Horáček <stanislav.horacek@gmail.com>
Reviewed-by: Stanislav Horáček <stanislav.horacek@gmail.com>
2021-06-02 22:26:57 +02:00
Stanislav Horacek
5ae65ee9e7 Czech Hunspell: use KEY and update TRY in .aff file
for TRY character frequencies were employed

Change-Id: I46a1680c10dd49af2f15984bafe686ff69757071
Reviewed-on: https://gerrit.libreoffice.org/c/dictionaries/+/116280
Tested-by: Stanislav Horáček <stanislav.horacek@gmail.com>
Reviewed-by: Stanislav Horáček <stanislav.horacek@gmail.com>
2021-06-02 22:25:47 +02:00
Stanislav Horacek
27e7425c29 Czech Hunspell: remove duplicates
Change-Id: I90ca35378ad60768d6636193462fa307e9ac91a4
Reviewed-on: https://gerrit.libreoffice.org/c/dictionaries/+/116279
Tested-by: Stanislav Horáček <stanislav.horacek@gmail.com>
Reviewed-by: Stanislav Horáček <stanislav.horacek@gmail.com>
2021-06-02 22:25:03 +02:00
Stanislav Horacek
ab4b942707 more updates of Czech Hunspell
done also by Miroslav Pošta

Change-Id: Ie08fb39f7252d1473bd27c55f90c1d765d1abf8e
Reviewed-on: https://gerrit.libreoffice.org/c/dictionaries/+/116278
Tested-by: Stanislav Horáček <stanislav.horacek@gmail.com>
Reviewed-by: Stanislav Horáček <stanislav.horacek@gmail.com>
2021-06-02 22:24:21 +02:00
Stanislav Horacek
cd451382d7 update Czech Hunspell files
both word list and affixes improved significantly by Miroslav Pošta
details added to readme
this version corresponds with the version published
at translatoblog.cz

Change-Id: Ie9c137c3c652a29648177ed0d88a918acb2bbad8
Reviewed-on: https://gerrit.libreoffice.org/c/dictionaries/+/116277
Tested-by: Stanislav Horáček <stanislav.horacek@gmail.com>
Reviewed-by: Stanislav Horáček <stanislav.horacek@gmail.com>
2021-06-02 22:22:45 +02:00
Stanislav Horacek
19a842695c Czech Hunspell: sort words in .dic file
Change-Id: I1e3fef34b64bd358d16880795acab6d9b71ad7dc
Reviewed-on: https://gerrit.libreoffice.org/c/dictionaries/+/116276
Tested-by: Stanislav Horáček <stanislav.horacek@gmail.com>
Reviewed-by: Stanislav Horáček <stanislav.horacek@gmail.com>
2021-05-31 22:29:47 +02:00
Stanislav Horacek
48a66dd55f Czech Hunspell: use UTF8 encoding
Change-Id: I97e274fe4611cf1a2669e6c73950faed8f1f5bfd
Reviewed-on: https://gerrit.libreoffice.org/c/dictionaries/+/116275
Tested-by: Stanislav Horáček <stanislav.horacek@gmail.com>
Reviewed-by: Stanislav Horáček <stanislav.horacek@gmail.com>
2021-05-31 22:23:39 +02:00
Stanislav Horacek
e7a4c67cdc restore more advanced Czech Hunspell dictionary
which used to be in LibreOffice till commit
b312442050

Change-Id: Idd856ff4caedb1d8e7b4f0c42231014bd956da7f
Reviewed-on: https://gerrit.libreoffice.org/c/dictionaries/+/116274
Tested-by: Stanislav Horáček <stanislav.horacek@gmail.com>
Reviewed-by: Stanislav Horáček <stanislav.horacek@gmail.com>
2021-05-31 22:21:53 +02:00
Mattia Rizzolo
41c5fc0693 tdf#128341 use python3
Change-Id: Ic8deb039da037bd270c39da03f8697a9ab034ff0
Signed-off-by: Mattia Rizzolo <mattia@mapreri.org>
Reviewed-on: https://gerrit.libreoffice.org/81410
Reviewed-by: Michael Stahl <michael.stahl@cib.de>
Tested-by: Michael Stahl <michael.stahl@cib.de>
2019-10-24 14:28:09 +02:00
Mattia Rizzolo
c15c2a8889 flake8 fixes to the dictionary-to-thesaurus script
tested to work with both python3 and 2.7.

Change-Id: I52fe00e1f33e605010cd99885c1a41396440e49d
Signed-off-by: Mattia Rizzolo <mattia@mapreri.org>
Reviewed-on: https://gerrit.libreoffice.org/81411
Reviewed-by: Thorsten Behrens <Thorsten.Behrens@CIB.de>
Reviewed-by: Michael Stahl <michael.stahl@cib.de>
Tested-by: Michael Stahl <michael.stahl@cib.de>
2019-10-24 11:26:09 +02:00
Stanislav Horacek
e2f020e6e2 remove incorrect Czech word
Change-Id: Ice9bf2b15284034ad9a21ef9123a761a7982ac9e
Reviewed-on: https://gerrit.libreoffice.org/78856
Reviewed-by: Andras Timar <andras.timar@collabora.com>
Tested-by: Andras Timar <andras.timar@collabora.com>
2019-09-16 06:45:26 +02:00
Stanislav Horacek
86921a78c4 update readme for Czech dictionaries
Change-Id: Iccc08a8535761418d5b8e01b642d5a18a5b7bedf
Reviewed-on: https://gerrit.libreoffice.org/62809
Reviewed-by: Adolfo Jayme Barrientos <fitojb@ubuntu.com>
Tested-by: Adolfo Jayme Barrientos <fitojb@ubuntu.com>
2018-11-04 01:26:19 +01:00
Gabor Kelemen
200eca2141 Make sure all dictionary descriptions are translated
Some dictionary extensions use 'en' as the language code of their
description.
The toolchain needs 'en-US' here to extract the extension name
as a translatable string.
As a result these names do not appear
localized in Pootle and then in the Extension Manager.

With this fix new pot files are generated in workdir/pot/dictionaries/

Change-Id: I9d35f2a028be15a77da3b0679cfd154afbf1dc60
Reviewed-on: https://gerrit.libreoffice.org/37868
Reviewed-by: Andras Timar <andras.timar@collabora.com>
Tested-by: Andras Timar <andras.timar@collabora.com>
2017-05-21 10:52:45 +02:00
Stanislav Horacek
3470ef8c07 Czech thesaurus: regenerate from updated source
Change-Id: I947d616e3e0b4184fe0f1679578149ce00b6d584
Reviewed-on: https://gerrit.libreoffice.org/30993
Reviewed-by: Stanislav Horáček <stanislav.horacek@gmail.com>
Tested-by: Stanislav Horáček <stanislav.horacek@gmail.com>
2016-11-20 21:42:57 +00:00
Stanislav Horacek
23948938f7 Czech thesaurus: regenerate from updated source
changed location and authors of the source
sorting for source was applied -> thesaurus sorting also changed

Change-Id: I009688bb1aeaac20dbe0884f1b43b523a2a3eb7b
Reviewed-on: https://gerrit.libreoffice.org/30612
Reviewed-by: Jan Holesovsky <kendy@collabora.com>
Reviewed-by: Stanislav Horáček <stanislav.horacek@gmail.com>
Tested-by: Stanislav Horáček <stanislav.horacek@gmail.com>
2016-11-07 16:49:19 +00:00
Jan Holesovsky
e26e5fc152 Czech thesaurus: Blacklist more words + re-generate after the recent changes.
Change-Id: Ic56383e235be27d48358944c9b6588481052297a
2016-02-26 10:22:59 +01:00
Jan Holesovsky
e97c64ff3f dictionary-to-thesaurus.py: Put the better categorized words to the front.
Change-Id: Ib5c77f185abeeaef5045780766514a813794c8e8
2016-02-26 10:22:28 +01:00
Jan Holesovsky
c32de9bba6 dictionary-to-thesaurus.py: Only output the same class of word.
When the class of the word is unambiguous, limit the output only to that -
gives more precise & expected results.

[Like, it is interesting to see the other possibilities too, but I guess fewer
choices but more focused ones are preferred.]

Change-Id: I2876fbb4fa02c00fc7e65189812365f77b9a5ed6
2016-02-26 08:46:32 +01:00
Jan Holesovsky
8442e91f9d Czech thesaurus: Updates of some terms.
Change-Id: I13b60baf14fc90aba6f07ada2fc4423d06db76e8
2016-02-25 17:22:37 +01:00
Jan Holesovsky
f04d3f19b4 Czech thesaurus: Blacklist some unhelpful meanings.
Change-Id: I7d75626a37d4f241d8d407a11855325e39c5fa63
2016-02-25 14:41:59 +01:00
Jan Holesovsky
f83b25d29f dictionary-to-thesaurus.py: Move blacklist to a separate file.
Change-Id: Ie05e0c0ce8b4f9541a5a143ddf9ccf960940a3b7
2016-02-25 14:35:03 +01:00
Jan Holesovsky
bd5a09adea Related tdf#93514: Introduce a new Czech thesaurus.
This is a completely new, independent thesaurus, generated from an English <->
Czech dictionary.

The data of the dictionary are licensed under GNU Free Documentation License
1.1 or later, consequently this resulting thesaurus is GNU/FDL 1.1 or later
too.

Change-Id: I0136b413d5affd6e45a71bdd579ae196fe48dff5
2016-02-25 14:03:16 +01:00
Jan Holesovsky
68c702c5ed dictionary-to-thesaurus.py: Actually use the Czech names.
Change-Id: Ifb47efe7562ca9ccc2324d4ebd966506cae2bec6
2016-02-25 13:59:26 +01:00
Jan Holesovsky
46febeac1b dictionary-to-thesaurus.py: Various cleanups.
* word classification (when available)
* word blacklist
* ignore some non-translations (e.g. irregular verbs)
* ignore vulgarisms (when marked), they only add confusion
2016-02-25 11:18:30 +01:00
Jan Holesovsky
74e081219c dos2unix on the cs_CZ files.
2016-02-25 07:58:10 +01:00
Jan Holesovsky
f2e6dbb0a0 Czech: Script and dictionary to generate the Czech thesaurus.
slovnik_data_utf8.txt is the English <-> Czech dictionary from
http://slovnik.zcu.cz/download.php, licensed under
GNU Free Documentation License 1.1 or later.  The data are a snapshot
from 2016-02-24.

dictionary-to-thesaurus.py is a simple script that generates a thesaurus from
this dictionary.  The idea to generate our thesaurus from a dictionary comes
from Zdenek Zabokrtsky (UFAL, Faculty of Mathematics & Physics, Charles
University in Prague).

The results are far better than I would have imagined; I owe Zdenek some
beers :-)  Many thanks!

The source data are GNU/FDL 1.1 or later, the resulting thesaurus too.
The actual addition of the thesaurus to the build system will be done in a
separate commit later.
2016-02-25 07:20:52 +01:00
Christian Lohmaier
03a4a7b13f tdf#93514 remove Czech thesaurus
Change-Id: I40ebd1ca223fe7950ed3280c43a51a3dfbd0070e
2015-09-01 09:48:41 +02:00
Petr Gajdos
9b797809cb Fix errors reported by hunspell in Czech aff
Various incompatible stripping characters and conditions

Signed-off-by: Tomáš Chvátal <tchvatal@suse.cz>
2015-04-07 12:40:39 +02:00
David Tardon
b312442050 update Czech dictionary from liberix site
(http://www.liberix.cz/doplnky/slovniky/ooo/dict-cs-2.oxt)

This updates all files, not just thesaurus. Btw, the thesaurus is no
longer licensed under MIT.

Change-Id: I04e93c99aed8bc57b0b5724741842020271b69c2
2014-05-15 15:04:32 +02:00
David Tardon
b5b5c88c12 drop stray delzip files
Change-Id: I235d23248469b760da69983575dfcd73431757d4
2012-10-24 06:08:36 +02:00
Michael Stahl
c918f1d424 gbuild: let ExtensionTarget expect manifest below META-INF
... adapt dictionaries to that.
2012-10-23 20:05:55 +02:00
Norbert Thiebaud
a4473e06b5 move dictionaries structure one directory up
Change-Id: I70388bf6b95d8692cc6f25fc5a9c7baf3a675710
2012-10-16 11:09:27 -05:00