Adding suffix rule in Hunspell for Greek
Thread poster: Spiros Doikas

Spiros Doikas  Identity Verified
Local time: 05:10
Member (2002)
English to Greek
Dec 2, 2014

Hunspell appears to be missing some rules for Greek, so it flags words ending in -ευτείτε as errors (e.g. ερωτευτείτε, εκμεταλευτείτε).

I tried adding rules like these to the el_GR.aff file:

SFX R εύομαι ευτείτε
or
SFX Z ομαι ευτείτε εύομαι

but had no luck. I also tried to contact the Hunspell people (http://elspell.math.upatras.gr/?section=oofficespell&subsection=feedback), but the e-mail bounces.



esperantisto  Identity Verified
Local time: 06:10
Member (2006)
English to Russian
More info Dec 2, 2014

1. Why are you asking in the Trados forum?
2. Greek is not among the world's most widely spoken languages. Please provide more details on the words in question, in particular their dictionary forms.
3. The first rule looks wrong.
4. As for the second, did you assign the respective flag to any word? Did you unmunch the dictionary? Does the required word form appear in the unmunched list?
5. Share the files rather than copying and pasting text here; something important may be lost.



Spiros Doikas  Identity Verified
Local time: 05:10
Member (2002)
English to Greek
TOPIC STARTER
Because I use it through Trados Dec 2, 2014

1. Why are you asking in the Trados forum?
Because I use it through Trados.

2. Greek is not among the world's most widely spoken languages. Please provide more details on the words in question, in particular their dictionary forms.

Forms listed in the full dictionary:

ερωτεύομαι
ερωτευόμασταν
ερωτευόμαστε
ερωτευόμουν
ερωτεύονται
ερωτεύονταν
ερωτευόντουσαν
ερωτευόσασταν
ερωτευόσαστε
ερωτευόσουν
ερωτευόταν
ερωτευτεί
ερωτεύτηκαν
ερωτεύτηκε


4. As for the second, did you assign the respective flag to any word? Did you unmunch the dictionary? Does the required word form appear in the unmunched list?
The form with that suffix, ερωτευτείτε, does not appear in the dictionary's full word list. Other forms of the word do appear, as listed above. Words with the same ending do appear in the list, written out in full, e.g. εκμεταλλευτείτε. No flags are used in the word list.

5. Share the files rather than copying and pasting text here; something important may be lost.
http://paxos.tk/huns.rar



esperantisto  Identity Verified
Local time: 06:10
Member (2006)
English to Russian
Now, it’s clear Dec 2, 2014

OK, the problem is that the dictionary you provided does not use affixes for declensions. To take advantage of them, you should:

1. Determine the initial form. Unfortunately, in your example it’s not clear. Let it be:
Code:
ερωτεύομαι


2. Determine the part that remains unchanged in any word form. I guess, it’s:
Code:
ερωτε


3. Determine the part to drop. It’s:
Code:
ύομαι


4. Determine the part to add. It’s, as you say:
Code:
υτείτε


5. Now, we’re ready to make a rule. It’s:
Code:
SFX R Y 1
SFX R ύομαι υτείτε ύομαι


In this rule, the first line, the header, means the following:
SFX = suffix (i.e., a part at the end of the word);
R = its identifier (the flag);
Y = this suffix may be combined with prefixes (I don’t know if that is appropriate here, but it’s generally safe to put Y unless you’re sure otherwise);
1 = the number of rule lines that follow the header.
The second line:
SFX R = well, it’s clear, I guess;
the first ύομαι = the part to drop;
υτείτε = the part to add;
the second ύομαι = the condition: this line (a rule may have several such lines) applies only to words ending in ύομαι.
6. Now, add the above rule to the aff file.
7. In the dic file, change
Code:
ερωτεύομαι


to
Code:
ερωτεύομαι/R
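
The field layout described above can be checked mechanically. The following is only a rough sketch in Python, not Hunspell's own parser: the names SuffixRule and parse_sfx_block are made up for illustration, and treating a missing condition as "." is an assumption of this sketch.

```python
from dataclasses import dataclass


@dataclass
class SuffixRule:
    flag: str       # rule identifier, e.g. "R"
    strip: str      # part to drop from the end of the word
    add: str        # part to add after dropping
    condition: str  # ending the word must have for the line to apply


def parse_sfx_block(lines):
    """Parse one SFX header plus its rule lines; raise ValueError if malformed."""
    header = lines[0].split()
    # Header layout: SFX <flag> <Y|N> <number of rule lines>
    if (len(header) != 4 or header[0] != "SFX"
            or header[2] not in ("Y", "N") or not header[3].isdigit()):
        raise ValueError(f"not an SFX header: {lines[0]!r}")
    flag, cross_product, count = header[1], header[2] == "Y", int(header[3])

    body = [line.split() for line in lines[1:]]
    if len(body) != count:
        raise ValueError(f"header announces {count} line(s), found {len(body)}")

    rules = []
    for fields in body:
        # Rule layout: SFX <flag> <strip> <add> [<condition>]
        if len(fields) < 4 or fields[0] != "SFX" or fields[1] != flag:
            raise ValueError(f"bad rule line: {' '.join(fields)!r}")
        # Assumption of this sketch: a missing condition means "any word".
        condition = fields[4] if len(fields) > 4 else "."
        rules.append(SuffixRule(flag, fields[2], fields[3], condition))
    return cross_product, rules


cross_product, rules = parse_sfx_block(["SFX R Y 1", "SFX R ύομαι υτείτε ύομαι"])
print(cross_product, rules)
```

A bare rule line with no header in front of it, such as `SFX R εύομαι ευτείτε` from the first post, is rejected by this sketch because its third field is neither Y nor N.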



I’ve created a test case with the above rule and the above word. Unmunching produces:
Code:
ερωτεύομαι
ερωτευτείτε



Looks right, doesn’t it?
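
For anyone who wants to see the drop/add/condition mechanics outside Hunspell, here is a minimal Python sketch of what unmunching does for a single entry. It is an illustration under simplifying assumptions, not Hunspell's algorithm: the condition is treated as a literal ending (real conditions may contain character classes), flags are assumed to be single characters, and the function names are invented for this example.

```python
def apply_suffix(word, strip, add, condition):
    """Return the derived form, or None if the rule line does not apply."""
    # Simplification: the condition is compared as a literal word ending.
    if condition != "." and not word.endswith(condition):
        return None
    if strip != "0":  # "0" stands for "drop nothing"
        if not word.endswith(strip):
            return None
        word = word[: -len(strip)]
    return word + ("" if add == "0" else add)


def unmunch_entry(entry, rules):
    """Expand one .dic entry such as 'word/FLAGS' into its word forms."""
    word, _, flags = entry.partition("/")
    forms = [word]
    for flag in flags:  # assumes single-character flags
        for strip, add, condition in rules.get(flag, []):
            derived = apply_suffix(word, strip, add, condition)
            if derived is not None:
                forms.append(derived)
    return forms


# The rule from this post: drop ύομαι, add υτείτε, only for words ending in ύομαι.
rules = {"R": [("ύομαι", "υτείτε", "ύομαι")]}
print(unmunch_entry("ερωτεύομαι/R", rules))  # ['ερωτεύομαι', 'ερωτευτείτε']
```

Under the same reading of the fields, the second attempt from the first post (drop ομαι, add ευτείτε) would turn ερωτεύομαι into ερωτεύευτείτε, with the ευ doubled, which may be worth checking with apply_suffix before editing the real files.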

[Edited at 2014-12-02 17:27 GMT]



Spiros Doikas  Identity Verified
Local time: 05:10
Member (2002)
English to Greek
TOPIC STARTER
Thanks Dec 2, 2014

It is interesting that, although the dictionary does not use affixes for declensions, there is an .aff file with affixes... I wonder how these interact with the dictionary, since the dictionary entries themselves are not marked in any way.


esperantisto  Identity Verified
Local time: 06:10
Member (2006)
English to Russian
A must Dec 2, 2014

An aff file is a must. It may be practically empty, containing only an encoding declaration, but its absence will cause the spellcheck to fail (well, in the apps that I know; not sure about Trados).
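
To illustrate what such a near-empty pair of files might look like, here is a small Python sketch that writes one. The helper name write_plain_dictionary and the file name "demo" are made up, and UTF-8 is an assumption; the actual Greek dictionary may declare a different encoding. The .dic file starts with a line giving the number of entries, followed by one word per line.

```python
import tempfile
from pathlib import Path


def write_plain_dictionary(directory, name, words, encoding="UTF-8"):
    """Write a flag-free .dic word list and an .aff holding only the encoding."""
    directory = Path(directory)
    aff_path = directory / f"{name}.aff"
    dic_path = directory / f"{name}.dic"
    # The only content of the affix file: the encoding declaration.
    aff_path.write_text(f"SET {encoding}\n", encoding=encoding)
    # First line of the word list is the entry count, then the words.
    dic_path.write_text("\n".join([str(len(words)), *words]) + "\n", encoding=encoding)
    return aff_path, dic_path


with tempfile.TemporaryDirectory() as tmp:
    aff, dic = write_plain_dictionary(tmp, "demo", ["ερωτεύομαι", "ερωτευτεί"])
    print(aff.read_text(encoding="UTF-8"))
    print(dic.read_text(encoding="UTF-8"))
```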

[Edited at 2014-12-02 19:31 GMT]



Spiros Doikas  Identity Verified
Local time: 05:10
Member (2002)
English to Greek
TOPIC STARTER
I see Dec 3, 2014

So in this case the file, although it has entries, serves no practical purpose?

By the way which tool do you use to unmunch the dictionary?



esperantisto  Identity Verified
Local time: 06:10
Member (2006)
English to Russian
Indeed Dec 4, 2014

Yes, in this particular case, the affix file seems to be virtually useless. I vaguely remember that some ten years ago MySpell (Hunspell’s predecessor) was reported to have severe problems with the Greek script, so the dictionary was built as a mere list of all inflected forms. Most probably, the problem does not exist anymore, but nobody has taken care to revamp the dictionary.

To unmunch the dictionary, I use the unmunch command from the Hunspell package:
Generate all word forms using Lucene & Hunspell.
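
If you prefer to drive it from a script, the call can be wrapped roughly as follows. This is a sketch that assumes the unmunch binary from the Hunspell tools is on your PATH and that its output is UTF-8; el_GR.dic is an assumed file name matching the el_GR.aff mentioned earlier, and the helper names are invented.

```python
import shutil
import subprocess


def unmunch_command(dic_path, aff_path):
    """Build the command line: unmunch takes the .dic file first, then the .aff file."""
    return ["unmunch", str(dic_path), str(aff_path)]


def run_unmunch(dic_path, aff_path, encoding="UTF-8"):
    """Run unmunch and return the generated word forms, one per list item."""
    if shutil.which("unmunch") is None:
        raise FileNotFoundError("unmunch was not found on PATH")
    result = subprocess.run(
        unmunch_command(dic_path, aff_path),
        capture_output=True,  # collect the word list from standard output
        check=True,           # raise if unmunch exits with an error
    )
    # Assumption: the output uses the same encoding as the dictionary files.
    return result.stdout.decode(encoding).splitlines()


print(" ".join(unmunch_command("el_GR.dic", "el_GR.aff")))
```

Calling run_unmunch("el_GR.dic", "el_GR.aff") and checking whether "ερωτευτείτε" is in the returned list is then a quick way to verify a new rule.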



