Tags instead of commas in Autosuggest dictionaries
Thread poster: Adam Łobatiuk
Adam Łobatiuk  Identity Verified
Poland
Local time: 07:07
Member (2009)
English to Polish
+ ...
Jan 13, 2010

I've noticed that my English-Polish autosuggest dictionary contains {{PCT}} tags where commas should be. Needless to say, that is not helping, because the tag is inserted as is. Has anyone else noticed that? There are also other tags like {{NUM}}, but those are less common and less annoying.

BTW, the dictionaries are in SQLite format and can be viewed and edited externally. However, I haven't had time to find a good free(-ish) application that would let me do that easily and without corrupting the data. Has anyone had any success with that?

My setup: SDL Trados Studio 2009 Freelance Plus, Windows Vista, Polish locale settings


Stefan de Boeck  Identity Verified
Belgium
Local time: 07:07
English to Dutch
+ ...
decimal points? Jan 13, 2010

Adam Łobatiuk wrote:
I've noticed that my English-Polish autosuggest dictionary contains {{PCT}} tags where commas should be. (...) There are also other tags like {{NUM}} ...

Any chance these commas used to be decimal points?


Adam Łobatiuk  Identity Verified
Poland
Local time: 07:07
Member (2009)
English to Polish
+ ...
TOPIC STARTER
Just regular punctuation Jan 13, 2010

Stefan de Boeck wrote:
Any chance these commas used to be decimal points?


No, they were regular "linguistic" commas in frequently used phrases from a TM for a user guide. I even opened the Studio TM in a text editor and searched for those phrases - they all have regular commas. I wonder if it has something to do with the database separators - maybe commas are replaced because they mess with the DB structure?

This phenomenon might not be very common, because commas are often used to separate phrases, so they don't appear in Autosuggest strings that much. But some punctuation rules require commas before certain words, and that's why there are commas (as tags) in my Polish strings.


Stefan de Boeck  Identity Verified
Belgium
Local time: 07:07
English to Dutch
+ ...
why and how Jan 13, 2010

Adam Łobatiuk wrote:
..., and that's why there are commas (as tags) in my Polish strings.

So now you've established the why of this, say, situation -- any progress on how to get out of it?


Adam Łobatiuk  Identity Verified
Poland
Local time: 07:07
Member (2009)
English to Polish
+ ...
TOPIC STARTER
Still researching Jan 13, 2010

In the second part of my original post, I asked if anyone had successfully edited an Autosuggest dictionary with an SQLite editor/converter. For example, I can open my dictionary neatly in the free SQLite Database Browser, but it doesn't have a search and replace feature. You can export and import tables, but the extended characters get corrupted. If there were a free and easy-to-use tool, I could fix the dictionary myself. I could also use a more difficult one, but that would take a bit more time.

Of course, it would be great if SDL took care of that, but no one else seems to have this problem.
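
For anyone who wants to at least locate the affected entries without an export/import round trip, here is a minimal sketch using Python's built-in sqlite3 module. It assumes nothing about the dictionary's schema, which is undocumented: it simply walks every table and every text column and reports the cells that contain a given placeholder. The function names are made up for illustration, and it should only ever be pointed at a copy of the dictionary file.

```python
import sqlite3


def quote_ident(name):
    # Double-quote an SQL identifier, escaping any embedded double quotes.
    return '"' + name.replace('"', '""') + '"'


def find_cells(conn, needle="{{PCT}}"):
    """Yield (table, column, rowid, value) for each text cell containing needle.

    The schema is discovered at run time, so no table or column names
    are assumed. Tables declared WITHOUT ROWID are not handled.
    """
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")]
    for table in tables:
        columns = [row[1] for row in conn.execute(
            f"PRAGMA table_info({quote_ident(table)})")]
        for col in columns:
            sql = (
                f"SELECT rowid, {quote_ident(col)} FROM {quote_ident(table)} "
                f"WHERE typeof({quote_ident(col)}) = 'text' "
                f"AND instr({quote_ident(col)}, ?) > 0"
            )
            for rowid, value in conn.execute(sql, (needle,)):
                yield table, col, rowid, value


# Example use on a COPY of the dictionary (hypothetical file name),
# opened read-only so nothing can be changed by accident:
#   conn = sqlite3.connect("file:dictionary_copy.db?mode=ro", uri=True)
#   for hit in find_cells(conn):
#       print(hit)
```

Because sqlite3 passes text to and from Python as Unicode strings, Polish characters should survive this kind of inspection intact, unlike a CSV export in the wrong encoding.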


Stefan de Boeck  Identity Verified
Belgium
Local time: 07:07
English to Dutch
+ ...
Captain Kirk Jan 13, 2010

Adam Łobatiuk wrote:
... no one else seems to have this problem.

That's only because no one else is trying, Adam.

Good luck, and Godspeed!



Emma Goldsmith  Identity Verified
Spain
Local time: 07:07
Member (2010)
Spanish to English
I get the problem too Jan 13, 2010

Just to let you know that you're not alone, Adam.
I haven't had the problem for some time - possibly since I last regenerated my Autosuggest dictionary - but it's definitely something I've seen (a PCT tag replacing a comma, making that particular autosuggest entry unusable).

I thought about deleting the segments in my TM which were giving the problem as a possible solution and then regenerating the AS dict. But that depends on how useful those particular segments are.

I wonder if anyone else has had this problem?


Adam Łobatiuk  Identity Verified
Poland
Local time: 07:07
Member (2009)
English to Polish
+ ...
TOPIC STARTER
Thanks, Emma! Jan 13, 2010

It doesn't really make me happy to hear you have this problem too, but it helps to know I'm not alone.

I don't think you should delete those segments - in my TM they look perfectly normal, and those phrases would be missing from the suggestion lists.

I'll let you know if I find a reasonably easy way to edit the dictionaries, and hopefully someone has found one already.



SDL Community  Identity Verified
United Kingdom
Local time: 07:07
English
Some more detail Jan 14, 2010

Hi all,

{{PCT}} stands for punctuation, and {{NUM}} stands for number. During generation of the ASDs such constructs (as well as others, such as dates, times, ...) are represented as placeholders ("{{...}}"), not literally. However, they should never occur in a suggested phrase.

So, this may be an issue as it may affect the ASD generation. For the time being (i.e. if it occurs more frequently), we may need to tune the ASD lookup processor to never suggest phrases which contain such placeholders (even though they may be in the ASD).
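
As a rough illustration of the lookup-side workaround described above (and not the actual Studio code), the following Python sketch drops any candidate phrase that still contains a `{{...}}` placeholder before it is offered as a suggestion. The pattern and the function name are assumptions made for this example; only `{{PCT}}` and `{{NUM}}` are confirmed placeholder names in this thread.

```python
import re

# Matches placeholders such as {{PCT}} or {{NUM}}: two braces,
# one or more uppercase letters, two braces. The exact set of
# placeholder names used internally is an assumption.
PLACEHOLDER = re.compile(r"\{\{[A-Z]+\}\}")


def usable_suggestions(phrases):
    """Return only the phrases that contain no placeholder token."""
    return [p for p in phrases if not PLACEHOLDER.search(p)]


candidates = [
    "Company {{PCT}} s Management Board",
    "Management Board",
    "page {{NUM}}",
]
print(usable_suggestions(candidates))
```

The trade-off is that an affected phrase disappears from the suggestion list altogether instead of being shown with the tag in it.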

We thought we had tuned the extraction algorithm so that such phrases would never be in the phrase dictionary, but if you are still seeing these issues with SP1 installed and with a newly generated ASD then we will have to take another look. If you are OK with it, I would be happy to contact you off forum and look at your particular situation.

ASDs should never be edited; it is very likely that recall will be reduced and/or errors will be introduced. Also, we may change the storage format (i.e. the database schema and/or contents) at any time without notice. The internal representation and storage format of ASDs is not part of the API, not documented for external users, and subject to change (the same is the case for the internal TM storage format), so encouraging users to make their own changes would make support an impossible task.

Regards

Paul



SDL Community  Identity Verified
United Kingdom
Local time: 07:07
English
An update Jan 14, 2010

Hi,

Emma kindly sent us a screenshot of this problem reoccurring and we think we know why. Our developer looked at the ASD generator and thinks that {{PCT}} and {{NUM}} should not appear at the beginning or the end of a suggested phrase, but in the screenshot Emma provided it seems they are in the middle of a phrase. So we'd need to instruct the ASD generator to never generate phrases which span across such tokens.
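
To make the idea of "never generate phrases which span across such tokens" concrete, here is a small Python sketch. It is not the ASD generator itself; the token list, the length limits, and the function name are all assumptions for illustration. The token sequence is cut at every placeholder, and candidate phrases are built only from the words inside each uninterrupted run.

```python
PLACEHOLDERS = {"{{PCT}}", "{{NUM}}"}


def candidate_phrases(tokens, min_len=2, max_len=4):
    """Build word n-grams that never cross a placeholder token."""
    phrases = []
    run = []  # current run of ordinary tokens between placeholders
    # A trailing None acts as a sentinel that flushes the final run.
    for tok in list(tokens) + [None]:
        if tok is None or tok in PLACEHOLDERS:
            for n in range(min_len, max_len + 1):
                for i in range(len(run) - n + 1):
                    phrases.append(" ".join(run[i:i + n]))
            run = []
        else:
            run.append(tok)
    return phrases


# Hypothetical tokenised segment with a comma replaced by {{PCT}}:
print(candidate_phrases(["click", "OK", "{{PCT}}", "then", "restart"]))
```

With this approach a phrase that needs an internal comma is split into shorter pieces on either side of it, so it can no longer be suggested as a whole.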

If you have seen this happen anywhere other than inside a phrase, that is, at the start or end, then I would be interested in working on those cases with you. Otherwise, we will work on a solution for this problem that we can release as quickly as possible.

Regards

Paul


Adam Łobatiuk  Identity Verified
Poland
Local time: 07:07
Member (2009)
English to Polish
+ ...
TOPIC STARTER
Thanks Paul and Emma! Jan 14, 2010

I can confirm that the tags appear in the middle of phrases, for example in phrases like "Find out how", where Polish requires a comma before the word for "how". I would actually like to have such phrases in their entirety. Is it not possible to keep the commas as simply commas?

BaskaS
Local time: 07:07
English to Polish
+ ...
I have the same problem... Jan 26, 2010

I also get this problem: {{PCT}} or {{NUM}} inside suggested expressions. And I would also like to know if there is any way to edit the Autosuggest Dictionary in order to delete some annoying phrases which I would never use.

Today I also discovered that {{PCT}} appears inside suggested expressions instead of an apostrophe - e.g. "Company's Management Board" shows as "Company {{PCT}} s Management Board".


[Edited at 2010-01-27 17:44 GMT]



Joel Earnest
Local time: 07:07
Swedish to English
Me too! Jan 26, 2010

Actually, I just noticed it for the first time today. Don't recall if it was at the beginning or end of the phrase.


