Best way to create a termbase from OCR input?
Thread poster: fvg
fvg
Local time: 00:21
German to English
+ ...
Jan 26, 2003

Dear colleagues,

I think I posted this question to the wrong list yesterday. The question really is:

Which of the CAT progs is best suited for creating a termbase from OCR input?



For my field, which is translating German philosophy into English, the standard dictionary exists only in print. I'm thinking of scanning it and then getting it into a termbase format. (I've done a lot of OCR, so I don't anticipate problems in identifying the fields.)

But I'm a beginner with the CAT progs, and I'd be most grateful for information on the following:

which of the CAT progs would be best suited for generating such a termbase? (I haven't bought one yet, since the availability of my most important dictionary is really what it all hinges on.)



Ideas anyone? I'd be most grateful.



regards,



Dr. Frederik van Gelder



Marianne Renia  Identity Verified
Belgium
Local time: 00:21
German to Dutch
+ ...
Have a look at MultiTrans Jan 27, 2003

Hi Frederik,



Maybe MultiTrans is a good tool for what you want. I came across it last week on http://www.multicorpora.com/index_e.html. The tool works with corpora, and you can recycle old translations, web content and any kind of reference material. There is a 30-day trial version. Personally, I'm still not sure whether to buy this tool or DV.

Best regards,

Marianne Renia



fvg
Local time: 00:21
German to English
+ ...
TOPIC STARTER
Thanks, I'll try it out Jan 27, 2003

Hi Marianne, (nice, another Dutch person!)

Thanks for the advice. I'll try it out. For me the issue is: which is the CAT prog of choice when it comes to creating one's own - large! - termbase? There are a lot of specialist dictionaries out there which exist only in print, so there must be quite a lot of people interested in getting them into some kind of manageable form on a PC.



Regards,



Frederik van Gelder



Quote:


Maybe MultiTrans is a good tool for what you want. I came across it last week on http://www.multicorpora.com/index_e.html.

Best regards,

Marianne Renia



sylver  Identity Verified
Local time: 07:21
English to French
All,...but Jan 29, 2003

Quote:


On 2003-01-26 11:50, fvg wrote:

Which of the CAT progs is best suited for creating a termbase from OCR input?

(...)I'm thinking of scanning it, and then getting it into a termbase format. (I've done a lot of OCR, so I don't anticipate problems in identifying the fields)

But I'm a beginner with the CAT progs, and I'd be most grateful for information on the following:

which of the CAT progs would be best suited for generating such a termbase? (I haven't bought one yet, since the availability of my most important dictionary is really what it all hinges on.)



Ideas anyone? I'd be most grateful.



regards,

Dr. Frederik van Gelder





Wordfast will accept any three-column tab-delimited text file or XLS file as a glossary. So if you work with Wordfast, by the time you have a table, you have a fully searchable glossary as well.



If you end up working with Trados, you can create a MultiTerm file by importing the table you get.



All major CATs (DV, Wordfast, Trados, SDLX, Transit ...) have some sort of termbase, and all of these can import tab- or comma-delimited text, which is what you get after OCR, I guess.
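
For what it's worth, here is a rough Python sketch of that step: turning cleaned-up OCR lines into a three-column tab-delimited file. Everything about the input layout is an assumption on my part (a hypothetical "headword = translation (note)" pattern, one entry per line); a real dictionary will need its own pattern. The column order (source, target, comment) is also an assumption, so check what your CAT tool expects before importing.

```python
import csv
import io
import re

# Hypothetical OCR line layout (an assumption, not taken from any real dictionary):
#   "German headword = English equivalent (optional note)"
ENTRY = re.compile(
    r"^\s*(?P<source>[^=]+?)\s*=\s*(?P<target>[^()]+?)\s*(?:\((?P<note>[^)]*)\))?\s*$"
)

def parse_entries(lines):
    """Yield (source, target, note) tuples; lines that do not match are skipped."""
    for line in lines:
        match = ENTRY.match(line)
        if match:
            yield match.group("source"), match.group("target"), match.group("note") or ""

def to_tsv(rows):
    """Return the rows as tab-delimited text, one entry per line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()

# Made-up sample lines, including one piece of OCR debris that should be dropped.
sample = [
    "Aufhebung = sublation (Hegel)",
    "Dasein = existence",
    "---- page 12 ----",
]

tsv = to_tsv(parse_entries(sample))
print(tsv)
```

Lines that do not fit the pattern are silently skipped here; for a real dictionary you would probably want to log them and fix them by hand, since OCR errors tend to end up exactly there.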



To my knowledge, the easiest tool for working with and handling terminology is Wordfast. Its formats are totally open, and you can use your glossaries to check your translations in real time.



But in all cases, all major CATs can use your OCR output - it is just a matter of how much manipulation you will have to do to fit it in. (The Trados MultiTerm import function is a pain, imo.)



Besides CATs, there are a number of dictionary applications out there, and most will handle tab-delimited text as well.



Really, the point is more to get a feel for the different existing CATs and see which one you are most comfortable with. They can all do what you need.



Hope this helps
