
Best way to create a termbase from OCR input?
Thread poster: fvg
fvg
Local time: 00:43
German to English
+ ...
Jan 26, 2003

Dear colleagues,

I think I posted this question to the wrong list yesterday. The question really is:

Which of the CAT progs is best suited for creating a termbase from OCR input?



For my field, which is translating German philosophy into English, the standard dictionary is a print version. I'm thinking of scanning it, and then getting it into a termbase format. (I've done a lot of OCR, so I don't anticipate problems in identifying the fields.)

But I'm a beginner with the CAT progs, and I'd be most grateful for information on the following:

Which of the CAT progs would be best suited for generating such a termbase? (I haven't bought one yet, since the availability of my most important dictionary is really what it all hinges on.)



Ideas anyone? I'd be most grateful.



regards,



Dr. Frederik van Gelder



Marianne Renia
Belgium
Local time: 00:43
German to Dutch
+ ...
Have a look at MultiTrans Jan 27, 2003

Hi Frederik,



Maybe MultiTrans is a good tool for what you want. I came across it last week on http://www.multicorpora.com/index_e.html. The tool works with corpora, and you can recycle old translations, web content and any kind of reference material. There is a 30-day trial version. For myself, I'm still not sure whether to buy this tool or DV.

Best regards,

Marianne Renia



fvg
Local time: 00:43
German to English
+ ...
TOPIC STARTER
Thanks, I'll try it out Jan 27, 2003

Hi Marianne, (nice, another Dutchman!)

Thanks for the advice. I'll try it out. For me the issue is: which is the CAT prog of choice when it comes to creating one's own - large! - termbase? There are a lot of specialist dictionaries out there which exist only in the print version, so there must be quite a lot of people interested in getting them into some kind of manageable form on a PC.



Greetings,



Frederik van Gelder



Quote:


Maybe MultiTrans is a good tool for what you want. I came across it last week on http://www.multicorpora.com/index_e.html.

Best regards,

Marianne Renia



sylver
Local time: 06:43
English to French
All,...but Jan 29, 2003

Quote:


On 2003-01-26 11:50, fvg wrote:

Which of the CAT progs is best suited for creating a termbase from OCR input?

(...)I'm thinking of scanning it, and then getting it into a termbase format. (I've done a lot of OCR, so I don't anticipate problems in identifying the fields)

But I'm a beginner with the CAT progs, and I'd be most grateful for information on the following:

which of the CAT progs would be best suited for generating such a termbase? (I haven't bought one yet, since the availability of my most important dictionary is really what it all hinges on.)



Ideas anyone? I'd be most grateful.



regards,

Dr. Frederik van Gelder





Wordfast will accept any three-column, tab-delimited text file or .xls file as a glossary. So if you work with Wordfast, realize that, by the time you have a table, you have a fully searchable glossary as well.
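
To make the three-column idea concrete, here is a minimal Python sketch of how cleaned-up OCR entries might be turned into tab-delimited lines. The column order (source term, target term, comment), the function name to_glossary_line and the sample entries are assumptions made for illustration only; they are not taken from the Wordfast documentation or from any particular dictionary.

```python
def to_glossary_line(source, target, comment=""):
    """Build one tab-delimited line with three columns.

    Assumed (hypothetical) column order: source term, target term, comment.
    """
    fields = (source, target, comment)
    # A stray tab or line break left over from OCR would shift the columns,
    # so collapse every run of whitespace inside a field to a single space.
    return "\t".join(" ".join(field.split()) for field in fields)


# Made-up sample entries, for illustration only.
entries = [
    ("Aufhebung", "sublation", "Hegel"),
    ("Dasein ", "being\tthere", ""),
]

glossary_text = "".join(to_glossary_line(*entry) + "\n" for entry in entries)
```

Writing glossary_text to a plain text file would then give one entry per line with three tab-separated columns, which is the general shape described above.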



If you end up working with Trados, know that you can create a MultiTerm file by importing the table you get.



All major CATs (DV, Wordfast, Trados, SDLX, Transit ...) have some sort of a termbase, and all of these can import tab/comma delimited text - which is what you get after OCR, I guess.
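
If the OCR clean-up leaves you with comma-delimited text and the tool you pick prefers tabs, the conversion is small. The following is only a sketch using Python's standard csv module; the function name comma_to_tab and the sample rows are hypothetical, and real OCR output will likely need more clean-up than this.

```python
import csv
import io


def comma_to_tab(comma_text):
    """Convert comma-delimited text to tab-delimited text.

    Quoted fields such as "mind, spirit" are kept together as one column.
    """
    reader = csv.reader(io.StringIO(comma_text))
    out_lines = []
    for row in reader:
        if not row:
            continue  # skip blank lines
        # Collapse internal whitespace so no stray tab can break a column.
        cleaned = [" ".join(cell.split()) for cell in row]
        out_lines.append("\t".join(cleaned))
    return "".join(line + "\n" for line in out_lines)


# Made-up sample input, for illustration only.
sample = 'Geist,"mind, spirit"\n\nVernunft,reason\n'
converted = comma_to_tab(sample)
```

The csv module is used rather than a plain split on commas because dictionary entries often contain commas inside a single field.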



To my knowledge, the easiest tool for working with and handling terminology is Wordfast. Totally open formats, and you can use your glossaries to check your translations in real time.



But in all cases, all major CATs can use your OCR output - it is just a matter of how much manipulation you will have to do to fit it in. (The Trados MultiTerm import function is a pain, imo.)



Besides CATs, there are a number of dictionary applications out there, and most will handle tab-delimited text as well.



Really, the point is more to get a feel for the different existing CATs, and see which one you are most comfortable with. They can all do the stuff you need.



Hope this helps
