Hi all,
I have a parallel corpus and I need to align it for term extraction purposes. The term extractor that I plan to use requires the input corpus to be in the following format: Original sentence tab Translated sentence return.
I have tried several aligners (WinAlign, Transit) but neither of them seems to do the job for me. WinAlign inserts various tags which I don't need, and with Transit I just haven't been able to export the aligned files, so I don't know what the output format is.
Anyway, I wanted to ask whether any of you knows of a program (preferably free) which can align a parallel text and output the alignment in the format that I specified above: Original sentence tab Translated sentence return (or any other format which could be converted to this one automatically).
I would really appreciate some help with this problem.
Samuel Murray (Netherlands; English to Afrikaans)
PlusTools
Mar 31, 2008
javitxu wrote: I have a parallel corpus and I need to align it for term extraction purposes. The term extractor that I plan to use requires the input corpus to be in the following format: Original sentence tab Translated sentence return.
Just before it exports the text to a TM, PlusTools' aligner puts it into a simple two-column MS Word table. PlusTools is free. To the question "has this text been segmented by Wordfast", answer "yes" if your input text is already segmented (by sentence or by paragraph), or "no" if you want PlusTools to attempt the segmentation itself. If PlusTools' segmentation is unsatisfactory, use Wordfast to "Extract" the text first (in Wordfast you can tweak the segmentation).
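Once the aligned pairs are out of the two-column table, producing the tab-separated format is a small step. Here is a minimal Python sketch, assuming the pairs are already available as (source, target) tuples. The function names `to_tsv_line` and `write_aligned_tsv` are hypothetical and are not part of PlusTools or Wordfast.

```python
import io


def to_tsv_line(source, target):
    """Return one 'source<TAB>target<RETURN>' line.

    Tabs and line breaks inside a segment are collapsed to single
    spaces so that they cannot be mistaken for the separators.
    """
    def clean(segment):
        return " ".join(segment.split())

    return f"{clean(source)}\t{clean(target)}\n"


def write_aligned_tsv(pairs, out):
    """Write every pair that has text on both sides; return the count."""
    written = 0
    for source, target in pairs:
        if source.strip() and target.strip():
            out.write(to_tsv_line(source, target))
            written += 1
    return written


if __name__ == "__main__":
    # Illustrative data only, not taken from any real corpus.
    pairs = [
        ("The house is red.", "La casa es roja."),
        ("It is raining.", "Está lloviendo."),
    ]
    buffer = io.StringIO()
    write_aligned_tsv(pairs, buffer)
    print(buffer.getvalue(), end="")
```

Skipping pairs with an empty side is a design choice for term extraction, where a one-sided line carries no alignment information. Drop that check if the extractor needs every line.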
javitxu wrote: WinAlign inserts different tags which I don't need…
So the job has already been done once? Instead of looking for a new piece of software when you already have one that is paid for, I would simply remove the unwanted tags. This usually takes 1 to 10 minutes in Word, depending on the tag format.
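The same tag removal can also be scripted rather than done in Word. Here is a minimal Python sketch, assuming the unwanted tags are angle-bracket markup such as `<Seg L=EN-US>`. That pattern is an assumption, so check it against the actual export before relying on it. `strip_tags` is a hypothetical helper name.

```python
import re

# Assumption: a tag is anything between "<" and ">" with no nested brackets.
# Text that legitimately contains both "<" and ">" would also be matched,
# so inspect a sample of the output before processing the whole corpus.
TAG_RE = re.compile(r"<[^<>]+>")


def strip_tags(line):
    """Remove angle-bracket tags and normalise the remaining whitespace."""
    without_tags = TAG_RE.sub("", line)
    return " ".join(without_tags.split())


if __name__ == "__main__":
    # Illustrative line only; the tag shown is an assumed format.
    print(strip_tags("<Seg L=EN-US>The  house is red."))
```

Lines that consist only of tags come back as empty strings, so they can be filtered out afterwards.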
Your replies really helped me. The corpus is too big for me to delete the tags inserted by WinAlign manually, so I think I will use PlusTools in the end. That was really what I was looking for.
Thanks again, Javi