What is the correct way to merge or split segments in WinAlign?
Thread poster: Tomas Mosler, DipTrans IoLET MCIL MITI

Tomas Mosler, DipTrans IoLET MCIL MITI  Identity Verified
Czech Republic
Local time: 20:55
Member (2008)
English to Czech
Sep 17, 2009

Hello,

I have (example) this in source RTF:
Today
Total

and the equivalents in the target RTF. I want to align the lines and create a TM from that.

But when I process the files with WinAlign, some separated lines / segments from source get merged into one segment, i.e.
Today Total

Then of course the source and target segments don't match.

How can I force WinAlign to respect the new line/paragraph precisely? I tried various alignment tuning settings, but to no complete success...

Thanks for any hints.

[Subject edited by staff or moderator 2009-09-17 11:00 GMT]


 

Astrid Elke Witte  Identity Verified
Germany
Local time: 20:55
Member (2002)
German to English
+ ...
You need to undertake some manual adjustment to the segments afterwards Sep 17, 2009

Hi Tomas,

It is not very likely that you will get perfect alignment done automatically. In most cases it is necessary to make some manual adjustments afterwards.

To split a segment:

Go to the menu tab "Edit"
Choose "Edit segment", and then "Quick edit".
Then right click in the segment where you would like it to be split, e.g. between "Today" and "Total", so that you see the marker in the right place. Next choose "Split segment" from the menu that you get when you right-click.

To join two segments:

Hold down the Ctrl key in order to highlight both segments.
Choose "Join segments" from the list of options that you get when you right-click.


You can also split a segment the short way, as I have described for joining segments. Both of these methods exist.

Astrid

[Edited at 2009-09-17 11:11 GMT]

[Edited at 2009-09-17 11:12 GMT]

[Edited at 2009-09-17 11:16 GMT]


 

Attila Piróth  Identity Verified
France
Local time: 20:55
Member
English to Hungarian
+ ...
Give PlusTools a try Sep 17, 2009

Hi Tomas,

The +Align component of PlusTools prepares a two-column table in Word, and it takes just one click to merge two cells in one column (or split one cell into two). The segmentation for alignment is customizable, and you can also use the usual tricks in Word that are not available in WinAlign due to its proprietary format.
To give an example: you aligned the files and found that a lot of source-language segments were split after a not very common abbreviation, such as "Dir." (for director), followed by a capitalized name. Instead of redefining the segmentation rules by adding "Dir." to the list of exceptions (which is an available option in WinAlign as well), you can simply do this:
1.) Copy the source-language column to a new Word file
2.) Convert the table to a paragraph-delimited text (i.e., the cell boundaries will be replaced by paragraph marks) [provided paragraph marks are not used in the cells of the table];
3.) Replace "Dir.^p" (Dir followed by a paragraph break) by "Dir." (i.e., here you eliminate the paragraph break after this single abbreviation)
4.) Convert the text back to a one-column table
5.) Paste this column back to the original Word file.

This 5-step operation will take less than half a minute -- so if the same problem occurs 20 times, it is a very time-efficient remedy.
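
For anyone who would rather script the same idea outside Word, here is a minimal Python sketch of the merge. It assumes the source column has already been exported as a list of strings, one per table cell; the function name and the sample sentences are made up for illustration and are not part of PlusTools or Word.

```python
def merge_after_abbreviation(segments, abbreviation="Dir."):
    """Join each segment that ends with `abbreviation` to the segment after it.

    Rough equivalent of replacing "Dir.^p" in Word: the break after the
    abbreviation is dropped and the two pieces are joined with a space.
    """
    merged = []
    for segment in segments:
        if merged and merged[-1].rstrip().endswith(abbreviation):
            # The previous cell was cut off after the abbreviation: glue this one on.
            merged[-1] = merged[-1].rstrip() + " " + segment.lstrip()
        else:
            merged.append(segment)
    return merged


# Hypothetical sample cells, wrongly split after "Dir."
cells = ["The report was signed by Dir.", "Novak on Monday.", "It was filed later."]
print(merge_after_abbreviation(cells))
```

Note that the check is a plain "ends with" test, so it would also fire on any longer word that happens to end in "Dir."; for real data a stricter test may be needed.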

However, even without using Word's powerful F/R functions, merging and splitting manipulations in PlusTools are considerably faster than in WinAlign.

PlusTools can handle only Word files -- I have not checked out the alignment options in Wordfast Pro. I would expect a quite similar approach, and so a similar efficiency. So, I would strongly advise you to give it a try if it is not for a one-off alignment task.

Kind regards,
Attila


 

Vito Smolej
Germany
Local time: 20:55
Member (2004)
English to Slovenian
+ ...
if it's systematic, adjust segmentation rules Sep 17, 2009


I have (example) this in source RTF:
Today
Total
...
some separated lines / segments from source get merged into one segment, i.e.
Today Total

If I remember correctly, you can tweak segmentation rules for both the source and the target language. When one segments on ":", and the other one does not (and that should be evident from the joined segments), you can adjust accordingly. One more snag is hard vs soft break - check RTFs, to see what kind of "new lines" are there actually in the files.
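
One quick way to check the hard vs soft break question is to count the two RTF control words in each file. In RTF, \par ends a paragraph (hard break) and \line is a line break inside a paragraph (soft break). The sketch below is only a rough check under that assumption: it scans the raw text with regular expressions and does not parse RTF properly, so escaped backslashes or embedded objects could skew the counts. The function name and the sample string are made up for illustration.

```python
import re

# \par = end of paragraph (hard break), \line = line break within a paragraph (soft break).
# The negative lookahead keeps longer control words such as \pard from being counted.
HARD_BREAK = re.compile(r"\\par(?![a-zA-Z])")
SOFT_BREAK = re.compile(r"\\line(?![a-zA-Z])")


def count_breaks(rtf_text):
    """Return how many hard and soft breaks appear in the raw RTF text."""
    return {
        "hard": len(HARD_BREAK.findall(rtf_text)),
        "soft": len(SOFT_BREAK.findall(rtf_text)),
    }


# Hypothetical RTF fragment: "Today" and "Total" are separated by a soft break only.
sample = r"{\rtf1\ansi\pard Today\line Total\par Sum\par}"
print(count_breaks(sample))
```

If the source file shows mostly \line where the target shows \par (or the other way round), that would explain why lines get merged on one side only.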

By the way, whatever tools you use, make abundant backups: one global search and replace too much can destroy a lot of your past work.

Using WinAlign is not easy, but it does pay...

regards

Vito


 

Tomas Mosler, DipTrans IoLET MCIL MITI  Identity Verified
Czech Republic
Local time: 20:55
Member (2008)
English to Czech
TOPIC STARTER
not perfect Sep 17, 2009

The problem is that the output segments are UNaligned so radically that spending time on aligning them manually would maybe cost me more time than doing the revision in the source (Excel) file. I managed to make one-sentence segments OK (hopefully) using tables (!) in Word, but when it comes to something more complex... No way. So for now I have about 98 % of the segments in the TM, and the rest (maybe 50) I will copy or check separately.

Vito, the hard breaks seem to match (for example in the case of the "sample" mentioned above).


 

FarkasAndras
Local time: 20:55
English to Hungarian
+ ...
aligning stuff... Sep 17, 2009

WinAlign is just a bad solution for it unless you have small quantities on your hands and/or lots of patience.

I just aligned about 200,000 TUs yesterday... try doing that with WinAlign. I can't speak highly enough about Hunalign. Obviously, there will be misaligned segments in this much material, but I know from spot checks that it's well under 1%.
I've posted quite a bit about hunalign before, look it up if you like.

Edit: in this specific case it may not have been much use.
You're much better off doing this sort of stuff in a spreadsheet program or something similar.
E.g. if it's from some sort of table that had line breaks within cells you may want to remove soft line breaks and then put the whole thing in Excel side by side to check.
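
A rough Python sketch of that idea, in case it helps: it removes soft line breaks, then writes source and target lines side by side as tab-separated text that can be pasted or opened in Excel. It assumes both texts are available as plain strings with one segment per line, and that soft breaks arrive as vertical-tab characters (which is how Word usually exports manual line breaks to plain text); both are assumptions, and the function name and sample words are made up.

```python
import csv
import io
from itertools import zip_longest


def side_by_side(source_text, target_text, soft_break="\x0b"):
    """Return tab-separated rows pairing source and target lines in order."""

    def clean_lines(text):
        # Turn soft breaks into spaces, then split on hard line breaks only.
        lines = text.replace(soft_break, " ").split("\n")
        return [line.strip() for line in lines if line.strip()]

    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    # zip_longest pads the shorter side with empty cells so a mismatch stays visible.
    for source, target in zip_longest(
        clean_lines(source_text), clean_lines(target_text), fillvalue=""
    ):
        writer.writerow([source, target])
    return buffer.getvalue()


# Hypothetical sample: the source has a soft break between "Today" and "Total".
print(side_by_side("Today\x0bTotal\nSum", "Dnes Celkem\nSoucet"))
```

An empty cell at the bottom of one column means the two sides have a different number of lines, which is where to start checking.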

[Edited at 2009-09-17 18:34 GMT]


 

