What is the correct way to merge or split segments in WinAlign?
Thread poster: Tomas Mosler, DipTrans IoLET MCIL MITI

Tomas Mosler, DipTrans IoLET MCIL MITI  Identity Verified
Czech Republic
Local time: 14:02
Member (2008)
English to Czech
Sep 17, 2009

Hello,

I have (example) this in source RTF:
Today
Total

and the equivalents of these in the target RTF. I want to align the lines and create a TM from them.

But when I process the files with WinAlign, some separate lines / segments from the source get merged into one segment, i.e.
Today Total

Then of course the source and target segments don't match.

How can I force WinAlign to respect each new line/paragraph precisely? I tried various alignment tuning settings, but without complete success...

Thanks for any hints.

[Subject edited by staff or moderator 2009-09-17 11:00 GMT]



Astrid Elke Witte  Identity Verified
Germany
Local time: 14:02
Member (2002)
German to English
+ ...
You need to undertake some manual adjustment to the segments afterwards Sep 17, 2009

Hi Tomas,

It is not very likely that you will get perfect alignment done automatically. In most cases it is necessary to make some manual adjustments afterwards.

To split a segment:

Go to the menu tab "Edit"
Choose "Edit segment", and then "Quick edit".
Then right click in the segment where you would like it to be split, e.g. between "Today" and "Total", so that you see the marker in the right place. Next choose "Split segment" from the menu that you get when you right-click.

To join two segments:

Hold down the Ctrl key in order to highlight both segments.
Choose "Join segments" from the list of options that you get when you right-click.


You can also split a segment the short way, as I have described for joining segments. Both of these methods exist.

Astrid

[Edited at 2009-09-17 11:11 GMT]

[Edited at 2009-09-17 11:12 GMT]

[Edited at 2009-09-17 11:16 GMT]



Attila Piróth  Identity Verified
France
Local time: 14:02
Member
English to Hungarian
+ ...
Give PlusTools a try Sep 17, 2009

Hi Tomas,

The +Align component of PlusTools prepares a two-column table in Word, and it takes just one click to merge two cells in one column (or split one cell into two). The segmentation for alignment is customizable, and you can also use the usual tricks in Word that are not available in WinAlign due to its proprietary format.
To give an example: you aligned the files and found that a lot of source-language segments were split after a not very common abbreviation, such as "Dir." (for director), followed by a capitalized name. Instead of redefining the segmentation rules by adding "Dir." to the list of exceptions (which is an available option in WinAlign as well), you can simply do this:
1.) Copy the source-language column to a new Word file
2.) Convert the table to a paragraph-delimited text (i.e., the cell boundaries will be replaced by paragraph marks) [provided paragraph marks are not used in the cells of the table];
3.) Replace "Dir.^p" (Dir followed by a paragraph break) by "Dir." (i.e., here you eliminate the paragraph break after this single abbreviation)
4.) Convert the text back to a one-column table
5.) Paste this column back to the original Word file.

This 5-step operation will take less than half a minute -- so if the same problem occurs 20 times, it is a very time-efficient remedy.
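
For anyone who prefers a script to Word's Find/Replace, the same idea can be sketched in a few lines of Python. This is only an illustration of the steps above, not part of PlusTools or WinAlign: the function name `merge_after_abbreviation` and the sample segments are made up, and it assumes the column has already been exported as one segment per list item.

```python
def merge_after_abbreviation(segments, abbreviation="Dir."):
    """Join every segment that ends with `abbreviation` to the segment after it.

    Hypothetical helper: mirrors replacing "Dir.^p" in Word, but on a plain
    Python list of segment strings.
    """
    merged = []
    for segment in segments:
        if merged and merged[-1].rstrip().endswith(abbreviation):
            # The previous segment was cut right after the abbreviation:
            # glue the current one onto it, separated by a single space.
            merged[-1] = merged[-1].rstrip() + " " + segment.lstrip()
        else:
            merged.append(segment)
    return merged


# Made-up sample column, one segment per cell.
cells = ["The report was signed by Dir.", "Smith on Monday.", "It was approved."]
print(merge_after_abbreviation(cells))
# ['The report was signed by Dir. Smith on Monday.', 'It was approved.']
```

As with the Word route, this only pays off when the same wrong split occurs many times; for a handful of cases, merging the cells by hand is just as quick.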

However, even without using Word's powerful F/R functions, merging and splitting manipulations in PlusTools are considerably faster than in WinAlign.

PlusTools can handle only Word files -- I have not checked out the alignment options in Wordfast Pro. I would expect a quite similar approach, and so a similar efficiency. So, I would strongly advise you to give it a try if it is not for a one-off alignment task.

Kind regards,
Attila



Vito Smolej
Germany
Local time: 14:02
Member (2004)
English to Slovenian
+ ...
if it's systematic, adjust segmentation rules Sep 17, 2009


I have (example) this in source RTF:
Today
Total
...
some separated lines / segments from source get merged into one segment, i.e.
Today Total

If I remember correctly, you can tweak segmentation rules for both the source and the target language. When one segments on ":", and the other one does not (and that should be evident from the joined segments), you can adjust accordingly. One more snag is hard vs soft breaks - check the RTFs to see what kind of "new lines" are actually in the files.

By the way, whatever tools you use, make abundant backups: one global search and replace too many can destroy a lot of your past work.
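
A quick way to check the break types without opening the files in Word is to count the RTF control words directly. The sketch below relies on the fact that RTF writes a paragraph end (hard break) as `\par` and a line break within a paragraph (soft break) as `\line`; the function name `count_breaks` and the sample string are made up, and the regular expressions are a rough check rather than a full RTF parser.

```python
import re

# \par = paragraph end (hard break), \line = line break inside a paragraph (soft break).
# The negative lookahead keeps longer control words such as \pard from being counted.
HARD_BREAK = re.compile(r"\\par(?![a-zA-Z])")
SOFT_BREAK = re.compile(r"\\line(?![a-zA-Z])")


def count_breaks(rtf_text):
    """Return how many hard and soft breaks appear in a raw RTF string."""
    return {
        "hard": len(HARD_BREAK.findall(rtf_text)),
        "soft": len(SOFT_BREAK.findall(rtf_text)),
    }


# Made-up RTF fragment: one soft break between "Today" and "Total".
sample = r"{\rtf1\ansi\pard Today\line Total\par Next\par}"
print(count_breaks(sample))
# {'hard': 2, 'soft': 1}
```

If the source and target files give different counts, that mismatch is a likely reason why lines end up joined on one side only.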

Using WinAlign is not easy, but it does pay...

regards

Vito



Tomas Mosler, DipTrans IoLET MCIL MITI  Identity Verified
Czech Republic
Local time: 14:02
Member (2008)
English to Czech
TOPIC STARTER
not perfect Sep 17, 2009

The problem is that the output segments are misaligned so radically that aligning them manually would maybe cost me more time than doing the revision in the source (Excel) file. I managed to get the one-sentence segments OK (hopefully) using tables (!) in Word, but when it comes to something more complex... No way. So for now I have about 98% of the segments in the TM, and the rest (maybe 50) I will copy or check separately.

Vito, the hard breaks appear to match (for example in the case of the "sample" mentioned above).


FarkasAndras
Local time: 14:02
English to Hungarian
+ ...
aligning stuff... Sep 17, 2009

WinAlign is just a bad solution for it unless you have small quantities on your hands and/or lots of patience.

I just aligned about 200,000 TUs yesterday... try doing that with WinAlign. I can't speak highly enough about Hunalign. Obviously, there will be misaligned segments in this much material, but I know from spot checks that it's well under 1%.
I've posted quite a bit about Hunalign before, look it up if you like.

Edit: in this specific case it may not have been much use.
You're much better off doing this sort of stuff in a spreadsheet program or something similar.
E.g. if it's from some sort of table that had line breaks within cells you may want to remove soft line breaks and then put the whole thing in Excel side by side to check.
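
A rough Python sketch of that approach, in case it helps: it replaces soft line breaks with spaces, treats every remaining line as one segment, and writes source and target side by side as CSV that a spreadsheet can open. The function names, the Czech sample words and the assumption that soft breaks arrive as the vertical-tab character `"\x0b"` (which is what Word puts in plain text for a manual line break) are all mine, so adjust them to whatever your export actually contains.

```python
import csv
import io
from itertools import zip_longest


def side_by_side(source_text, target_text, soft_break="\x0b"):
    """Pair source and target lines one-to-one after removing soft breaks."""

    def to_lines(text):
        # Turn soft breaks into spaces so they no longer start a new line,
        # then split on hard newlines only and drop empty lines.
        text = text.replace(soft_break, " ")
        return [line.strip() for line in text.split("\n") if line.strip()]

    # Pad the shorter side with empty cells so missing lines are visible.
    return list(zip_longest(to_lines(source_text), to_lines(target_text), fillvalue=""))


def to_csv(rows):
    """Render the pairs as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["source", "target"])
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up sample: two lines on each side.
rows = side_by_side("Today\nTotal\n", "Dnes\nCelkem\n")
print(to_csv(rows))
```

Rows where one column is empty, or where the pairs drift apart from some point onwards, show where a line is missing or was merged, which is much easier to spot in a spreadsheet than in an aligner.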

[Edited at 2009-09-17 18:34 GMT]

