SDL Trados and a major problem with segmentation
Thread poster: Masoud Kakoli

Masoud Kakoli
Iran
Local time: 00:47
English to Persian (Farsi)
+ ...
Jan 27, 2015

I am relatively new to computer-assisted translation tools such as SDL Trados. I installed SDL Trados and read the relevant help files. I imported a file to translate but ran into a problem: the software does not isolate the sentences correctly. It does not break the text in the right place, so a whole sentence does not end up in a single source segment. For example, it breaks a sentence in the middle rather than at the end. I also could not find a way to edit the segments manually so that the whole sentence sits in one source segment. What can I do to solve this? Should I do anything before importing files? Can I edit the segments manually? Note that this affects only a few sentences.

[Edited at 2015-01-27 15:16 GMT]

[Edited at 2015-01-27 15:17 GMT]

[Edited at 2015-01-27 15:18 GMT]


 

Emma Goldsmith  Identity Verified
Spain
Local time: 22:17
Member (2010)
Spanish to English
Hard returns? Jan 27, 2015

What source file format are you processing?

If it's Word or Ppt, you'll probably find that there are hard returns (carriage returns) at the end of some lines and Studio uses these to segment the text.

You'll need to get rid of rogue hard returns in the source file before you process it in Studio.


 

Masoud Kakoli
Iran
Local time: 00:47
English to Persian (Farsi)
+ ...
TOPIC STARTER
Hard return Jan 27, 2015

Emma Goldsmith wrote:

What source file format are you processing?

If it's Word or Ppt, you'll probably find that there are hard returns (carriage returns) at the end of some lines and Studio uses these to segment the text.

You'll need to get rid of rogue hard returns in the source file before you process it in Studio.


I imported a Word file. I couldn't understand what you mean by "hard return". Can you cite a vivid example?


 

Emma Goldsmith  Identity Verified
Spain
Local time: 22:17
Member (2010)
Spanish to English
Line break Jan 27, 2015

Masoud Kakoli wrote:

I couldn't understand what you mean by "hard return". Can you cite a vivid example?


A hard return is a line break. In your file, you may find them at the end of lines where they shouldn't be. Here's a vivid example:



 

Roy Oestensen  Identity Verified
Norway
Local time: 22:17
Member (2010)
English to Norwegian (Bokmal)
+ ...
Beware of unwanted line breaks and paragraph markers Jan 27, 2015

Masoud Kakoli wrote:

I imported a Word file. I couldn't understand what you mean by "hard return". Can you cite a vivid example?


Actually you should be aware that there are two types of hard returns in Word. The first is a line break, and the other a paragraph marker, which is what Emma was referring to.

You should get rid of unwanted line breaks and paragraph markers, and you can do it the following way.

Searching for line breaks: Search for ^l (think of l as short for "line"). Replace with a space or with nothing (nothing may be dangerous, as the last word on one line may end up glued to the first word on the next line).
Searching for paragraph markers: Search for ^p (p for paragraph). Replace with: same as for line breaks.
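
As a rough illustration of why replacing with a space is safer than replacing with nothing, here is a minimal Python sketch that works on plain text rather than on a real Word file. The marker characters and the function name remove_breaks are assumptions made for the example: it pretends that a manual line break (^l) shows up as a vertical tab and a paragraph mark (^p) as a newline.

```python
import re

# Assumptions for this sketch only: the text has been exported so that a
# manual line break (Word's ^l) is a vertical tab and a paragraph mark
# (Word's ^p) is a newline.
LINE_BREAK = "\v"
PARAGRAPH_MARK = "\n"


def remove_breaks(text: str, marker: str) -> str:
    """Replace each marker, plus any spaces or tabs around it, with one space."""
    pattern = r"[ \t]*" + re.escape(marker) + r"[ \t]*"
    return re.sub(pattern, " ", text).strip()


broken = "The software breaks the sentence\vin the middle of the line."

# Replacing with a space keeps the words apart.
print(remove_breaks(broken, LINE_BREAK))

# Replacing with nothing glues "sentence" and "in" together.
print(broken.replace(LINE_BREAK, ""))
```

In Word itself the same effect comes from Find and Replace; the sketch only shows the difference between the two replacement choices.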


 

Masoud Kakoli
Iran
Local time: 00:47
English to Persian (Farsi)
+ ...
TOPIC STARTER
How to solve the issue Jan 27, 2015

I tried what you and Emma suggested, but I could not solve the issue. When I search for ^l, Word finds nothing. When I search for ^p, on the other hand, Word finds the end of each paragraph. Compare these two pictures and you will see that the first and the second segment should be placed in one segment. How can I solve this issue?




 

Emma Goldsmith  Identity Verified
Spain
Local time: 22:17
Member (2010)
Spanish to English
Segmentation by colon Jan 28, 2015

Great, now we can see that Studio is using colons as a segmentation symbol. A picture is worth a thousand words!

By default, Studio will start a new segment when it finds a colon followed by a space and a capital letter.

In your current project, you can solve this by merging segments together:

Click on segment no. 1 (click on the number 1 itself in the left-hand column).
Move your mouse to the segment below and press Ctrl+click to select the two segments you want to merge.
Right-click in the same place and select "Merge Segments".

In the future, you should change this segmentation rule if you don't want it, by going to your general settings: File>settings>options>language pairs>translation memories.
Click on your first TM in the list on the right and then on Settings.
Go to Language Resources>Segmentation Rules>Edit
Select "colon" and then Remove.

Think carefully before you do this, because it is often helpful to start a new segment where there is a colon before a capital letter. In my opinion it would be better to merge segments manually when needed, rather than changing this particular segmentation rule.
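
To make the colon behaviour and the manual merge concrete, here is a minimal Python sketch. It is not how Studio implements its segmentation rules: the regular expression and the names split_on_colon and merge_segments are assumptions for illustration, and the pattern only recognises unaccented capitals A to Z.

```python
import re


def split_on_colon(text: str) -> list[str]:
    """Split where a colon is followed by one space and a capital letter (A-Z)."""
    return re.split(r"(?<=:) (?=[A-Z])", text)


def merge_segments(segments: list[str], index: int) -> list[str]:
    """Join segment `index` with the segment after it, separated by a space."""
    merged = segments[index] + " " + segments[index + 1]
    return segments[:index] + [merged] + segments[index + 2:]


sentence = "Please note the following: The file must be saved first."

segments = split_on_colon(sentence)
print(segments)                     # two segments, split after the colon

print(merge_segments(segments, 0))  # one segment again, as after a manual merge

# A colon followed by a lowercase word does not trigger a split.
print(split_on_colon("Ratio: about 3 to 1"))
```

Merging by hand restores the original sentence without touching the rule, which is why it is the lower-risk choice when only a few segments are affected.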


 

