In software translation, is it reliable to use pixels to judge whether a string is too long?
Thread poster: Li Jie

Li Jie  Identity Verified
China
Local time: 18:11
English to Chinese
Dec 24, 2014

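First of all, Merry Christmas and Happy New Year to everyone!

This year I did a software translation job, and the client and the translation agency used a tool to check whether fields were too long. According to the agency, the tool judges overlength by pixels, and the log file it produced is full of translations whose Chinese and English look as if they take up about the same width. For example: "log files" translated as "日志文件". Is it reliable to judge overlength by pixels? When I did software translation in the past, clients judged overlength by byte count, and whenever I thought a translation might be too long I would count it myself. In the example above, the English takes 9 bytes and the Chinese takes 8 bytes, so it should not be overlength. That is why I suspect the tool is giving false positives.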
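
The byte counts in this example can be checked by encoding both strings. The following is only a minimal sketch: it assumes the bytes are counted in GBK, a double-byte encoding for Chinese, which the thread does not confirm, and it shows UTF-8 alongside for comparison. The helper name `byte_length` is made up for illustration.

```python
def byte_length(text: str, encoding: str) -> int:
    """Return how many bytes `text` occupies in the given encoding."""
    return len(text.encode(encoding))


source = "log files"   # English source string from the post
target = "日志文件"     # Chinese translation from the post

# Compare the two strings under a double-byte encoding and under UTF-8.
for enc in ("gbk", "utf-8"):
    print(enc, byte_length(source, enc), byte_length(target, enc))
```

Under GBK the translation comes to 8 bytes against 9 for the source, matching the figures in the post. Under UTF-8 the same four characters take 12 bytes, so the result of a byte check depends on which encoding the tool counts in.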

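Also, the log file the client sent does not display Chinese characters correctly. I told them to check the character set they had selected (if it can be selected at all), and they said the selection was correct. I wonder whether this is another cause of the false positives.

I hope I have described my problem clearly.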



J.H. Wang
China
Local time: 18:11
Member (2007)
English to Chinese
+ ...
I had not heard of this method before Dec 24, 2014

It feels like measuring length with a ruler, haha. If it is done well, it should work.


Li Jie  Identity Verified
China
Local time: 18:11
English to Chinese
TOPIC STARTER
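Thanks for the reply Dec 24, 2014

J.H. Wang wrote:

It feels like measuring length with a ruler, haha. If it is done well, it should work.


Apart from building the translation into the software for testing, some companies do find ways to rule out overlong translations in advance. I am just puzzled about how pixels can be used to work out overlength: both are single Chinese characters, so would "一" and "曦" differ much in pixels?

I have emailed the translation agency to ask what tool is being used. The reply was that it is the client's proprietary tool, and that all the non-Latin languages have the same problem I ran into, so the tool they developed probably does not work well. The agency asked me to change whatever I can. If I have time tomorrow I will change some; if not, I will tell her so honestly and not spend any more time on this.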



wherestip  Identity Verified
United States
Local time: 04:11
Chinese to English
+ ...
Alternative method to word counting Dec 24, 2014

Li Jie wrote:

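I am just puzzled about how pixels can be used to work out overlength: both are single Chinese characters, so would "一" and "曦" differ much in pixels?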



Li Jie,

Long time, no see. I hope everything is going well for you.

From reading what you described, my guess is that the operative word here is width. The horizontal pixel lengths of the two Chinese characters "一" and "曦" are actually roughly the same. I could be wrong, but the Chinese character "了" might be the character that occupies the least horizontal pixel space compared to all others.

In English, the lowercase letters "i" and "l", the exclamation mark "!", and the separator "|" obviously all occupy the least width in horizontal pixels. The uppercase "W", on the other hand, probably sits at the other end of the spectrum. But all in all, for English texts of a certain word count, the total pixel length should average out.

Anyway, it's probably a tool just to automate the irksome word-counting issue in the English and Chinese pair. Like J. H. said, it's like literally attempting to measure with a ruler/yardstick the total length of all the solid pixels, with the goal of leaving out all the whitespaces.
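
To make the idea of summing glyph widths concrete, here is a rough sketch. Every pixel value in it is invented for an imaginary proportional font, and the names `advance` and `pixel_width` are hypothetical; nothing here is known about the client's actual tool. A real checker would take its widths from the font the software renders with, and whether it counts blank spaces is also unknown (this sketch does count them).

```python
import unicodedata

# Invented advance widths, in pixels, for an imaginary proportional UI font.
NARROW = set("il!|")   # glyphs described above as the narrowest
WIDE = set("W")        # glyph described above as the widest


def advance(ch: str) -> int:
    """Return a made-up pixel width for one character."""
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return 12      # assume every full-width CJK glyph fills a 12 px square
    if ch in NARROW:
        return 3
    if ch in WIDE:
        return 11
    return 6           # every other character, including the space


def pixel_width(text: str) -> int:
    """Sum the per-character advances to estimate the rendered width."""
    return sum(advance(ch) for ch in text)


print(pixel_width("一"), pixel_width("曦"))
print(pixel_width("log files"), pixel_width("日志文件"))
```

With these made-up numbers "一" and "曦" are equally wide, while "log files" comes to 45 px and "日志文件" to 48 px. So a translation that is shorter in bytes could still be wider on screen, which may be the kind of case a pixel-based check reports.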


[Edited at 2014-12-24 16:48 GMT]



lbone  Identity Verified
China
Local time: 18:11
English to Chinese
+ ...
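Rather messy Dec 24, 2014

In programming, checking the length of a string is one of the most basic tasks, and there are dedicated functions for it; generally what is measured is the number of bytes. Generally speaking, an English letter is one byte long and a Chinese character is two bytes long, which is why Chinese characters are also called wide-byte or double-byte characters.

If, as you say, what is being measured really is the length of a field... In a software UI, "field" usually means a control or element on the interface. Some people also use it for the parameter or variable inside that element, or for its value. In databases and datasets, a data attribute or a column of a table is a field. So "field" can mean roughly three things:
1) A parameter, or an attribute in a dataset
2) The input control or UI element in the software or web page that holds this parameter, or the corresponding column in a table
3) The value of that parameter or attribute. Strictly speaking this third sense is not the definition of the field itself, only its value, but people often mean this when they say "field".

If it is (2), that is, the width of a UI element or table column or anything else involving spatial dimensions, then the length of that kind of field (a spatial element) is measured in pixels.
If it is (1) or (3), it refers to a string of characters, digits, or some other type of data, although even other types are mostly written out as a string of characters or digits. In this case, length is generally counted in bytes.

How wide a character is in pixels in a UI or a table (a table is also a kind of UI) depends on its font. Some fonts are monospaced, in which case all single-byte characters are the same width and every double-byte character is as wide as two single-byte characters. Other fonts are proportional, and then different English letters or Chinese characters have different widths.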
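
For the monospaced case, a rough column count can be sketched with `unicodedata.east_asian_width` from Python's standard library, treating wide and fullwidth characters as two columns and everything else as one. The helper name `display_columns` is hypothetical, and real rendering still depends on the font in use.

```python
import unicodedata


def display_columns(text: str) -> int:
    """Estimate the width of `text` in monospaced columns.

    Characters whose East Asian Width is Wide ("W") or Fullwidth ("F")
    are counted as two columns; all other characters as one.
    """
    return sum(
        2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
        for ch in text
    )


print(display_columns("log files"), display_columns("日志文件"))
```

Under this assumption "log files" takes 9 columns and "日志文件" takes 8, the same ratio as the byte counts in a double-byte encoding. In a proportional font the two measures can drift apart.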

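That does not really have anything to do with your translation, though. I think what translators mainly need to care about is still the length of the string counted in bytes.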

[Edited at 2014-12-24 13:56 GMT]



Li Jie  Identity Verified
China
Local time: 18:11
English to Chinese
TOPIC STARTER
Thanks & Merry Christmas :-) Dec 24, 2014

wherestip wrote:


Li Jie,

Long time, no see. I hope everything is going well for you.

From reading what you described, my guess is that the operative word here is width. The horizontal pixel lengths of the two Chinese characters "一" and "嚱" are actually roughly the same. I could be wrong, but the Chinese character "了" might be the character that occupies the least horizontal pixels compared to all others.

In English, the lowercase letters "i" and "l", the exclamation mark "!", and the separator "|" obviously all occupy the least width in horizontal pixels. The uppercase "W", on the other hand, probably sits at the other end of the spectrum. But all in all, for English texts of a certain word count, the total pixel length should average out.

Anyway, it's probably a tool just to automate the irksome word-counting issue in the English and Chinese pair. Like J. H. said, it's like literally attempting to measure with a yardstick the total length of all the solid pixels, with the goal of leaving out all the white spaces.


Steven,

It's great to hear from you during the holidays. I hope everything is going well for you, too!

Many thanks for your explanation of horizontal pixels. It really helps.

I have one more question: according to your explanation, all the white spaces will be left out. Does that mean a blank space will not be taken into account in their pixel length? Actually if counted by byte, a blank space will be considered as occupying a byte in the string. Maybe that's why I think some strings are OK but their tool reports overlength. Am I right?



Li Jie  Identity Verified
China
Local time: 18:11
English to Chinese
TOPIC STARTER
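Thanks, lbone. It is string length Dec 24, 2014

lbone wrote:

In programming, checking the length of a string is one of the most basic tasks, and there are dedicated functions for it; generally what is measured is the number of bytes. Generally speaking, an English letter is one byte long and a Chinese character is two bytes long, which is why Chinese characters are also called wide-byte or double-byte characters.

If, as you say, what is being measured really is the length of a field... In a software UI, "field" usually means a control or element on the interface. Some people also use it for the parameter or variable inside that element, or for its value. In databases and datasets, a data attribute or a column of a table is a field. So "field" can mean roughly three things:
1) A parameter, or an attribute in a dataset
2) The input control or UI element in the software or web page that holds this parameter, or the corresponding column in a table
3) The value of that parameter or attribute. Strictly speaking this third sense is not the definition of the field itself, only its value, but people often mean this when they say "field".

If it is (2), that is, the width of a UI element or table column or anything else involving spatial dimensions, then the length of that kind of field (a spatial element) is measured in pixels.
If it is (1) or (3), it refers to a string of characters, digits, or some other type of data, although even other types are mostly written out as a string of characters or digits. In this case, length is generally counted in bytes.

How wide a character is in pixels in a UI or a table (a table is also a kind of UI) depends on its font. Some fonts are monospaced, in which case all single-byte characters are the same width and every double-byte character is as wide as two single-byte characters. Other fonts are proportional, and then different English letters or Chinese characters have different widths.

That does not really have anything to do with your translation, though. I think what translators mainly need to care about is still the length of the string counted in bytes.

[Edited at 2014-12-24 13:56 GMT]


Thanks, lbone, for the detailed explanation.

Sorry, my wording was imprecise when I started the thread: what I meant is string length. When I did software translation in the past, string length was always counted in bytes too. Only this client says they calculate it in pixels. The reason I ask is that their tool generated more than 100 truncated strings for me to change. I picked some at random and counted the bytes, and they were not overlength. I have not done software translation for 10 years, and I was worried there might be some new method or tool that I do not know about.

Also, the fact that the report cannot display Chinese characters correctly is probably another cause of the false positives: the garbled text takes up many more bytes than normal Chinese.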
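
One way garbling can inflate a string is when UTF-8 bytes are read back as a single-byte character set. The sketch below assumes that specific mix-up (UTF-8 text decoded as Latin-1); the thread does not say which character sets the client's tool actually used, so this only illustrates the effect.

```python
original = "日志文件"

# Simulate the assumed mix-up: the UTF-8 bytes of the Chinese text are
# interpreted as Latin-1, so each byte turns into one garbled character.
garbled = original.encode("utf-8").decode("latin-1")

print(len(original), len(garbled))        # number of characters
print(len(original.encode("utf-8")),      # bytes before garbling
      len(garbled.encode("utf-8")))       # bytes if the garbled text is saved as UTF-8
```

Here four Chinese characters become twelve garbled characters, and the byte count doubles from 12 to 24 once the garbled text is written out again as UTF-8. A checker that measured the garbled text, by characters, bytes, or pixels, would see a much longer string than the translator delivered.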



wherestip  Identity Verified
United States
Local time: 04:11
Chinese to English
+ ...
Leaving out all the blank spaces from the total Dec 24, 2014

Li Jie wrote:

I have one more question: according to your explanation, all the white spaces will be left out. Does that mean a blank space will not be taken into account in their pixel length? Actually if counted by byte, a blank space will be considered as occupying a byte in the string. Maybe that's why I think some strings are OK but their tool reports overlength. Am I right?



Li Jie,

You're correct. Assuming that the design concept of the tool works the way we think it does, none of the horizontal whitespaces will count towards the total horizontal pixel length, effectively discounting all the blank spaces in a text.

Merry Christmas to you too.


~*~*~*~*~*

p.s., "嚱" and "曦" look to be the same character to me without going the extra step of zooming in. That's how good my eyesight is. For traditional Chinese characters, the details get a little blurry when the font is too small. But I'm still pretty happy with my current eyesight.


[Edited at 2014-12-24 19:19 GMT]



lbone  Identity Verified
China
Local time: 18:11
English to Chinese
+ ...
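The client's problem Dec 24, 2014

Li Jie wrote:

Thanks, lbone, for the detailed explanation.

Sorry, my wording was imprecise when I started the thread: what I meant is string length. When I did software translation in the past, string length was always counted in bytes too. Only this client says they calculate it in pixels. The reason I ask is that their tool generated more than 100 truncated strings for me to change. I picked some at random and counted the bytes, and they were not overlength. I have not done software translation for 10 years, and I was worried there might be some new method or tool that I do not know about.

Also, the fact that the report cannot display Chinese characters correctly is probably another cause of the false positives: the garbled text takes up many more bytes than normal Chinese.



Many small companies are doing internationalization for the first time and do not yet understand a lot of things. Do enough translation work and you will inevitably run into clients like this.
Generally speaking, when you translate into Chinese, both the byte count and the length are smaller than in English, so in Chinese localization, that is, English to Chinese, you rarely need to adjust the length of the translation. Developers who know what they are doing leave enough width in the UI rather than asking translators of other languages to shorten their text. If even the Chinese does not fit, other languages will be in much bigger trouble.
Indeed, as Steve said, even with a non-monospaced (proportional) font, where the widths of different characters vary somewhat, things average out once there are a few more characters, and in practice the ratio of the final widths still roughly matches the ratio of the byte counts. That the client has more than 100 strings for you to change most likely comes down to their lack of experience in developing international software: they reserved too little space for unexpected cases.

You cannot just change a translation at will. When I run into this, I usually write back strongly urging them to raise it with the developers and leave more room for the fields.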

[Edited at 2014-12-24 16:09 GMT]


QHE
United States
Local time: 05:11
English to Chinese
+ ...
Measuring Text Dec 24, 2014

For Microsoft Word 2007 or 2010

1. Open Microsoft Word

2. Click the Word Options button.

3. Select Advanced in the left pane.

4. Scroll down to the Display section.

5. Use the Show measurements in units of drop-down to select from Inches, Centimeters, Millimeters, Points, or Picas.

6. Click OK.

Convert between picas (computer) and pixels
http://www.unitconversion.org/typography/picas-computer-to-pixels-y-conversion.html


*** *** ***


Measuring Text (API)
http://docs.oracle.com/javase/tutorial/2d/text/measuringtext.html

Measure String Method
http://msdn.microsoft.com/en-us/library/aa327655(v=vs.71).aspx

Determine Pixel Length of String in Javascript/jQuery?
http://stackoverflow.com/questions/2057682/determine-pixel-length-of-string-in-javascript-jquery

Get the width of a string in pixels (PHP)
http://www.simplemachines.org/community/index.php?topic=122913.0





[Edited at 2014-12-24 18:14 GMT]



Li Jie  Identity Verified
China
Local time: 18:11
English to Chinese
TOPIC STARTER
Thank you. Dec 24, 2014

wherestip wrote:


Li Jie,

You're correct. Assuming that the design concept of the tool works the way we think it does, none of the horizontal whitespaces will count towards the total horizontal pixel length, effectively discounting all the blank spaces in a text.

Merry Christmas to you too.


~*~*~*~*~*

p.s., "嚱" and "曦" look to be the same character to me without going the extra step of zooming in. That's how good my eyesight is. For traditional Chinese characters, the details get a little blurry when the font is too small. But I'm still pretty happy with my current eyesight.


[Edited at 2014-12-24 19:19 GMT]


Many thanks, Steve, for your explanation! My apologies for calling you Steven.

[Edited at 2014-12-24 23:53 GMT]



Li Jie  Identity Verified
China
Local time: 18:11
English to Chinese
TOPIC STARTER
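Yes. Dec 25, 2014

lbone wrote:

Many small companies are doing internationalization for the first time and do not yet understand a lot of things. Do enough translation work and you will inevitably run into clients like this.
Generally speaking, when you translate into Chinese, both the byte count and the length are smaller than in English, so in Chinese localization, that is, English to Chinese, you rarely need to adjust the length of the translation. Developers who know what they are doing leave enough width in the UI rather than asking translators of other languages to shorten their text. If even the Chinese does not fit, other languages will be in much bigger trouble.
Indeed, as Steve said, even with a non-monospaced (proportional) font, where the widths of different characters vary somewhat, things average out once there are a few more characters, and in practice the ratio of the final widths still roughly matches the ratio of the byte counts. That the client has more than 100 strings for you to change most likely comes down to their lack of experience in developing international software: they reserved too little space for unexpected cases.

You cannot just change a translation at will. When I run into this, I usually write back strongly urging them to raise it with the developers and leave more room for the fields.

[Edited at 2014-12-24 16:09 GMT]


Yes. It is usually standalone prepositions (for example "by", "on", or "at" as a string on its own) and four-letter words that most easily exceed the width. Then there are abbreviations: Chinese cannot be abbreviated, and once you shorten it the meaning is lost.

The end client probably has little experience handling double-byte text. It seems the character set was not set correctly, so the Chinese in the log is all garbled, and the garbled text has more characters than the normal text.

Some clients do not pay much attention to translators' suggestions, and once a translator has started a job she cannot just walk away. This year I had another project where the agency and the client, for reasons I cannot fathom, dumped every string from all the files into a single Excel sheet and even sorted it. What little context the source files had was scrambled in one go. I told the client at the time that they would be better off just giving me the properties files to translate, because this way the testing workload after translation would be huge. The client was very polite but said this was the only way for now. I ended up raising more than 200 queries during translation, asking which term was used where, and then followed the testing on and off for another two months. Still, they paid for my hours as billed, so I could only go along with their way of doing things.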



Li Jie  Identity Verified
China
Local time: 18:11
English to Chinese
TOPIC STARTER
Many thanks! Dec 25, 2014

QHE wrote:

For Microsoft Word 2007 or 2010

1. Open Microsoft Word

2. Click the Word Options button.

3. Select Advanced in the left pane.

4. Scroll down to the Display section.

5. Use the Show measurements in units of drop-down to select from Inches, Centimeters, Millimeters, Points, or Picas.

6. Click OK.

Convert between picas (computer) and pixels
http://www.unitconversion.org/typography/picas-computer-to-pixels-y-conversion.html


*** *** ***


Measuring Text (API)
http://docs.oracle.com/javase/tutorial/2d/text/measuringtext.html

Measure String Method
http://msdn.microsoft.com/en-us/library/aa327655(v=vs.71).aspx

Determine Pixel Length of String in Javascript/jQuery?
http://stackoverflow.com/questions/2057682/determine-pixel-length-of-string-in-javascript-jquery

Get the width of a string in pixels (PHP)
http://www.simplemachines.org/community/index.php?topic=122913.0





[Edited at 2014-12-24 18:14 GMT]


Many thanks, QHE. Merry Christmas!


QHE
United States
Local time: 05:11
English to Chinese
+ ...
You’re welcome Dec 26, 2014

Li Jie wrote:

Many thanks, QHE. Merry Christmas!


Thanks, Li Jie.

Wishing you and your family a happy peaceful New Year!

[Edited at 2014-12-26 16:01 GMT]



Li Jie  Identity Verified
China
Local time: 18:11
English to Chinese
TOPIC STARTER
Thank you. Dec 29, 2014

QHE wrote:

Thanks, Li Jie.

Wishing you and your family a happy peaceful New Year!

[Edited at 2014-12-26 16:01 GMT]


Thank you.

