Regex glossaries?
Thread poster: Piotr Bienkowski

Piotr Bienkowski  Identity Verified
Poland
Local time: 21:01
Member (2005)
English to Polish
Jan 28, 2016

I want to learn something new but can't find the resources.

I want to use the regex glossary feature to quickly insert all-uppercase words such as ZORDRETSPECSALES, so I created a glossary and entered:

[A-Z]+{tab}$0

I reloaded the glossary, but nothing happens, i.e. the uppercase word is not highlighted. Is the regex wrong, or is it something else?

{tab} represents the tab character.



FarkasAndras
Local time: 21:01
English to Hungarian
? Jan 28, 2016

I don't know CT, so I can't offer much help; I'm sure CT users will show up with insider info any minute now.
Still:
1) A-Z usually matches only unaccented Latin letters, i.e. it won't match É or Ł. This varies from tool to tool and should be tested. Possible expressions for matching all uppercase letters include [[:upper:]]+ and \p{Uppercase}+. Check the CT docs for info.
2) What's the $0? In a pattern, $ usually stands for "end of string". In a replacement, $1 stands for "the first captured string", but that requires you to put parens around something in the pattern first, e.g. ([A-Z]+){tab}$1
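
To make point 2 concrete, here is a minimal sketch in plain Java. It assumes CT uses Java's java.util.regex flavor, which the thread does not confirm, and the class and method names are made up for illustration. In that flavor, $0 in a replacement refers to the whole match, while $1 needs a capturing group in the pattern.

```java
import java.util.regex.Pattern;

public class GroupReferenceDemo {
    // Wraps every match of `regex` in square brackets, using `ref`
    // ("$0" or "$1") as the group reference inside the replacement string.
    public static String mark(String regex, String ref, String input) {
        return Pattern.compile(regex).matcher(input).replaceAll("[" + ref + "]");
    }

    public static void main(String[] args) {
        String text = "see ZORDRETSPECSALES here";

        // $0 is the whole match, so no parentheses are needed in the pattern.
        System.out.println(mark("[A-Z]+", "$0", text));

        // $1 is the first capturing group, so the pattern needs parentheses.
        System.out.println(mark("([A-Z]+)", "$1", text));

        try {
            // $1 without any capturing group is an error in this flavor.
            mark("[A-Z]+", "$1", text);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("no group 1: " + e.getMessage());
        }
    }
}
```

If CT behaves the same way, [A-Z]+{tab}$0 and ([A-Z]+){tab}$1 should be equivalent, but that is worth testing in the tool itself.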

[Edited at 2016-01-28 16:13 GMT]



Selcuk Akyuz  Identity Verified
Turkey
Local time: 23:01
Member (2006)
English to Turkish
nontranslatables Jan 28, 2016

Hi Piotr,

I think you can use nontranslatables (shortcut F4 on Windows) for this purpose as well.

You store your nontranslatables in:
C:\CafeTran Espresso\cafetran\resources\placeables\nontranslatables.txt

I have the following in my list:
|[A-Z][a-z]+\s[A-Z][a-z]+
|[A-Z][a-z]+\s[A-Z][a-z]+\s[A-Z][a-z]+
|[A-Z][A-Z]+
|[a-z]+[A-Z]+[a-z]+
|[A-z]*+[0-9]+[A-z]*+
|((mailto\:|(news|(ht|f)tp(s?))\://){1}\S+)
|[\w-]+([\w-]+\.)+[\w-]+
|[\w-]+@([\w-]+\.)+[\w-]+
|[A-Z]([A-Z0-9]*[a-z][a-z0-9]*[A-Z]|[a-z0-9]*[A-Z][A-Z0-9]*[a-z])[A-Za-z0-9]*+
|[a-z]([a-z0-9]*[A-Z][A-Z0-9]*[a-z]|[A-Z0-9]*[a-z][a-z0-9]*[A-Z])[a-zA-Z0-9]*+
GmbH

I have no idea why your solution does not work.

Is that stored in your regex glossary? Otherwise, the pipe character may be needed:
|[A-Z]+{tab}$0
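
One thing worth checking in the list above: the range [A-z] covers more than letters. In ASCII, the characters [ \ ] ^ _ and ` sit between Z and a, so [A-z] matches them too. The sketch below tests this outside CafeTran in plain Java, assuming the tool follows Java's regex flavor (the possessive *+ quantifiers in the list suggest it, but the thread does not confirm it). The class and method names are hypothetical, and [A-Za-z] is a suggested stricter alternative, not part of the original list.

```java
import java.util.regex.Pattern;

public class RangeCheck {
    // Returns true when the whole string `s` matches `regex`.
    public static boolean fullMatch(String regex, String s) {
        return Pattern.matches(regex, s);
    }

    public static void main(String[] args) {
        String original = "[A-z]*+[0-9]+[A-z]*+";      // pattern from the list, without the pipe
        String stricter = "[A-Za-z]*+[0-9]+[A-Za-z]*+"; // letters only

        // Both patterns accept an ordinary alphanumeric code.
        System.out.println(fullMatch(original, "A4"));   // true
        System.out.println(fullMatch(stricter, "A4"));   // true

        // Only the original accepts an underscore, because '_' lies between 'Z' and 'a'.
        System.out.println(fullMatch(original, "x_1"));  // true
        System.out.println(fullMatch(stricter, "x_1"));  // false
    }
}
```

Whether matching the underscore is wanted depends on the texts being translated, so the original range may be fine as it is.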



Piotr Bienkowski  Identity Verified
Poland
Local time: 21:01
Member (2005)
English to Polish
TOPIC STARTER
I copied your list... Jan 28, 2016

I closed the project and reopened it, but only completely restarting CafeTran did the trick. Now I don't know whether it is your list that works or my regex glossary.


Selcuk Akyuz  Identity Verified
Turkey
Local time: 23:01
Member (2006)
English to Turkish
my list possibly Jan 28, 2016

You can test it: simply close your glossary, and if the words are still highlighted, that proves the nontranslatables are working.

A version of your regex, ([A-Z]+) or ([A-Z][A-Z]+) (I don't remember which one now), highlights matches, but you should right-click on the glossary and select "match case".
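
Why "match case" matters can be sketched in plain Java. This assumes that the glossary matches case-insensitively unless "match case" is selected, and that CafeTran uses Java's regex flavor; neither is confirmed here, and the names below are hypothetical. With case-insensitive matching, [A-Z]+ also matches lowercase words.

```java
import java.util.regex.Pattern;

public class MatchCaseDemo {
    // Returns true when `word` fully matches `regex`.
    // matchCase = false mimics a case-insensitive glossary lookup.
    public static boolean matchesWord(String regex, String word, boolean matchCase) {
        int flags = matchCase ? 0 : Pattern.CASE_INSENSITIVE;
        return Pattern.compile(regex, flags).matcher(word).matches();
    }

    public static void main(String[] args) {
        // Without "match case", a lowercase word matches [A-Z]+ as well.
        System.out.println(matchesWord("[A-Z]+", "sales", false)); // true

        // With "match case", only the uppercase word matches.
        System.out.println(matchesWord("[A-Z]+", "sales", true));  // false
        System.out.println(matchesWord("[A-Z]+", "SALES", true));  // true
    }
}
```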

At first I could not find a regex to highlight É; the following one failed:

|\b[\p{Upper}]\b

-----------------------

Found it

|\b[\p{Lu}]\b
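
A likely reason for the difference, assuming CafeTran uses Java's regex flavor (not confirmed here): in Java, \p{Upper} is a POSIX class that covers only A-Z by default, while \p{Lu} is the Unicode "uppercase letter" category. The sketch below uses hypothetical names and leaves out the \b anchors, whose handling of non-ASCII letters can depend on the Java version.

```java
import java.util.regex.Pattern;

public class UppercaseClassDemo {
    // Returns true when `s` fully matches `regex`.
    // unicode = true enables Pattern.UNICODE_CHARACTER_CLASS.
    public static boolean isMatch(String regex, boolean unicode, String s) {
        int flags = unicode ? Pattern.UNICODE_CHARACTER_CLASS : 0;
        return Pattern.compile(regex, flags).matcher(s).matches();
    }

    public static void main(String[] args) {
        String eAcute = "\u00C9";  // É
        String lStroke = "\u0141"; // Ł

        // POSIX class, ASCII-only by default: misses É.
        System.out.println(isMatch("\\p{Upper}", false, eAcute)); // false

        // Unicode category "Lu": matches É and Ł.
        System.out.println(isMatch("\\p{Lu}", false, eAcute));    // true
        System.out.println(isMatch("\\p{Lu}", false, lStroke));   // true

        // With UNICODE_CHARACTER_CLASS, \p{Upper} becomes Unicode-aware.
        System.out.println(isMatch("\\p{Upper}", true, eAcute));  // true
    }
}
```

Since a nontranslatables list has no obvious place to set regex flags, \p{Lu} looks like the practical choice.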






[Edited at 2016-01-28 20:32 GMT]



