Anyone have a corpus of Italian?
Thread poster: Colin Ryan
Colin Ryan  Identity Verified
Local time: 20:21
Italian to English
+ ...
Mar 16, 2005

Hi,
I'm looking for a corpus of spoken and written Italian. Ideally it should be XML- or SGML-tagged but really any old corpus will do at the moment. Anyone know where I might find one?
Colm


Sara Castagnoli
Local time: 20:21
English to Italian
+ ...
Italian newspaper corpus Mar 16, 2005

Ryan,
I've just posted a reply in the Italian thread, but here are the main points:

- the "La Repubblica" corpus is made up of 10 years of the Italian newspaper La Repubblica, collected at the School for Translators at the University of Bologna

- you cannot download it, but it's freely accessible after registration at http://sslmitdev-online.sslmit.unibo.it/corpora/corpus.php?path=&name=Repubblica

- it's XML-tagged

- it includes POS-tagging and lemmatisation, together with some text categorisation (topic, genre, year etc.)

Hope this helps,
Sara


paolamonaco  Identity Verified
Italy
Local time: 14:21
English to Italian
+ ...
have a look Mar 16, 2005

Here are some links to corpora of written and spoken Italian compiled by several universities:

ftp://ftp.cirass.unina.it/
http://corpus.cilta.unibo.it:8080/coris_eng.html
http://www.alphabit.net/Corsi/IUlinks/CorporaList.htm#italiano

Hope it helps
Paola



Jeff Allen  Identity Verified
France
Local time: 20:21
Member (2011)
Multiple languages
+ ...
Italian language corpora Mar 16, 2005

Look at the ELDA/ELRA catalog:

http://www.elda.org/
Go to the catalog of language resources.

Jeff
http://www.geocities.com/langresourcesallen/


Colin Ryan  Identity Verified
Local time: 20:21
Italian to English
+ ...
TOPIC STARTER
Thanks, all! Mar 18, 2005

Hi,
Just wanted to say, thanks a million for all your suggestions. They were extremely useful. Much obliged!
Colm



