ISO/R9 --> ISO 14651
Have a read of the text below (a good reference, found via www.ixquick.com): ISO/R9 is really quite outdated and no longer available.
So you need to look under ISO 14651 instead;
For example, go to www.google.com and enter ISO 14651 for countless references.
Title: UCS generic collation locale -- rationale for Cyrillic
Source: Michael Everson, EGT (IE)
Status: Expert Contribution
Action: For consideration by SC22/WG20
Distribution: SC22/WG20, UTC
Recently Johan van Wingen posted a number of queries regarding the ordering of Cyrillic characters in the current draft of ISO 14651 to the SC22/WG20 e-mail reflector. Although the standard is out for ballot, this paper is intended to answer the comments of the US National Body as given in SC22/WG20 N527R, and to provide information regarding the ordering of the Cyrillic characters given in ISO 14651.
I give below Johan van Wingen's note as the basis for the discussion.
On p. 5 of WG20 N 527R there is a comment from the US NB on Cyrillic character ordering. It contains some strange references. ISO/R9 is long outdated and unavailable anymore. The most recent edition is ISO 9:1995, and it has two tables, one for Slavic, one for non-Slavic. Also, "other sources" are mentioned without stating which.
But the problem the US identified is real. I somewhat overlooked the issue in 14651 up to now. What should be the basis for a Cyrillic general order? The tables for Cyrillic in 10646-1 (which are largely my work) were never intended to specify an order (and are just unsuitable). There is no source in literature which gives an order for ALL Cyrillic letters. The tables in ISO 9 are incomplete, separate Slavic and non-Slavic, and contain several non-existing letters. The only ordered list I know of is in K. M. Musaev (p. 80-81), for both Slavic and non-Slavic (merged), but is also incomplete, has a few errors, and does not contain Serbian and Macedonian letters. Giljarevskij and Grivnin contain all Cyrillic characters, but lists them alphabetically for every language separately. None of these sources include historic characters. Thus a general order in 14651 is likely to be arbitrary, but should not conflict with the order on which the present sources agree. Anyway, a justification should be given in good scientific style, and a note be added saying: This order has been specially invented for the purpose of this standard. Even then, the order should reflect some sense. Why are the poor Abkhasian letters placed at the end of the list? Why are the four pre-1917 letters mixed up with the historic ones, and not placed at their pre-1917 position? Apart from this, non-slavic Cyrillic has a large number of digraphs, trigraphs and tetragraphs which take a place in the alphabet (see Musaev and Giljarevskij and Grivnin). This is ignored in 14651.
Best regards from J. W. van Wingen
The order given for the Cyrillic script in ISO 14651 is based on the same principles that are used to order the Latin script. The chief reason for this is that the Cyrillic script is used for a great many languages, each with its own unique ordering. It is impossible to reconcile all of these orderings, so a generic ordering is given which can be tailored to meet the needs of individual languages. The generic ordering does not favour any particular language, but is based on the graphic form of the character.
The basic order of the Cyrillic script is taken to be that of the prototypical alphabet, Old Church Slavonic. This order is given in the following works:
Faulmann, Carl. 1990 (1880). Das Buch der Schrift. Frankfurt am Main: Eichborn. ISBN 3-8218-1720-8 (ger)
Haarmann, Harald. 1990. Universalgeschichte der Schrift. Frankfurt/Main; New York: Campus. ISBN 3-593-34346-0 (ger)
Istrin, Viktor Aleksandrovich. 1963. 1100 let slavjanskoj azbuki, 863-1963. Moskva: Nauka. (rus)
Leskien, A. 1922. Handbuch der altbulgarischen (altkirchenslavischen) Sprache. Heidelberg: Carl Winter. Pp. 4-5. (ger)
Safarík, Pavel Josef. 1853. Památky hlaholského písmennictví. Prag: Haase. P. 6. (cze)
Xaburgaev, G. A. 1986. Staroslavjanskij jazyk. Moskva: Prosveshchenie. Back flyleaf. (rus)
The order given for Old Church Slavonic in the ALA-LC Romanization Tables differs from these sources, which should be considered more authoritative.
Other works consulted:
Akademija Nauk SSSR: Kazanskij institut jazyka, literatury i istorii. 1966. Tatarsko-russkij slovar' = Tatarça-rusça syzlek. Moskva: Sovetskaja enciklopedija. (tat rus)
Axmerov, K. Z. (Axmerov, Q. Z.), et al. 1958. Bashkirsko-russkij slovar' = Bashqortsa-russa hüðlek. Moskva: Gosudarstvennoe izdatel'stvo inostrannyx i nacional'nyx slovarej. (bak rus)
Azymov, P., Berdiev, R., & Sejisov, Y. 1990. Xarplyk 1-ndzhi klas (alty jashlyrlar üçin). Ashgabat: Magaryf. (tuk)
Azizbekov, X. A. (Äzizbäjov, X. Ä.) 1965. Azerbajdzhansko-russkij slovar' = Azärbajcanca-rusca lüghät. Baku: Azerbajdzhanskoe gosudarstvennoe izdatel'stvo. (aze rus)
Baskakova, N. A. (Baskakovyñ, N. A.) et al. Türkmençe-rusça sözlük = Turkmensko-russkij slovar'. Moskva: Sovetskaja ènciklopedija. (tuk rus)
Çoçua, Andrej M. 1970. Ap)sua byzshäa: anban ash't,ax' izyp)x'o ashek/u. Akua: Alashara. (abk)
Judaxin, K. K. 1965. Kyrgyzça-orusça sözdük = Kirgizsko-russkij slovar'. Moskva: Sovetskaja ènciklopedija. (kir rus)
Musaev, Kenesbaj Musaevich. 1965. Alfavity jazykov narodov SSSR. Moskva: Nauka. (rus)
Pal'mbax, A. A. 1955. Tuvinsko-russkij slovar' = Tyva-orus slovar'. Moskva: Gosudarstvennoe izdatel'stvo inostrannyx i nacional'nyx slovarej. (tyv rus)
Slepcov, P. A. 1972. Jakutsko-russkij slovar' = Saxalyy-nuuççalyy tyld'yt. Moskva: Sovetskaja ènciklopedija. (yakut rus)
Although Cyrillic letters are, by convention, considered separate at level 1 of the sort, similar characters are nevertheless ranked at level 1 as though they had accents at level 3, in order to be consistent with the ordering of the Latin and Greek scripts in 14651. The ranking gives us a logical, predictable order, not unlike that used for Latin and Greek, and is in accord with what information we have about ordering Cyrillic in general. The order of the "accents" given for Cyrillic is as follows:
PECULIAR, ACUTE, BREVE, DIAERESIS, DOUBLE ACUTE, MIDDLE TILDE, BAR, VERTICAL BAR, DESCENDER, LEFT DESCENDER, MACRON, TOPBAR, VARIANT, MIDDLE HOOK, YOTIFIER.
This order is not identical to that found in Musaev, but it is fairly close (here BREVE precedes DIAERESIS but in Musaev DIAERESIS precedes BREVE), and Musaev is not intended to be normative. The order is isomorphic to the order of general accents already found among the collating symbols in ISO 14651, where BREVE precedes DIAERESIS. It need hardly be said that the order of accents is a fairly arbitrary thing in 14651.
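The ranking described above can be sketched roughly as follows. This is only an illustration, not the actual ISO 14651 table: the modifier list is copied from the paper, but the short base-letter sequence, the representation of a letter as a (base, modifier) pair, and the names `BASE_ORDER` and `letter_key` are assumptions made for the example.

```python
# Order of the Cyrillic "accents" as listed in the paper (lowest rank first).
MODIFIER_ORDER = [
    "PECULIAR", "ACUTE", "BREVE", "DIAERESIS", "DOUBLE ACUTE",
    "MIDDLE TILDE", "BAR", "VERTICAL BAR", "DESCENDER", "LEFT DESCENDER",
    "MACRON", "TOPBAR", "VARIANT", "MIDDLE HOOK", "YOTIFIER",
]
# Rank 0 is reserved for the unmodified base letter.
MODIFIER_RANK = {name: i + 1 for i, name in enumerate(MODIFIER_ORDER)}

# Hypothetical, shortened base-letter sequence (assumption, illustration only).
BASE_ORDER = ["A", "BE", "VE", "GHE"]
BASE_RANK = {name: i for i, name in enumerate(BASE_ORDER)}


def letter_key(base, modifier=None):
    """Sort key: base letter first, then the rank of its modifier."""
    modifier_rank = 0 if modifier is None else MODIFIER_RANK[modifier]
    return (BASE_RANK[base], modifier_rank)


letters = [
    ("A", "DIAERESIS"), ("GHE", "DESCENDER"), ("A", None),
    ("A", "BREVE"), ("GHE", None), ("GHE", "BAR"),
]
ordered = sorted(letters, key=lambda pair: letter_key(*pair))
```

In this sketch each derived letter follows its plain base letter, and A WITH BREVE comes before A WITH DIAERESIS, matching the BREVE-before-DIAERESIS order stated above.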
To answer some specific questions:
Abxazian letters should be sorted as derived letters (following IE and O), not as new basic letters (added to the end of the alphabet). This is in accordance with Abxazian practice.
Johan asked: "Why are the four pre-1917 letters mixed up with the historic ones, and not placed at their pre-1917 position?" If he means the four letters used in the Russian language before 1917 (BYELORUSSIAN-UKRAINIAN I, YAT, FITA, IZHITSA), then the answer is that Russian was not taken as the base, but Old Church Slavonic, as this is more generic and language-independent. Nevertheless, upon checking Faulmann 1880 I find that the relative order of I,... BYELORUSSIAN-UKRAINIAN I,... SOFT SIGN, YAT,... YA... FITA, IZHITSA is indeed the order used in Russian, so the ordering here is conformant with the practice of that large and important language.
The digraphs, trigraphs, and tetragraphs given in Musaev can be tailored in implementations of 14651, but are outside the scope of the basic table.
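As a rough illustration of what such tailoring could look like in an implementation, the sketch below treats a digraph as a single letter of the alphabet by matching the longest known letter at each position. The two alphabets are deliberately tiny and hypothetical (they are not taken from Musaev or from any language's actual table), and `make_key` is a name invented for the example.

```python
def make_key(alphabet):
    """Build a sort-key function from an ordered list of letters.

    Letters may be multi-character (digraphs and longer); at each
    position the longest matching letter is taken.
    """
    rank = {letter: i for i, letter in enumerate(alphabet)}
    longest = max(len(letter) for letter in alphabet)

    def key(word):
        out, pos = [], 0
        while pos < len(word):
            for size in range(min(longest, len(word) - pos), 0, -1):
                chunk = word[pos:pos + size]
                if chunk in rank:
                    out.append(rank[chunk])
                    pos += size
                    break
            else:
                # Unknown characters sort after the alphabet, by code point.
                out.append(len(rank) + ord(word[pos]))
                pos += 1
        return out

    return key


# Hypothetical alphabets (assumption): the tailored one treats the
# digraph "гъ" as a letter in its own right, placed after "г".
BASIC = ["а", "б", "в", "г", "д", "ъ", "я"]
TAILORED = ["а", "б", "в", "г", "гъ", "д", "ъ", "я"]

words = ["гъа", "гя", "да", "га"]
basic_sorted = sorted(words, key=make_key(BASIC))
tailored_sorted = sorted(words, key=make_key(TAILORED))
```

With the basic alphabet "гъа" sorts before "гя", because it is compared letter by letter; with the tailored alphabet it sorts after "гя", because the digraph counts as one letter following "г".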
The principles used to sort Cyrillic in 14651 can be seen to be the same as those employed for Latin, a script also used for many languages. No one language is favoured over any other. There is to my knowledge only one possible outstanding issue:
The order where DZE precedes ZE follows the historical order of these letters in Old Church Slavonic where ZELO precedes ZEMLJA. DZE is identical with ZELO and ZE is identical with ZEMLJA. When the characters are given their numeric values (as they often are when used as dates in books), ZELO is 6 (8 in Glagolitic) and ZEMLJA is 7 (9 in Glagolitic). In Old Ukrainian ZELO precedes ZEMLJA. In Old Romanian SALO precedes SEMLIA. The open issue is this: I have heard, but can neither confirm nor deny with the sources I have to hand, that in the Macedonian language ZE precedes DZE. The default order of 14651 where DZE precedes ZE can be tailored for modern Macedonian needs just as the order of ÆØÅ must be tailored for Danish. However, I felt it my duty to bring attention to the point. There is a lot of literature using the Cyrillic script which would require the Old Church Slavonic ordering, and I felt that it was best (and less contentious) to stick to the layout of the prototypical Cyrillic script for ISO 14651.
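A tailoring of this kind might be sketched as below. The fragment of the default order has DZE (ѕ) before ZE (з) as described above; the neighbouring letters, the helper names `tailor` and `sort_key`, and the sample words are assumptions for illustration only. The sketch does not claim that any particular language requires the swapped order, since the paper leaves that question open.

```python
# Fragment of a default order in which DZE (ѕ, U+0455) precedes
# ZE (з, U+0437). The neighbouring letters are illustrative only.
DEFAULT_ORDER = ["ж", "ѕ", "з", "и"]


def tailor(order, letter, before):
    """Return a copy of `order` with `letter` moved to just before `before`."""
    result = [c for c in order if c != letter]
    result.insert(result.index(before), letter)
    return result


def sort_key(order):
    """Return a key function ranking each character by its position in `order`."""
    rank = {c: i for i, c in enumerate(order)}
    return lambda word: [rank[c] for c in word]


# Hypothetical tailoring that places ZE before DZE.
ZE_FIRST = tailor(DEFAULT_ORDER, "з", before="ѕ")

words = ["зи", "ѕи", "жи"]  # made-up strings, not real vocabulary
default_sorted = sorted(words, key=sort_key(DEFAULT_ORDER))
tailored_sorted = sorted(words, key=sort_key(ZE_FIRST))
```

The default table is left untouched; the tailored order is derived from it, which mirrors the idea of a generic ordering that individual implementations adjust.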
Evert DELOOF-SYS