Archive Liste Typographie
Message: Re: FW: Localising Unicode character names (Alain LaBonté) - Wednesday, 9 December 1998
Subject: Re: FW: Localising Unicode character names
Date: Wed, 09 Dec 1998 15:50:12 -0500
From: Alain LaBonté <alb@xxxxxxxxxxxxxx>

At 14:03 98-11-27 -0000, Breen McInerney wrote:
>Alain,
>
>Murray gave me your name as a contact and someone who may be able to help on
>below.
>
>Thanks.
>Breen
>
>> -----Original Message-----
>> From: Murray Sargent
>> Sent: Wednesday, November 25, 1998 9:00 PM
>> To: Breen McInerney
>> Subject: RE: Localising Unicode symbol descriptions ...
>>
>> I don't believe anyone on the Unicode Technical Committee is interested in
>> localised Unicode names. But Alain LaBonté (Alain [alb@xxxxxxxxxxxxxx])
>> has dealt with the issue for French, and might have some answers for your
>> questions.
>>
>> Thanks
>> Murray
>>
>> -----Original Message-----
>> From: Breen McInerney
>> Sent: Wednesday, November 25, 1998 3:08 AM
>> To: Murray Sargent
>> Subject: Localising Unicode symbol descriptions ...
>>
>> Hi Murray,
>>
>> I have seen your name on various Unicode mailing lists and hope you know
>> someone who may be able to advise on below. I am the program manager
>> working on the localisation of Spanish Windows 2000. In the product there
>> are 6543 entries which describe the various Unicode symbols -
>> strings can be seen in charmap. This industry standard is not translated
>> officially into Spanish although many of the symbols are found in
>> dictionaries and reference books.
>>
>> Based on the availability of that terminology, it was decided to localise
>> this subset of the Unicode range: C0 Controls and Basic Latin; C1 Controls
>> and Latin-1 Supplement, Latin Extended-A, Latin Extended-B, IPA
>> Extensions, Spacing Modifier Letters, Combining Diacritical Marks, Greek,
>> Cyrillic, Hebrew, Arabic, Latin Extended Additional, Greek Extended,
>> General Punctuation, Superscripts and Subscripts, Currency Symbols,
>> Combining Diacritical Marks for Symbols, Letterlike Symbols, Number Forms,
>> Arrows, Mathematical Operators, Miscellaneous Technical, Control Pictures,
>> Optical Character Recognition, Enclosed Alphanumerics, Box Drawing, Block
>> Elements, Geometric Shapes, Miscellaneous Symbols, Dingbats, Alphabetic
>> Presentation Forms, Arabic Presentation Forms-A, Combining Half Marks,
>> Small Form Variants, Arabic Presentation Forms-B, Halfwidth and Fullwidth
>> Forms, Specials
>>
>> For practical reasons it was also decided not to translate the Asian and some
>> Middle East(?) ones: Armenian, Bengali, Bopomofo, Circled Katakana, Coptic,
>> Devanagari, Georgian, Gurmukhi, Gujarati, Halfwidth Hangul, Halfwidth
>> Katakana, Hangul, Hiragana, Kannada, Katakana, Lao, Malayalam, Oriya,
>> Tamil, Telugu, Thai, and Tibetan.
>>
>> On the part that has been translated localisers\IQA are finding it
>> difficult to assign the best translation and often are not sure if they
>> have the correct one.
>> Quote from IQA (Internal Language Quality Assurance) "After reviewing the
>> symbols in the Character Map myself, I found quite a few changes, I was
>> using a mixture of my knowledge of phonetic symbols, ancient Greek
>> alphabet, Latin metrical, and quite a lot of books on printing, but still
>> I couldn't manage to solve Arabic, Cyrillic, and other alphabet symbols."
>>
>> The French team have been able to reference a previously translated
>> standard for the French language which was a big help. Nothing equivalent
>> seems to exist for Spanish.
>> By localising for NT5 we are creating our own Microsoft standard which may
>> not comply to others who may have done work on this before for Spanish.
>> Do you have any contacts in the Unicode org which may be able to advise on
>> above ? know if there is a standard already ? or anyone who would be
>> interested in reviewing the localisation done already.
>>
>> Any help\advice much appreciated.
>> Breen

[Alain]
This is a topic that is important to us. You will probably have had a look at the web site:

http://babel.alis.com:8080/codage/iso10646/index.html

which lists the French character names of the UCS edition of 1993. Since that time we have updated the list up to amendment 5 of the UCS (Unicode 2 is at amendment 7 level, if my memory is good) and we were hoping to publish the French version of ISO/IEC 10646 in synchronization with the next English version, to be "crystallized" next March after twenty-something amendments.

Some people in AFNOR have put on the brakes inappropriately, though, saying that since CEN had refused to adopt the UCS as a European standard (they are making subsets using UCS coding), they interpreted this as meaning that the UCS was rejected in Europe. This is a cold shower, only temporary hopefully (the time it takes them to understand that the UCS is desired in Europe, of course), as the UCS is, in the opinion of many, very important for users, and your project gives an idea of how it could also be commercially important. So if AFNOR is not fast enough, we intend to do it in Québec anyway (it is already on the web for the majority of names).

Story: In 1995, Canada and France (backed by Ireland) campaigned to make sure that [English] names were not to be *the* machine identifiers, as standards in ISO can be published in English, French and Russian.
We succeeded in having the idea adopted that only numerical UCS ids (of the form "U[xyyy]zzzz") be the mandatory *anchors* for character names and coding between different coding standards, different versions in different languages of coding standards and different private character sets.

That said, ISO/IEC 14755 (Input methods to enter UCS characters with the help of any keyboard) recommends, whenever it is necessary to present character names to users (and it is also highly recommended for feedback to end-users, in particular for different characters displayed with the same shape), that these be presented in the user's language. For French, this is already possible. I understand that Sweden also developed a limited version of the UCS character names in Swedish and so did Ireland in Gaelic. It would be interesting to have a multilingual database of those names.

So far, as I said, the full list of French names is pretty well established, although still unpublished as an international standard (but referenced by a lot of users on the web in practice). The names are also used normatively in recent French versions of the ISO/IEC 8859 series (parts 14 and 15 in particular [the latter similar to Latin 1 except for 8 characters, including the EURO SIGN], which are about to be published as international standards).

I have to slightly correct a piece of information given to you by Murray, to the effect that there are no UTC members interested in localizing French character names: Michel Suignard (Microsoft Redmond), whom you might know, is very active in ISO/IEC JTC1 SC2 and is a member of UTC. He pioneered the first list of French character names of the UCS when he was working for Microsoft in France, and was doing it on a volunteer basis. So this might not be widely known in UTC, but I have to be fair to him and rectify the information, as he has contributed considerably to this dossier.

If I can be of any help in providing data files for what we have and which is already public, and if you can help improve this list by any means, we would certainly be very pleased to collaborate with you [provided that no fee-bearing copyright is placed on any name] (and we are also very interested in a multilingual list of characters, although outside of French, Swedish and Gaelic the work has not begun, to my knowledge).

Don't hesitate to contact me again if I can give further details that I may have omitted in dealing with your request.

With my best regards.

Alain LaBonté
Québec

cc interested parties (hoping that they will agree with this good will offer of collaboration)
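
The idea in the message above, that a numeric UCS id anchors the character and that names in any language hang off that id, can be sketched roughly as follows. This is only an illustration under stated assumptions: the reading of "U[xyyy]zzzz" as "U" followed by 4 or 8 hexadecimal digits, the function names `parse_ucs_id` and `character_name`, and the tiny `NAMES` table are all hypothetical, and the two French names shown are illustrative entries that should be checked against the published French list before use.

```python
import re
import unicodedata

# Assumption: "U[xyyy]zzzz" means "U" plus 4 hex digits, optionally
# preceded by 4 more hex digits (so 4 or 8 digits in total).
_UCS_ID = re.compile(r"U([0-9A-Fa-f]{4})?([0-9A-Fa-f]{4})")


def parse_ucs_id(ucs_id: str) -> int:
    """Turn an id such as 'U0041' or 'U000020AC' into a code point number."""
    match = _UCS_ID.fullmatch(ucs_id)
    if match is None:
        raise ValueError(f"not a UCS id of the assumed form: {ucs_id!r}")
    return int((match.group(1) or "") + match.group(2), 16)


# Hypothetical multilingual table keyed by the numeric id, never by a name.
# The entries are illustrative only; verify them against the published list.
NAMES = {
    0x0041: {"fr": "LETTRE MAJUSCULE LATINE A"},
    0x20AC: {"fr": "SYMBOLE EURO"},
}


def character_name(ucs_id: str, lang: str) -> str:
    """Return the name in the user's language when the table has one.

    Falls back to the English name known to Python's unicodedata module,
    and to the id itself when no name is available at all.
    """
    code_point = parse_ucs_id(ucs_id)
    localized = NAMES.get(code_point, {}).get(lang)
    if localized is not None:
        return localized
    if code_point > 0x10FFFF:  # outside the range Python's chr() accepts
        return ucs_id
    return unicodedata.name(chr(code_point), ucs_id)


print(character_name("U20AC", "fr"))
print(character_name("U0041", "es"))
```

Because every lookup goes through the numeric id, a Spanish, Swedish or Gaelic column can be added to the table later without touching the French or English entries.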