Typographie List Archive
Message: For a universal encoding in email (or elsewhere) (Alain LaBonté) - Sunday 18 October 1998
Subject: For a universal encoding in email (or elsewhere)
Date: Sun, 18 Oct 1998 04:55:45 +0200
From: Alain LaBonté <alb@xxxxxxxxxxxxxx>

The test that follows is well worth a look. UTF-8 encoding should pass everywhere (i.e. without filtering) where 8-bit mode is enabled. That said, we are still a long way from the goal where decoding is concerned (I receive it loud and clear, but undecoded ;) ). According to MIME, this test text should be labelled as UTF-8 (it was at the source, with the following headers):

Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8

Well, Eudora Pro 3 does not recognize these headers, but it did display them to me [although it cannot reproduce them when I forward the mail; I don't know whether Eudora Pro 4 does, but I know that Outlook can, provided you tell it to!] and it did not filter the octets. (Under Eudora-Mac, with the current versions, or under OS/2, it will be another story, because all these octets will be filtered and will not get through [the price of having to modify every octet each time because of a nonstandard encoding], nor will they be forwardable, that is certain. This is not too serious for the time being, since few programs know how to decode the result anyway, but that will surely come one day.)

Since UTF-8 can be recognized automatically with relative ease, it would no doubt be worthwhile to build automatic detection into most email software.

Alain LaBonté
Tel Aviv

P.S.: A reply follows the quoted message (it discreetly mentions, by the way, that a commercial system already supports Latin 9 [8859-15]).

_____________________________________________
From: Markus Kuhn <Markus.Kuhn@xxxxxxxxxxxx>
>Reply-To: unicode@xxxxxxxxxxx
>To: Unicode List <unicode@xxxxxxxxxxx>
>Cc: welch@xxxxxxx, mutt-dev@xxxxxxxxxx
>Date: Sat, 17 Oct 1998 02:12:30 -0700 (PDT)
>Subject: Who can read UTF-8 in email today?
>
>I'd be very curious about which email software out there is already
>supporting UTF-8 encoded plain-text files and can at least display
>them correctly.
>
>If you can read a significant fraction of the following Unicode test
>characters on your email system, please let me know. I will write a list
>of UTF-8 enabled email products and post it.
>
>If you can't read them, then please try to get in touch with the
>developer of your favourite email software and try to get them
>interested in Unicode and UTF-8. Make them aware of the UTF-8 definition
>at
>
>  ftp://ftp.funet.fi/mirrors/nic.nordu.net/rfc/rfc2044.txt
>
>and the list of links to various free Unicode fonts at
>
>  http://www.cl.cam.ac.uk/~mgk25/ucs-fonts.html
>
>Hopefully Unicode/UTF-8 will soon be sufficiently commonplace in email
>software that we can start using it at least on the Unicode mailing
>list.
>
>The following list of test characters is the current repertoire of the
>ISO 10646-1 X11 fixed font available from the above URL:
>
>Basic Latin (U+0000-U+007F):
>
> !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_
>`abcdefghijklmnopqrstuvwxyz{|}~
>
>Latin-1 Supplement (U+0080-U+00FF):
>
>[test characters garbled in the archive: undecoded UTF-8]
>
>Latin Extended-A (U+0100-U+017F):
>
>[test characters garbled in the archive: undecoded UTF-8]
>
>Latin Extended-B (U+0180-U+024F):
>
>[test characters garbled in the archive: undecoded UTF-8]
>
>IPA Extensions (U+0250-U+02AF):
>
>[test characters garbled in the archive: undecoded UTF-8]
>
>Spacing Modifier Letters (U+02B0-U+02FF):
>
>[test characters garbled in the archive: undecoded UTF-8]
>
>Greek (U+0370-U+03FF):
>
>[test characters garbled in the archive: undecoded UTF-8]
>
>Cyrillic (U+0400-U+04FF):
>
>[test characters garbled in the archive: undecoded UTF-8]
>
>Armenian (U+0530-U+058F):
>
>[test characters garbled in the archive: undecoded UTF-8]
>
>Hebrew (U+0590-U+05FF):
>
>[test characters garbled in the archive: undecoded UTF-8]
>
>Georgian (U+10A0-U+10FF):
>
>[test characters garbled in the archive: undecoded UTF-8]
>
>Latin Extended Additional (U+1E00-U+1EFF):
>
>[test characters garbled in the archive: undecoded UTF-8]
>
>General Punctuation (U+2000-U+206F):
>
>[test characters garbled in the archive: undecoded UTF-8]
>
>Superscripts and Subscripts (U+2070-U+209F):
>
>[test characters garbled in the archive: undecoded UTF-8]
>
>Currency Symbols (U+20A0-U+20CF):
>
>[test characters garbled in the archive: undecoded UTF-8]
>
>Letterlike Symbols (U+2100-U+214F):
>
>[test characters garbled in the archive: undecoded UTF-8]
>
>Number Forms (U+2150-U+218F):
>
>[test characters garbled in the archive: undecoded UTF-8]
>
>Arrows (U+2190-U+21FF):
>
>[test characters garbled in the archive: undecoded UTF-8]
>
>Mathematical Operators (U+2200-U+22FF):
>
>[test characters garbled in the archive: undecoded UTF-8]
>
>Miscellaneous Technical (U+2300-U+23FF):
>
>[test characters garbled in the archive: undecoded UTF-8]
>
>Control Pictures (U+2400-U+243F):
>
>[test characters garbled in the archive: undecoded UTF-8]
>
>Optical Character Recognition (U+2440-U+245F):
>
>[test characters garbled in the archive: undecoded UTF-8]
>
>Box Drawing (U+2500-U+257F):
>
>[test characters garbled in the archive: undecoded UTF-8]
>
>Block Elements (U+2580-U+259F):
>
>[test characters garbled in the archive: undecoded UTF-8]
>
>Geometric Shapes (U+25A0-U+25FF):
>
>[test characters garbled in the archive: undecoded UTF-8]
>
>Miscellaneous Symbols (U+2600-U+26FF):
>
>[test characters garbled in the archive: undecoded UTF-8]
>
>Dingbats (U+2700-U+27BF):
>
>[test characters garbled in the archive: undecoded UTF-8]
>
>Alphabetic Presentation Forms (U+FB00-U+FB4F):
>
>[test characters garbled in the archive: undecoded UTF-8]
>
>Specials (U+FFF0-U+FFFF):
>
>�
>
>
>Markus
>
>--
>Markus G. Kuhn, Security Group, Computer Lab, Cambridge University, UK
>email: mkuhn at acm.org, home page: <http://www.cl.cam.ac.uk/~mgk25/>

__________________________________________
From: Ienup Sung <ienup.sung@xxxxxxxxxxx>
>Reply-To: unicode@xxxxxxxxxxx
>To: Unicode List <unicode@xxxxxxxxxxx>
>Cc: welch@xxxxxxx, mutt-dev@xxxxxxxxxx
>Date: Sat, 17 Oct 1998 13:25:33 -0700 (PDT)
>Subject: Re: Who can read UTF-8 in email today?
>
>] Date: Sat, 17 Oct 1998 02:12:30 -0700 (PDT)
>] From: Markus Kuhn <Markus.Kuhn@xxxxxxxxxxxx>
>] Subject: Who can read UTF-8 in email today?
>] To: Unicode List <unicode@xxxxxxxxxxx>
>] Cc: welch@xxxxxxx, mutt-dev@xxxxxxxxxx
>]
>] I'd be very curious about which email software out there is already
>] supporting UTF-8 encoded plain-text files and can at least display
>] them correctly.
>
>DtMail at Solaris 2.6 can read majority of European scripts in UTF-8
>MIME charset and additionally it supports other European MIME charsets like
>ISO-8859-1, -2, ..., -10. This restriction on scripts is only in font, you
>can have other code points and system won't reject or alter any of such codes.
>
>DtMail at Solaris 2.7 can read almost all characters in the email you sent
>and much more since we added CJK, Thai, Arabic, Hebrew fonts and so on.
>It supports MIME charsets like UTF-8, UTF-7, ISO-8859-1 ~ -10, ISO-8859-15,
>koi8-r, EUC-KR, eucJP, ISO-2022-JP, ISO-2022-KR, ISO-2022-CN, US-ASCII, ...
>Basically it is able to correctly encode/decode almost all major MIME
>charset'd emails and show messages correctly. The DtMail has a menu option
>that you can select which outgoing message's MIME charset you want to use
>instead of automatic default setting supplied one too.
>
>I also placed a couple of screen snapshots that shows Solaris 2.7 dtmail in
>American English Unicode locale and also Japanese Unicode locale at:
>
>  http://members.tripod.com/~ienup/dtmail1.gif (en_US.UTF-8 dtmail)
>  http://members.tripod.com/~ienup/dtmail2.gif (ja_JP.UTF-8 dtmail)
>
>With regards,
>
>Ienup
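
The cover message's remark that UTF-8 can be recognized automatically with relative ease can be illustrated with a short sketch. It relies on a known property of the encoding: text in a legacy 8-bit charset such as ISO-8859-1 that contains accented letters is very unlikely to form well-formed UTF-8 multi-byte sequences, so a strict decode attempt works as a heuristic. The function names `looks_like_utf8` and `decode_body` and the parameters `declared` and `fallback` are hypothetical, chosen for this example; this is not how any of the mail programs named in the thread work.

```python
from typing import Optional


def looks_like_utf8(data: bytes) -> bool:
    """Return True if data contains non-ASCII bytes and is well-formed UTF-8."""
    if all(byte < 0x80 for byte in data):
        return False  # pure ASCII: no evidence either way
    try:
        data.decode("utf-8")  # strict by default: raises on malformed input
    except UnicodeDecodeError:
        return False
    return True


def decode_body(data: bytes, declared: Optional[str] = None,
                fallback: str = "iso-8859-1") -> str:
    """Decode a message body, preferring the declared MIME charset.

    `declared` and `fallback` are hypothetical parameters for this sketch:
    `declared` stands for the charset from a Content-Type header, if any.
    """
    if declared:
        return data.decode(declared)
    if looks_like_utf8(data):
        return data.decode("utf-8")
    return data.decode(fallback)


if __name__ == "__main__":
    body = "Alain LaBonté".encode("utf-8")
    print(decode_body(body))           # auto-detected and decoded as UTF-8
    print(body.decode("iso-8859-1"))   # what a reader unaware of UTF-8 displays
```

The second `print` reproduces the kind of garbling seen when a mail reader passes UTF-8 octets through unfiltered but does not decode them. The heuristic can still be fooled by short legacy-encoded strings that happen to be valid UTF-8, which is why a declared charset is preferred when one is present.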