UTF-8 encoded sample plain-text file
‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾

Markus Kuhn [ˈmaʳkʊs kuːn] <http://www.cl.cam.ac.uk/~mgk25/>  2002-07-25 CC BY


The ASCII compatible UTF-8 encoding used in this plain-text file
is defined in Unicode, ISO 10646-1, and RFC 2279.

♥ ♥ ♥

On 2019-12-06 I (Jon Jensen) converted this original document located at:

https://www.cl.cam.ac.uk/~mgk25/ucs/examples/UTF-8-demo.txt

to UTF-16 big endian encoding to test server & client (browser) handling
and to compare response size difference (before I added these comments):

gzip compressed: 6107 bytes for UTF-16BE vs. 6036 bytes for UTF-8.

uncompressed: 15254 bytes for UTF-16BE vs. 14058 bytes for UTF-8.

Because the extensive non-ASCII Unicode characters in this document
represent a pessimal case for UTF-8, the UTF-16 file size is much closer
to UTF-8 than it would be for most common texts.

♥ ♥ ♥


Using Unicode/UTF-8, you can write in emails and source code things such as

Mathematics and sciences:

  ∮ E⋅da = Q,  n → ∞, ∑ f(i) = ∏ g(i),      ⎧⎡⎛┌─────┐⎞⎤⎫
                                            ⎪⎢⎜│a²+b³ ⎟⎥⎪
  ∀x∈ℝ: ⌈x⌉ = −⌊−x⌋, α ∧ ¬β = ¬(¬α ∨ β),    ⎪⎢⎜│───── ⎟⎥⎪
                                            ⎪⎢⎜⎷ c₈   ⎟⎥⎪
  ℕ ⊆ ℕ₀ ⊂ ℤ ⊂ ℚ ⊂ ℝ ⊂ ℂ,                   ⎨⎢⎜       ⎟⎥⎬
                                            ⎪⎢⎜ ∞     ⎟⎥⎪
  ⊥ < a ≠ b ≡ c ≤ d ≪ ⊤ ⇒ (⟦A⟧ ⇔ ⟪B⟫),      ⎪⎢⎜ ⎲     ⎟⎥⎪
                                            ⎪⎢⎜ ⎳aⁱ-bⁱ⎟⎥⎪
  2H₂ + O₂ ⇌ 2H₂O, R = 4.7 kΩ, ⌀ 200 mm     ⎩⎣⎝i=1    ⎠⎦⎭

Linguistics and dictionaries:

  ði ıntəˈnæʃənəl fəˈnɛtık əsoʊsiˈeıʃn
  Y [ˈʏpsilɔn], Yen [jɛn], Yoga [ˈjoːgɑ]

APL:

  ((V⍳V)=⍳⍴V)/V←,V    ⌷←⍳→⍴∆∇⊃‾⍎⍕⌈

Nicer typography in plain text files:

  ╔══════════════════════════════════════════╗
  ║                                          ║
  ║   • ‘single’ and “double” quotes         ║
  ║                                          ║
  ║   • Curly apostrophes: “We’ve been here” ║
  ║                                          ║
  ║   • Latin-1 apostrophe and accents: '´`  ║
  ║                                          ║
  ║   • ‚deutsche‘ „Anführungszeichen“       ║
  ║                                          ║
  ║   • †, ‡, ‰, •, 3–4, —, −5/+5, ™, …      ║
  ║                                          ║
  ║   • ASCII safety test: 1lI|, 0OD, 8B     ║
  ║                      ╭─────────╮         ║
  ║   • the euro symbol: │ 14.95 € │         ║
  ║                      ╰─────────╯         ║
  ╚══════════════════════════════════════════╝

Combining characters:

  STARGΛ̊TE SG-1, a = v̇ = r̈, a⃑ ⊥ b⃑

Greek (in Polytonic):

  The Greek anthem:

  Σὲ γνωρίζω ἀπὸ τὴν κόψη
  τοῦ σπαθιοῦ τὴν τρομερή,
  σὲ γνωρίζω ἀπὸ τὴν ὄψη
  ποὺ μὲ βία μετράει τὴ γῆ.

  ᾿Απ᾿ τὰ κόκκαλα βγαλμένη
  τῶν ῾Ελλήνων τὰ ἱερά
  καὶ σὰν πρῶτ᾿ ἀνδρειωμένη
  χαῖρε, ὦ χαῖρε, ᾿Ελευθεριά!

  From a speech of Demosthenes in the 4th century BC:

  Οὐχὶ ταὐτὰ παρίσταταί μοι γιγνώσκειν, ὦ ἄνδρες ᾿Αθηναῖοι,
  ὅταν τ᾿ εἰς τὰ πράγματα ἀποβλέψω καὶ ὅταν πρὸς τοὺς
  λόγους οὓς ἀκούω· τοὺς μὲν γὰρ λόγους περὶ τοῦ
  τιμωρήσασθαι Φίλιππον ὁρῶ γιγνομένους, τὰ δὲ πράγματ᾿
  εἰς τοῦτο προήκοντα, ὥσθ᾿ ὅπως μὴ πεισόμεθ᾿ αὐτοὶ
  πρότερον κακῶς σκέψασθαι δέον. οὐδέν οὖν ἄλλο μοι δοκοῦσιν
  οἱ τὰ τοιαῦτα λέγοντες ἢ τὴν ὑπόθεσιν, περὶ ἧς βουλεύεσθαι,
  οὐχὶ τὴν οὖσαν παριστάντες ὑμῖν ἁμαρτάνειν. ἐγὼ δέ, ὅτι μέν
  ποτ᾿ ἐξῆν τῇ πόλει καὶ τὰ αὑτῆς ἔχειν ἀσφαλῶς καὶ Φίλιππον
  τιμωρήσασθαι, καὶ μάλ᾿ ἀκριβῶς οἶδα· ἐπ᾿ ἐμοῦ γάρ, οὐ πάλαι
  γέγονεν ταῦτ᾿ ἀμφότερα· νῦν μέντοι πέπεισμαι τοῦθ᾿ ἱκανὸν
  προλαβεῖν ἡμῖν εἶναι τὴν πρώτην, ὅπως τοὺς συμμάχους
  σώσομεν.
  ἐὰν γὰρ τοῦτο βεβαίως ὑπάρξῃ, τότε καὶ περὶ τοῦ
  τίνα τιμωρήσεταί τις καὶ ὃν τρόπον ἐξέσται σκοπεῖν· πρὶν δὲ
  τὴν ἀρχὴν ὀρθῶς ὑποθέσθαι, μάταιον ἡγοῦμαι περὶ τῆς
  τελευτῆς ὁντινοῦν ποιεῖσθαι λόγον.

  Δημοσθένους, Γ´ ᾿Ολυνθιακὸς

Georgian:

  From a Unicode conference invitation:

  გთხოვთ ახლავე გაიაროთ რეგისტრაცია Unicode-ის მეათე საერთაშორისო
  კონფერენციაზე დასასწრებად, რომელიც გაიმართება 10-12 მარტს,
  ქ. მაინცში, გერმანიაში. კონფერენცია შეჰკრებს ერთად მსოფლიოს
  ექსპერტებს ისეთ დარგებში როგორიცაა ინტერნეტი და Unicode-ი,
  ინტერნაციონალიზაცია და ლოკალიზაცია, Unicode-ის გამოყენება
  ოპერაციულ სისტემებსა, და გამოყენებით პროგრამებში, შრიფტებში,
  ტექსტების დამუშავებასა და მრავალენოვან კომპიუტერულ სისტემებში.

Russian:

  From a Unicode conference invitation:

  Зарегистрируйтесь сейчас на Десятую Международную Конференцию по
  Unicode, которая состоится 10-12 марта 1997 года в Майнце в Германии.
  Конференция соберет широкий круг экспертов по вопросам глобального
  Интернета и Unicode, локализации и интернационализации, воплощению и
  применению Unicode в различных операционных системах и программных
  приложениях, шрифтах, верстке и многоязычных компьютерных системах.

Thai (UCS Level 2):

  Excerpt from a poetry on The Romance of The Three Kingdoms (a Chinese
  classic 'San Gua'):

  [----------------------------|------------------------]
    ๏ แผ่นดินฮั่นเสื่อมโทรมแสนสังเวช  พระปกเกศกองบู๊กู้ขึ้นใหม่
  สิบสองกษัตริย์ก่อนหน้าแลถัดไป       สององค์ไซร้โง่เขลาเบาปัญญา
    ทรงนับถือขันทีเป็นที่พึ่ง           บ้านเมืองจึงวิปริตเป็นนักหนา
  โฮจิ๋นเรียกทัพทั่วหัวเมืองมา         หมายจะฆ่ามดชั่วตัวสำคัญ
    เหมือนขับไสไล่เสือจากเคหา      รับหมาป่าเข้ามาเลยอาสัญ
  ฝ่ายอ้องอุ้นยุแยกให้แตกกัน          ใช้สาวนั้นเป็นชนวนชื่นชวนใจ
    พลันลิฉุยกุยกีกลับก่อเหตุ          ช่างอาเพศจริงหนาฟ้าร้องไห้
  ต้องรบราฆ่าฟันจนบรรลัย           ฤๅหาใครค้ำชูกู้บรรลังก์ ฯ

  (The above is a two-column text. If combining characters are handled
  correctly, the lines of the second column should be aligned with the
  | character above.)

Ethiopian:

  Proverbs in the Amharic language:

  ሰማይ አይታረስ ንጉሥ አይከሰስ።
  ብላ ካለኝ እንደአባቴ በቆመጠኝ።
  ጌጥ ያለቤቱ ቁምጥና ነው።
  ደሀ በሕልሙ ቅቤ ባይጠጣ ንጣት በገደለው።
  የአፍ ወለምታ በቅቤ አይታሽም።
  አይጥ በበላ ዳዋ ተመታ።
  ሲተረጉሙ ይደረግሙ።
  ቀስ በቀስ፥ ዕንቁላል በእግሩ ይሄዳል።
  ድር ቢያብር አንበሳ ያስር።
  ሰው እንደቤቱ እንጅ እንደ ጉረቤቱ አይተዳደርም።
  እግዜር የከፈተውን ጉሮሮ
3íØ Í  íõ-b è(du  ce bëéu í5E cëéu ë  Eb %+ ¨Msu  •  Ksub Ócí ð*ë èÍe  •õ íÞ íÞ+ b è¥5   ) « è +  ) Ë-«b p•  bpI p 6 cIb Èó - b• (-5  u 0Íb ¥ -• `M+=  ­ Ø- b Runes: »Ö ³¹«¦ ¦«Ï »Ö Ò¢ÞÖ ©¾ ¦«× Úª¾ÞÖ ¾©±¦¹Öª±Þ¢× ¹Á¦ ¦ª ¹Öå« (Old English, which transcribed into Latin reads 'He cwaeth that he bude thaem lande northweardum with tha Westsae.' and means 'He said that he lived in the northern land near the Western Sea.') Braille: (L(('( (<(( (M((((9(0( (c(( (M((((9 (:(( ((((( (( (((( (:( (9(2 (y(;( ( ( (( ((3(( (1(((('(; (((3( (9(((2 (y( (((( ( (; (( (( ( ((%(( (( (:(( (( (((+ ((9 (9( ( ((;((9( ((( (9( ( ((;(( (9( (%(((;((((;( ((( (9( (!( (( ( (3(((;(2 (N( ((((( (( (((+ ( ((2 (A(( (N( ((((((0( ((( ( (:(( (((( (%((( (0(a((((( ( (( (((9(9(( (( (!((( (( ((%( (( ( (((( (((2 (U(( (M((((9 (:(( (( (((( (( ( ((((($((( ((2 (M((( (J ((((0( ( ((( (( (((9 (9(( (J (((*( (( ( (9 (*( (((*((+((( (1(( (9(;( ( ( (((( ( (%((((9 (((( (((3( ( ((((($((( ((2 (J ( ( (#( ((('( (((2 (( (((+( ( (9(((( ( (( ((((( ( ( (( ( (($((( ( (( (9( (((((( (( (( ( (( ( (((( ((((;(9 ( (9( ((((((2 (C(%( (9( (:( (((( (( (3( ((( (( ((( ( ( ( (9( (( ( ( ((( ((( ( (9 (%((((((*(+ ((((( ()((( ((( (( ( (%(( ( (( (( (9( (J(3((((9(0( (((( ( (((2 (y(3 (:( (( (9(;(( ((( ((;( ( ( ( ( (( ((((((( (( ((((( ( ((((9( (9(( (M((((9 (:(( (( (((( (( ( ((((($((( ((2 (The first couple of paragraphs of "A Christmas Carol" by Dickens) Compact font selection example text: ABCDEFGHIJKLMNOPQRSTUVWXYZ /0123456789 abcdefghijklmnopqrstuvwxyz £©µÀÆÖÞßéöÿ       " & 0!"S`x~ ¬ ‘’“”©±²³´É 01234 """!"'"*"a" !‘!—!¨!»!ã %%<%T%X%‘%º&:&@ ûÿý$@ ‚ å„PÐ#NÐ1Ð Greetings in various languages: Hello world, š±»·¼sÁ± ºyüµ, 0³0ó0Ë0Á0Ï Box drawing alignment tests: %ˆ %‰ %T%P%P%f%P%P%W % %%%,%%% %m%%%,%%%n %m%%%,%%%n %%%%3%%% %%%% %w %{ %%/% % %0% %Š %q%r%q%r%s%s%s %Q% %%h%%%Q %%T%P%g%P%W% %%R%P%j%P%U% %%S%%A%%V% %% %%B%%% %%C%D% %v%<%t%z%K%x% %<%( %%K%% %‹ %r%q%r%q%s%s%s %Q%%r %q%%Q %%Q %Q% %% % %% %%Q % %Q% %% % %% % %E%F% %u %y %%7% %%8% %Œ %q%r%q%r%s%s%s %`%a %s 
       ╞╣  ├╢   ╟┤  ├┼─┼─┼┤  ├╫─╂─╫┤  ┣┿╾┼╼┿┫  ┕┛┖┚     ┌┄┄┐ ╎ ┏┅┅┓ ┋ ▍ ╲╱╲╱╳╳╳
  ║│╱ ╲│║  │║   ║│  ││ │ ││  │║ ┃ ║│  ┃│ ╽ │┃  ░░▒▒▓▓██ ┊  ┆ ╎ ╏  ┇ ┋ ▎
  ║└─╥─┘║  │╚═╤═╝│  │╘═╪═╛│  │╙─╀─╜│  ┃└─╂─┘┃  ░░▒▒▓▓██ ┊  ┆ ╎ ╏  ┇ ┋ ▏
  ╚══╩══╝  └──┴──┘  ╰──┴──╯  ╰──┴──╯  ┗━━┻━━┛  ▗▄▖▛▀▜   └╌╌┘ ╎ ┗╍╍┛ ┋  ▁▂▃▄▅▆▇█
                                               ▝▀▘▙▄▟
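
Size comparison sketch:

  The byte counts quoted near the top of this file (UTF-16BE vs. UTF-8,
  raw and gzip compressed) can be approximated for any text with a few
  lines of Python. This is only a sketch: the function name
  encoded_sizes and the sample string are illustrative assumptions, not
  part of the original file, and the exact gzip figures depend on the
  compressor and its settings.

```python
import gzip


def encoded_sizes(text: str) -> dict:
    """Return raw and gzip-compressed byte counts per encoding."""
    sizes = {}
    for encoding in ("utf-8", "utf-16-be"):
        raw = text.encode(encoding)
        sizes[encoding] = {
            "raw": len(raw),
            "gzip": len(gzip.compress(raw)),
        }
    return sizes


# Hypothetical sample; substitute the contents of any text file.
sample = "Hello world, \u039a\u03b1\u03bb\u03b7\u03bc\u03ad\u03c1\u03b1"
for encoding, counts in encoded_sizes(sample).items():
    print(encoding, counts)
```

  Pure ASCII text exactly doubles in raw size under UTF-16BE, while
  text dominated by non-ASCII characters, like this file, comes out
  much closer to even.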