Annotation of /trunk/doc/en/html/usage/unicode.html

Revision 932
Fri Jun 20 11:58:42 2008 UTC (15 years, 11 months ago) by doda
Original Path: doc/trunk/en/html/usage/unicode.html
File MIME type: text/html
File size: 12301 byte(s)
・TeraTerm -> Tera Term
・Other minor fixes

1 maya 331 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
2     "http://www.w3.org/TR/html4/strict.dtd">
3     <HTML>
4     <HEAD>
5     <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
6     <TITLE>Unicode</TITLE>
7     <META http-equiv="Content-Style-Type" content="text/css">
8     <link rel="stylesheet" href="../style.css" type="text/css">
9     </HEAD>
10     <BODY>
11    
12 maya 368 <h1>Unicode</h1>
13 maya 331
14 maya 368 <p>
15 doda 932 To use UTF-8, first change the language from English to Japanese under the Setup->General menu, then select "Terminal" from the Tera Term Pro "Setup" menu. In the dialog box, select "UTF-8" for "Kanji(receive)" or "Kanji(transmit)". Tera Term Pro does not need to be restarted for these changes to take effect.
16 maya 368 When "UTF8" is specified with the '/KT' (transmit) or '/KR' (receive) command-line option, UTF-8 encoding or decoding is used when transmitting or receiving data.
17     </p>
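<p>
As a minimal sketch of the command-line route, the Python snippet below assembles an argument list that passes "UTF8" to both options. The "/KR=value" syntax with an equals sign, the executable name "ttermpro.exe" being reachable on the PATH, and the host name are assumptions made for illustration; they are not stated above.
</p>

```python
from typing import List

# Assumption: ttermpro.exe is reachable on the PATH of the local machine.
TTERMPRO = "ttermpro.exe"


def build_command(host: str, receive: str = "UTF8", transmit: str = "UTF8") -> List[str]:
    """Build an argument list that sets the receive (/KR) and transmit (/KT) encodings.

    Assumption: the options take their value in the form /KR=UTF8 and /KT=UTF8.
    """
    return [TTERMPRO, host, f"/KR={receive}", f"/KT={transmit}"]


if __name__ == "__main__":
    # "example.com" is a placeholder host used only for this illustration.
    print(build_command("example.com"))
```

<p>
On Windows, the resulting list could be handed to subprocess.run() to start the terminal with both directions set to UTF-8.
</p>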
18    
19 maya 455 <p>
20 doda 932 Tera Term does not fully support Unicode, because its internal design is based on MBCS (Multi-Byte Character Set). Unicode characters are therefore converted in two steps, as follows.
21 yutakapon 760
22     <pre>
23     UTF-8 <-----> Unicode(UTF-16LE) <-----> MBCS
24             (1)                       (2)
25     </pre>
26    
27 doda 932 (1): Tera Term does not support surrogate pairs, combining characters, or decomposed forms, because the application does not convert UTF-8 byte sequences longer than three bytes. <br>
28 yutakapon 760 (2): The user must specify the code page used to convert characters between Unicode and MBCS. A code page is a character set as extended by Microsoft; its number differs from one country to another.<br>
29     Also, a user can only use the localized language on a localized Windows. For example, a language other than Japanese is displayed as indecipherable characters on the Japanese-language version of Windows. Likewise, Japanese cannot be shown on the English-language version of Windows.</p>
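<p>
The two-step conversion can be imitated in a few lines of Python. This is only an illustrative sketch, not Tera Term's actual code: it uses Python's built-in codecs, assumes "cp932" as the code page, and replaces characters that the code page cannot represent with "?". The function names are hypothetical.
</p>

```python
def utf8_to_mbcs(data: bytes, codepage: str = "cp932") -> bytes:
    """Convert UTF-8 bytes to MBCS bytes via an intermediate UTF-16LE form."""
    # Step (1): UTF-8 -> Unicode, held here as UTF-16LE bytes.
    utf16le = data.decode("utf-8").encode("utf-16-le")
    # Step (2): Unicode -> MBCS in the chosen code page.
    # Characters missing from the code page are replaced with "?".
    return utf16le.decode("utf-16-le").encode(codepage, errors="replace")


def mbcs_to_utf8(data: bytes, codepage: str = "cp932") -> bytes:
    """Convert MBCS bytes back to UTF-8 bytes via UTF-16LE."""
    # Step (2) in reverse: MBCS -> Unicode (UTF-16LE).
    utf16le = data.decode(codepage).encode("utf-16-le")
    # Step (1) in reverse: Unicode -> UTF-8.
    return utf16le.decode("utf-16-le").encode("utf-8")


if __name__ == "__main__":
    japanese = "日本語".encode("utf-8")
    print(utf8_to_mbcs(japanese))            # representable in code page 932
    print(utf8_to_mbcs("한".encode("utf-8")))  # Korean is not, so it is replaced
```

<p>
The second call shows why step (2) limits the usable languages: text outside the selected code page cannot survive the conversion.
</p>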
30    
31     <p>
32     To enable Unicode character sets with the localized language, you must set the locale and code page parameters properly in the 'teraterm.ini' file. See the example values below.
33 maya 368 </p>
34    
35 maya 331 <pre>
36 maya 368 ----------------------
37 maya 331 ; Locale for Unicode
38     Locale = japanese
39    
40     ; CodePage for Unicode
41     CodePage = 932
42 maya 368 ----------------------
43     </pre>
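<p>
To check which values a given file contains, the two settings can be read with Python's standard configparser module. The sketch below parses an in-memory sample rather than a real file; the "[Tera Term]" section name and the function name are assumptions made for illustration.
</p>

```python
import configparser
from typing import Tuple

# Sample content; the "[Tera Term]" section header is an assumption.
SAMPLE_INI = """\
[Tera Term]
; Locale for Unicode
Locale = japanese

; CodePage for Unicode
CodePage = 932
"""


def read_unicode_settings(ini_text: str, section: str = "Tera Term") -> Tuple[str, int]:
    """Return (locale, codepage) from the text of an INI file."""
    # Interpolation is disabled so that "%" characters in other values
    # of a real file are not treated as substitution markers.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    locale = parser.get(section, "Locale")
    codepage = parser.getint(section, "CodePage")
    return locale, codepage


if __name__ == "__main__":
    print(read_unicode_settings(SAMPLE_INI))
```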
44 maya 331
45 maya 368 <p>
46 doda 932 See the following web sites to learn more about locale and code page settings in Tera Term:<br>
47 maya 368 <A HREF="http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vccore98/html/_crt_language_strings.asp">http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vccore98/html/_crt_language_strings.asp</A><br>
48     <A HREF="http://msdn.microsoft.com/library/default.asp?url=/library/en-us/intl/unicode_81rn.asp">http://msdn.microsoft.com/library/default.asp?url=/library/en-us/intl/unicode_81rn.asp</A>
49     </p>
50 maya 331
51 maya 368 <pre>
52 maya 331 [Example for Windows XP Simplified Chinese]
53 maya 368 -----------------------------------------
54 maya 455 ; Locale for Unicode
55     Locale = chs
56 maya 331
57 maya 455 ; CodePage for Unicode
58     CodePage = 936
59 maya 368 -----------------------------------------
60     </pre>
61 maya 331
62 maya 455 <pre>
63     [Example for Windows XP USA]
64     -----------------------------------------
65     ; Locale for Unicode
66     Locale = american
67    
68     ; CodePage for Unicode
69     CodePage = 65001
70     -----------------------------------------
71     </pre>
72    
73    
74 maya 368 <p>
75     [NOTE] For Mac OS X users<br>
76     For Mac OS X (HFS+), use the "UTF-8m" encoding. Currently it is supported only for receiving.<br>
77     To use this mode, specify "UTF8m" as the value of the '/KR' command-line parameter.
78     </p>
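<p>
HFS+ stores file names in decomposed form, where a base character and its combining mark are separate code points. Presumably that is the kind of text this mode is meant for, although the note above does not say so. The Python sketch below only illustrates the difference between decomposed and precomposed text using the standard unicodedata module; it does not reproduce Tera Term's own conversion, and the function name is hypothetical.
</p>

```python
import unicodedata


def to_precomposed(text: str) -> str:
    """Compose base characters and combining marks (Unicode NFC)."""
    return unicodedata.normalize("NFC", text)


if __name__ == "__main__":
    # Hiragana KA (U+304B) followed by the combining voiced sound mark (U+3099).
    decomposed = "\u304b\u3099"
    composed = to_precomposed(decomposed)
    print(len(decomposed), len(composed))
```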
79 maya 331
80 maya 368 <p>
81     [NOTE] Language Strings for Locale
82 maya 384 </p>
83 maya 368 <pre>
84     Primary      Sublanguage             String
85 maya 331 ------------+-----------------------+-------------------------------------------------------
86 maya 368 Chinese      Chinese                 "chinese"
87     Chinese      Chinese (simplified)    "chinese-simplified" or "chs"
88     Chinese      Chinese (traditional)   "chinese-traditional" or "cht"
89     Czech        Czech                   "csy" or "czech"
90     Danish       Danish                  "dan" or "danish"
91     Dutch        Dutch (Belgian)         "belgian", "dutch-belgian", or "nlb"
92     Dutch        Dutch (default)         "dutch" or "nld"
93     English      English (Australian)    "australian", "ena", or "english-aus"
94     English      English (Canadian)      "canadian", "enc", or "english-can"
95     English      English (default)       "english"
96     English      English (New Zealand)   "english-nz" or "enz"
97     English      English (UK)            "eng", "english-uk", or "uk"
98     English      English (USA)           "american", "american english", "american-english", "english-american", "english-us", "english-usa", "enu", "us", or "usa"
99     Finnish      Finnish                 "fin" or "finnish"
100     French       French (Belgian)        "frb" or "french-belgian"
101     French       French (Canadian)       "frc" or "french-canadian"
102     French       French (default)        "fra" or "french"
103     French       French (Swiss)          "french-swiss" or "frs"
104     German       German (Austrian)       "dea" or "german-austrian"
105     German       German (default)        "deu" or "german"
106     German       German (Swiss)          "des", "german-swiss", or "swiss"
107     Greek        Greek                   "ell" or "greek"
108 maya 384 Hungarian    Hungarian               "hun" or "hungarian"
109     Icelandic    Icelandic               "icelandic" or "isl"
110 maya 368 Italian      Italian (default)       "ita" or "italian"
111     Italian      Italian (Swiss)         "italian-swiss" or "its"
112 maya 384 Japanese     Japanese                "japanese" or "jpn"
113 maya 368 Korean       Korean                  "kor" or "korean"
114 maya 384 Norwegian    Norwegian (Bokmal)      "nor" or "norwegian-bokmal"
115     Norwegian    Norwegian (default)     "norwegian"
116     Norwegian    Norwegian (Nynorsk)     "non" or "norwegian-nynorsk"
117 maya 368 Polish       Polish                  "plk" or "polish"
118 maya 384 Portuguese   Portuguese (Brazil)     "portuguese-brazilian" or "ptb"
119     Portuguese   Portuguese (default)    "portuguese" or "ptg"
120 maya 368 Russian      Russian (default)       "rus" or "russian"
121     Slovak       Slovak                  "sky" or "slovak"
122     Spanish      Spanish (default)       "esp" or "spanish"
123     Spanish      Spanish (Mexican)       "esm" or "spanish-mexican"
124     Spanish      Spanish (Modern)        "esn" or "spanish-modern"
125     Swedish      Swedish                 "sve" or "swedish"
126     Turkish      Turkish                 "trk" or "turkish"
127     </pre>
128 maya 331
129 maya 368 <p>
130     [NOTE] Code-Page Identifiers
131 maya 384 </p>
132 maya 368 <pre>
133     Identifier Name
134     037 IBM EBCDIC - U.S./Canada
135     437 OEM - United States
136     500 IBM EBCDIC - International
137     708 Arabic - ASMO 708
138     709 Arabic - ASMO 449+, BCON V4
139     710 Arabic - Transparent Arabic
140     720 Arabic - Transparent ASMO
141     737 OEM - Greek (formerly 437G)
142     775 OEM - Baltic
143     850 OEM - Multilingual Latin I
144     852 OEM - Latin II
145     855 OEM - Cyrillic (primarily Russian)
146     857 OEM - Turkish
147     858 OEM - Multilingual Latin I + Euro symbol
148     860 OEM - Portuguese
149     861 OEM - Icelandic
150     862 OEM - Hebrew
151     863 OEM - Canadian-French
152     864 OEM - Arabic
153     865 OEM - Nordic
154     866 OEM - Russian
155     869 OEM - Modern Greek
156     870 IBM EBCDIC - Multilingual/ROECE (Latin-2)
157     874 ANSI/OEM - Thai
158     875 IBM EBCDIC - Modern Greek
159     932 ANSI/OEM - Japanese, Shift-JIS
160     936 ANSI/OEM - Simplified Chinese (PRC, Singapore)
161     949 ANSI/OEM - Korean (Unified Hangeul Code)
162     950 ANSI/OEM - Traditional Chinese (Taiwan; Hong Kong SAR, PRC)
163     1026 IBM EBCDIC - Turkish (Latin-5)
164     1047 IBM EBCDIC - Latin 1/Open System
165     1140 IBM EBCDIC - U.S./Canada (037 + Euro symbol)
166     1141 IBM EBCDIC - Germany (20273 + Euro symbol)
167     1142 IBM EBCDIC - Denmark/Norway (20277 + Euro symbol)
168     1143 IBM EBCDIC - Finland/Sweden (20278 + Euro symbol)
169     1144 IBM EBCDIC - Italy (20280 + Euro symbol)
170     1145 IBM EBCDIC - Latin America/Spain (20284 + Euro symbol)
171     1146 IBM EBCDIC - United Kingdom (20285 + Euro symbol)
172     1147 IBM EBCDIC - France (20297 + Euro symbol)
173     1148 IBM EBCDIC - International (500 + Euro symbol)
174     1149 IBM EBCDIC - Icelandic (20871 + Euro symbol)
175     1200 Unicode UCS-2 Little-Endian (BMP of ISO 10646)
176     1201 Unicode UCS-2 Big-Endian
177     1250 ANSI - Central European
178     1251 ANSI - Cyrillic
179     1252 ANSI - Latin I
180     1253 ANSI - Greek
181     1254 ANSI - Turkish
182     1255 ANSI - Hebrew
183     1256 ANSI - Arabic
184     1257 ANSI - Baltic
185     1258 ANSI/OEM - Vietnamese
186     1361 Korean (Johab)
187     10000 MAC - Roman
188     10001 MAC - Japanese
189     10002 MAC - Traditional Chinese (Big5)
190     10003 MAC - Korean
191     10004 MAC - Arabic
192     10005 MAC - Hebrew
193     10006 MAC - Greek I
194     10007 MAC - Cyrillic
195     10008 MAC - Simplified Chinese (GB 2312)
196     10010 MAC - Romania
197     10017 MAC - Ukraine
198     10021 MAC - Thai
199     10029 MAC - Latin II
200     10079 MAC - Icelandic
201     10081 MAC - Turkish
202     10082 MAC - Croatia
203     12000 Unicode UCS-4 Little-Endian
204     12001 Unicode UCS-4 Big-Endian
205     20000 CNS - Taiwan
206     20001 TCA - Taiwan
207     20002 Eten - Taiwan
208     20003 IBM5550 - Taiwan
209     20004 TeleText - Taiwan
210     20005 Wang - Taiwan
211     20105 IA5 IRV International Alphabet No. 5 (7-bit)
212     20106 IA5 German (7-bit)
213     20107 IA5 Swedish (7-bit)
214     20108 IA5 Norwegian (7-bit)
215     20127 US-ASCII (7-bit)
216     20261 T.61
217     20269 ISO 6937 Non-Spacing Accent
218     20273 IBM EBCDIC - Germany
219     20277 IBM EBCDIC - Denmark/Norway
220     20278 IBM EBCDIC - Finland/Sweden
221     20280 IBM EBCDIC - Italy
222     20284 IBM EBCDIC - Latin America/Spain
223     20285 IBM EBCDIC - United Kingdom
224     20290 IBM EBCDIC - Japanese Katakana Extended
225     20297 IBM EBCDIC - France
226     20420 IBM EBCDIC - Arabic
227     20423 IBM EBCDIC - Greek
228     20424 IBM EBCDIC - Hebrew
229     20833 IBM EBCDIC - Korean Extended
230     20838 IBM EBCDIC - Thai
231     20866 Russian - KOI8-R
232     20871 IBM EBCDIC - Icelandic
233     20880 IBM EBCDIC - Cyrillic (Russian)
234     20905 IBM EBCDIC - Turkish
235     20924 IBM EBCDIC - Latin-1/Open System (1047 + Euro symbol)
236 maya 384 20932 JIS X 0208-1990 &amp; 0212-1990
237 maya 368 20936 Simplified Chinese (GB2312)
238     21025 IBM EBCDIC - Cyrillic (Serbian, Bulgarian)
239     21027 Extended Alpha Lowercase
240     21866 Ukrainian (KOI8-U)
241     28591 ISO 8859-1 Latin I
242     28592 ISO 8859-2 Central Europe
243     28593 ISO 8859-3 Latin 3
244     28594 ISO 8859-4 Baltic
245     28595 ISO 8859-5 Cyrillic
246     28596 ISO 8859-6 Arabic
247     28597 ISO 8859-7 Greek
248     28598 ISO 8859-8 Hebrew
249     28599 ISO 8859-9 Latin 5
250     28605 ISO 8859-15 Latin 9
251     29001 Europa 3
252     38598 ISO 8859-8 Hebrew (ISO-Logical)
253     50220 ISO 2022 Japanese with no halfwidth Katakana
254     50221 ISO 2022 Japanese with halfwidth Katakana
255     50222 ISO 2022 Japanese JIS X 0201-1989
256     50225 ISO 2022 Korean
257     50227 ISO 2022 Simplified Chinese
258     50229 ISO 2022 Traditional Chinese
259     50930 Japanese (Katakana) Extended
260     50931 US/Canada and Japanese
261     50933 Korean Extended and Korean
262     50935 Simplified Chinese Extended and Simplified Chinese
263     50936 Simplified Chinese
264     50937 US/Canada and Traditional Chinese
265     50939 Japanese (Latin) Extended and Japanese
266     51932 EUC - Japanese
267     51936 EUC - Simplified Chinese
268     51949 EUC - Korean
269     51950 EUC - Traditional Chinese
270     52936 HZ-GB2312 Simplified Chinese
271     54936 Windows XP: GB18030 Simplified Chinese (4 Byte)
272     57002 ISCII Devanagari
273     57003 ISCII Bengali
274     57004 ISCII Tamil
275     57005 ISCII Telugu
276     57006 ISCII Assamese
277     57007 ISCII Oriya
278     57008 ISCII Kannada
279     57009 ISCII Malayalam
280     57010 ISCII Gujarati
281     57011 ISCII Punjabi
282     65000 Unicode UTF-7
283     65001 Unicode UTF-8
284 maya 331 </pre>
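<p>
When experimenting with these identifiers outside Tera Term, it can help to know whether Python ships a codec for a given code page. The sketch below is an assumption-laden helper, not part of Tera Term: it tries the conventional "cpNNN" codec name and uses a small hand-written table for a few identifiers whose Python names differ. The function name and the table are hypothetical.
</p>

```python
import codecs
from typing import Optional

# Identifiers whose Python codec name is not simply "cp" + number.
# This table is illustrative and deliberately incomplete.
_SPECIAL_NAMES = {
    1200: "utf-16-le",
    1201: "utf-16-be",
    20127: "ascii",
    65001: "utf-8",
}


def python_codec_for(codepage: int) -> Optional[str]:
    """Return the name of a Python codec for a Windows code page, or None."""
    name = _SPECIAL_NAMES.get(codepage, f"cp{codepage}")
    try:
        # codecs.lookup() raises LookupError for unknown codec names.
        return codecs.lookup(name).name
    except LookupError:
        return None


if __name__ == "__main__":
    for cp in (932, 936, 65001, 29001):
        print(cp, python_codec_for(cp))
```

<p>
A result of None only means that Python has no codec under that name; it says nothing about whether Windows itself supports the code page.
</p>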
285    
286     </BODY>
287     </HTML>
