1 |
maya |
331 |
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" |
2 |
|
|
"http://www.w3.org/TR/html4/strict.dtd"> |
3 |
|
|
<HTML> |
4 |
|
|
<HEAD> |
5 |
|
|
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1"> |
6 |
|
|
<TITLE>Unicode</TITLE> |
7 |
|
|
<META http-equiv="Content-Style-Type" content="text/css"> |
8 |
|
|
<link rel="stylesheet" href="../style.css" type="text/css"> |
9 |
|
|
</HEAD> |
10 |
|
|
<BODY> |
11 |
|
|
|
12 |
maya |
368 |
<h1>Unicode</h1> |
13 |
maya |
331 |
|
14 |
maya |
368 |
<p> |
15 |
doda |
932 |
To use UTF-8, change the language from English to Japanese under the Setup->General menu, then select "Terminal" from the Tera Term Pro "Setup" menu. In the dialog box, select "UTF-8" for "Kanji(receive)" or "Kanji(transmit)". These configuration changes take effect without restarting Tera Term Pro. |
16 |
maya |
368 |
Alternatively, when "UTF8" is specified with the '/KT' (transmit) or '/KR' (receive) command line option, UTF-8 encoding or decoding is used for data in that direction. |
17 |
|
|
</p> |
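<p>
As a minimal sketch of the command line form, the following batch line requests UTF-8 in both directions. The '=' syntax and the executable name 'ttermpro.exe' are assumptions taken from Tera Term's general command line usage rather than from this page, and 'host.example.com' is a placeholder host name.
</p>
<pre>
rem host.example.com is a placeholder; replace it with the real host.
rem /KR sets the receive encoding, /KT sets the transmit encoding.
ttermpro.exe host.example.com /KR=UTF8 /KT=UTF8
</pre>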
18 |
|
|
|
19 |
maya |
455 |
<p> |
20 |
doda |
932 |
Tera Term does not fully support Unicode because its internal design is based on MBCS (Multi-Byte Character Set). Unicode characters are therefore converted in two steps, as follows. |
21 |
yutakapon |
760 |
|
22 |
|
|
<pre> |
23 |
|
|
UTF-8 <-----> Unicode(UTF-16LE) <-----> MBCS |
24 |
|
|
        (1)                       (2) |
25 |
|
|
</pre> |
26 |
|
|
|
27 |
doda |
932 |
(1): Tera Term cannot handle surrogate pairs, combining characters, or decomposed forms, because the application does not convert UTF-8 byte sequences longer than three bytes.<br> |
28 |
yutakapon |
760 |
(2): The user must specify the code page used to convert characters between Unicode and MBCS. A code page is a character set extended by Microsoft, and its number differs from one country or region to another.<br> |
29 |
|
|
Also, only the language of the localized Windows version can be used. For example, a language other than Japanese appears as indecipherable characters on the Japanese-language version of Windows. Likewise, Japanese cannot be displayed on the English-language version of Windows.</p> |
30 |
|
|
|
31 |
|
|
<p> |
32 |
|
|
To use Unicode with a localized language, set the Locale and CodePage parameters appropriately in the 'teraterm.ini' file. An example of these values follows. |
33 |
maya |
368 |
</p> |
34 |
|
|
|
35 |
maya |
331 |
<pre> |
36 |
maya |
368 |
---------------------- |
37 |
maya |
331 |
; Locale for Unicode |
38 |
|
|
Locale = japanese |
39 |
|
|
|
40 |
|
|
; CodePage for Unicode |
41 |
|
|
CodePage = 932 |
42 |
maya |
368 |
---------------------- |
43 |
|
|
</pre> |
44 |
maya |
331 |
|
45 |
maya |
368 |
<p> |
46 |
doda |
932 |
See the following web sites for more about the locale and code page values used in Tera Term:<br> |
47 |
maya |
368 |
<A HREF="http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vccore98/html/_crt_language_strings.asp">http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vccore98/html/_crt_language_strings.asp</A><br> |
48 |
|
|
<A HREF="http://msdn.microsoft.com/library/default.asp?url=/library/en-us/intl/unicode_81rn.asp">http://msdn.microsoft.com/library/default.asp?url=/library/en-us/intl/unicode_81rn.asp</A> |
49 |
|
|
</p> |
50 |
maya |
331 |
|
51 |
maya |
368 |
<pre> |
52 |
maya |
331 |
[Example for Windows XP, Simplified Chinese] |
53 |
maya |
368 |
----------------------------------------- |
54 |
maya |
455 |
; Locale for Unicode |
55 |
|
|
Locale = chs |
56 |
maya |
331 |
|
57 |
maya |
455 |
; CodePage for Unicode |
58 |
|
|
CodePage = 936 |
59 |
maya |
368 |
----------------------------------------- |
60 |
|
|
</pre> |
61 |
maya |
331 |
|
62 |
maya |
455 |
<pre> |
63 |
|
|
[Example for Windows XP, USA] |
64 |
|
|
----------------------------------------- |
65 |
|
|
; Locale for Unicode |
66 |
|
|
Locale = american |
67 |
|
|
|
68 |
|
|
; CodePage for Unicode |
69 |
|
|
CodePage = 65001 |
70 |
|
|
----------------------------------------- |
71 |
|
|
</pre> |
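<p>
By the same pattern, a Korean-language Windows could presumably use the values below. This is an untested sketch that simply pairs the "korean" language string with code page 949 from the tables at the end of this page.
</p>
<pre>
[Hypothetical example for Korean-language Windows]
-----------------------------------------
; Locale for Unicode (assumed: "korean" from the language string table)
Locale = korean

; CodePage for Unicode (assumed: 949, Korean Unified Hangeul Code)
CodePage = 949
-----------------------------------------
</pre>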
72 |
|
|
|
73 |
|
|
|
74 |
maya |
368 |
<p> |
75 |
|
|
[NOTE] For Mac OS X users<br> |
76 |
|
|
For Mac OS X (HFS+), use the "UTF-8m" encoding. Currently it supports receiving only.<br> |
77 |
|
|
To use this mode, specify "UTF8m" as the value of the command line parameter '/KR'. |
78 |
|
|
</p> |
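<p>
A sketch of such a command line follows. The '=' syntax and the executable name 'ttermpro.exe' are assumptions not stated on this page, and 'host.example.com' is a placeholder host name.
</p>
<pre>
rem host.example.com is a placeholder host name.
rem UTF8m applies to receiving only, so transmit uses plain UTF8 here.
ttermpro.exe host.example.com /KR=UTF8m /KT=UTF8
</pre>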
79 |
maya |
331 |
|
80 |
maya |
368 |
<p> |
81 |
|
|
[NOTE] Language Strings for Locale |
82 |
maya |
384 |
</p> |
83 |
maya |
368 |
<pre> |
84 |
|
|
Primary Sublanguage String |
85 |
maya |
331 |
---------------+--------------+------------------------------------------------------- |
86 |
maya |
368 |
Chinese Chinese "chinese" |
87 |
|
|
Chinese Chinese (simplified) "chinese-simplified" or "chs" |
88 |
|
|
Chinese Chinese (traditional) "chinese-traditional" or "cht" |
89 |
|
|
Czech Czech "csy" or "czech" |
90 |
|
|
Danish Danish "dan" or "danish" |
91 |
|
|
Dutch Dutch (Belgian) "belgian", "dutch-belgian", or "nlb" |
92 |
|
|
Dutch Dutch (default) "dutch" or "nld" |
93 |
|
|
English English (Australian) "australian", "ena", or "english-aus" |
94 |
|
|
English English (Canadian) "canadian", "enc", or "english-can" |
95 |
|
|
English English (default) "english" |
96 |
|
|
English English (New Zealand) "english-nz" or "enz" |
97 |
|
|
English English (UK) "eng", "english-uk", or "uk" |
98 |
|
|
English English (USA) "american", "american english", "american-english", "english-american", "english-us", "english-usa", "enu", "us", or "usa" |
99 |
|
|
Finnish Finnish "fin" or "finnish" |
100 |
|
|
French French (Belgian) "frb" or "french-belgian" |
101 |
|
|
French French (Canadian) "frc" or "french-canadian" |
102 |
|
|
French French (default) "fra" or "french" |
103 |
|
|
French French (Swiss) "french-swiss" or "frs" |
104 |
|
|
German German (Austrian) "dea" or "german-austrian" |
105 |
|
|
German German (default) "deu" or "german" |
106 |
|
|
German German (Swiss) "des", "german-swiss", or "swiss" |
107 |
|
|
Greek Greek "ell" or "greek" |
108 |
maya |
384 |
Hungarian Hungarian "hun" or "hungarian" |
109 |
|
|
Icelandic Icelandic "icelandic" or "isl" |
110 |
maya |
368 |
Italian Italian (default) "ita" or "italian" |
111 |
|
|
Italian Italian (Swiss) "italian-swiss" or "its" |
112 |
maya |
384 |
Japanese Japanese "japanese" or "jpn" |
113 |
maya |
368 |
Korean Korean "kor" or "korean" |
114 |
maya |
384 |
Norwegian Norwegian (Bokmal) "nor" or "norwegian-bokmal" |
115 |
|
|
Norwegian Norwegian (default) "norwegian" |
116 |
|
|
Norwegian Norwegian (Nynorsk) "non" or "norwegian-nynorsk" |
117 |
maya |
368 |
Polish Polish "plk" or "polish" |
118 |
maya |
384 |
Portuguese Portuguese (Brazil) "portuguese-brazilian" or "ptb" |
119 |
|
|
Portuguese Portuguese (default) "portuguese" or "ptg" |
120 |
maya |
368 |
Russian Russian (default) "rus" or "russian" |
121 |
|
|
Slovak Slovak "sky" or "slovak" |
122 |
|
|
Spanish Spanish (default) "esp" or "spanish" |
123 |
|
|
Spanish Spanish (Mexican) "esm" or "spanish-mexican" |
124 |
|
|
Spanish Spanish (Modern) "esn" or "spanish-modern" |
125 |
|
|
Swedish Swedish "sve" or "swedish" |
126 |
|
|
Turkish Turkish "trk" or "turkish" |
127 |
|
|
</pre> |
128 |
maya |
331 |
|
129 |
maya |
368 |
<p> |
130 |
|
|
[NOTE] Code-Page Identifiers |
131 |
maya |
384 |
</p> |
132 |
maya |
368 |
<pre> |
133 |
|
|
Identifier Name |
134 |
|
|
037 IBM EBCDIC - U.S./Canada |
135 |
|
|
437 OEM - United States |
136 |
|
|
500 IBM EBCDIC - International |
137 |
|
|
708 Arabic - ASMO 708 |
138 |
|
|
709 Arabic - ASMO 449+, BCON V4 |
139 |
|
|
710 Arabic - Transparent Arabic |
140 |
|
|
720 Arabic - Transparent ASMO |
141 |
|
|
737 OEM - Greek (formerly 437G) |
142 |
|
|
775 OEM - Baltic |
143 |
|
|
850 OEM - Multilingual Latin I |
144 |
|
|
852 OEM - Latin II |
145 |
|
|
855 OEM - Cyrillic (primarily Russian) |
146 |
|
|
857 OEM - Turkish |
147 |
|
|
858 OEM - Multilingual Latin I + Euro symbol |
148 |
|
|
860 OEM - Portuguese |
149 |
|
|
861 OEM - Icelandic |
150 |
|
|
862 OEM - Hebrew |
151 |
|
|
863 OEM - Canadian-French |
152 |
|
|
864 OEM - Arabic |
153 |
|
|
865 OEM - Nordic |
154 |
|
|
866 OEM - Russian |
155 |
|
|
869 OEM - Modern Greek |
156 |
|
|
870 IBM EBCDIC - Multilingual/ROECE (Latin-2) |
157 |
|
|
874 ANSI/OEM - Thai |
158 |
|
|
875 IBM EBCDIC - Modern Greek |
159 |
|
|
932 ANSI/OEM - Japanese, Shift-JIS |
160 |
|
|
936 ANSI/OEM - Simplified Chinese (PRC, Singapore) |
161 |
|
|
949 ANSI/OEM - Korean (Unified Hangeul Code) |
162 |
|
|
950 ANSI/OEM - Traditional Chinese (Taiwan; Hong Kong SAR, PRC) |
163 |
|
|
1026 IBM EBCDIC - Turkish (Latin-5) |
164 |
|
|
1047 IBM EBCDIC - Latin 1/Open System |
165 |
|
|
1140 IBM EBCDIC - U.S./Canada (037 + Euro symbol) |
166 |
|
|
1141 IBM EBCDIC - Germany (20273 + Euro symbol) |
167 |
|
|
1142 IBM EBCDIC - Denmark/Norway (20277 + Euro symbol) |
168 |
|
|
1143 IBM EBCDIC - Finland/Sweden (20278 + Euro symbol) |
169 |
|
|
1144 IBM EBCDIC - Italy (20280 + Euro symbol) |
170 |
|
|
1145 IBM EBCDIC - Latin America/Spain (20284 + Euro symbol) |
171 |
|
|
1146 IBM EBCDIC - United Kingdom (20285 + Euro symbol) |
172 |
|
|
1147 IBM EBCDIC - France (20297 + Euro symbol) |
173 |
|
|
1148 IBM EBCDIC - International (500 + Euro symbol) |
174 |
|
|
1149 IBM EBCDIC - Icelandic (20871 + Euro symbol) |
175 |
|
|
1200 Unicode UCS-2 Little-Endian (BMP of ISO 10646) |
176 |
|
|
1201 Unicode UCS-2 Big-Endian |
177 |
|
|
1250 ANSI - Central European |
178 |
|
|
1251 ANSI - Cyrillic |
179 |
|
|
1252 ANSI - Latin I |
180 |
|
|
1253 ANSI - Greek |
181 |
|
|
1254 ANSI - Turkish |
182 |
|
|
1255 ANSI - Hebrew |
183 |
|
|
1256 ANSI - Arabic |
184 |
|
|
1257 ANSI - Baltic |
185 |
|
|
1258 ANSI/OEM - Vietnamese |
186 |
|
|
1361 Korean (Johab) |
187 |
|
|
10000 MAC - Roman |
188 |
|
|
10001 MAC - Japanese |
189 |
|
|
10002 MAC - Traditional Chinese (Big5) |
190 |
|
|
10003 MAC - Korean |
191 |
|
|
10004 MAC - Arabic |
192 |
|
|
10005 MAC - Hebrew |
193 |
|
|
10006 MAC - Greek I |
194 |
|
|
10007 MAC - Cyrillic |
195 |
|
|
10008 MAC - Simplified Chinese (GB 2312) |
196 |
|
|
10010 MAC - Romania |
197 |
|
|
10017 MAC - Ukraine |
198 |
|
|
10021 MAC - Thai |
199 |
|
|
10029 MAC - Latin II |
200 |
|
|
10079 MAC - Icelandic |
201 |
|
|
10081 MAC - Turkish |
202 |
|
|
10082 MAC - Croatia |
203 |
|
|
12000 Unicode UCS-4 Little-Endian |
204 |
|
|
12001 Unicode UCS-4 Big-Endian |
205 |
|
|
20000 CNS - Taiwan |
206 |
|
|
20001 TCA - Taiwan |
207 |
|
|
20002 Eten - Taiwan |
208 |
|
|
20003 IBM5550 - Taiwan |
209 |
|
|
20004 TeleText - Taiwan |
210 |
|
|
20005 Wang - Taiwan |
211 |
|
|
20105 IA5 IRV International Alphabet No. 5 (7-bit) |
212 |
|
|
20106 IA5 German (7-bit) |
213 |
|
|
20107 IA5 Swedish (7-bit) |
214 |
|
|
20108 IA5 Norwegian (7-bit) |
215 |
|
|
20127 US-ASCII (7-bit) |
216 |
|
|
20261 T.61 |
217 |
|
|
20269 ISO 6937 Non-Spacing Accent |
218 |
|
|
20273 IBM EBCDIC - Germany |
219 |
|
|
20277 IBM EBCDIC - Denmark/Norway |
220 |
|
|
20278 IBM EBCDIC - Finland/Sweden |
221 |
|
|
20280 IBM EBCDIC - Italy |
222 |
|
|
20284 IBM EBCDIC - Latin America/Spain |
223 |
|
|
20285 IBM EBCDIC - United Kingdom |
224 |
|
|
20290 IBM EBCDIC - Japanese Katakana Extended |
225 |
|
|
20297 IBM EBCDIC - France |
226 |
|
|
20420 IBM EBCDIC - Arabic |
227 |
|
|
20423 IBM EBCDIC - Greek |
228 |
|
|
20424 IBM EBCDIC - Hebrew |
229 |
|
|
20833 IBM EBCDIC - Korean Extended |
230 |
|
|
20838 IBM EBCDIC - Thai |
231 |
|
|
20866 Russian - KOI8-R |
232 |
|
|
20871 IBM EBCDIC - Icelandic |
233 |
|
|
20880 IBM EBCDIC - Cyrillic (Russian) |
234 |
|
|
20905 IBM EBCDIC - Turkish |
235 |
|
|
20924 IBM EBCDIC - Latin-1/Open System (1047 + Euro symbol) |
236 |
maya |
384 |
20932 JIS X 0208-1990 &amp; 0212-1990 |
237 |
maya |
368 |
20936 Simplified Chinese (GB2312) |
238 |
|
|
21025 IBM EBCDIC - Cyrillic (Serbian, Bulgarian) |
239 |
|
|
21027 Extended Alpha Lowercase |
240 |
|
|
21866 Ukrainian (KOI8-U) |
241 |
|
|
28591 ISO 8859-1 Latin I |
242 |
|
|
28592 ISO 8859-2 Central Europe |
243 |
|
|
28593 ISO 8859-3 Latin 3 |
244 |
|
|
28594 ISO 8859-4 Baltic |
245 |
|
|
28595 ISO 8859-5 Cyrillic |
246 |
|
|
28596 ISO 8859-6 Arabic |
247 |
|
|
28597 ISO 8859-7 Greek |
248 |
|
|
28598 ISO 8859-8 Hebrew |
249 |
|
|
28599 ISO 8859-9 Latin 5 |
250 |
|
|
28605 ISO 8859-15 Latin 9 |
251 |
|
|
29001 Europa 3 |
252 |
|
|
38598 ISO 8859-8 Hebrew |
253 |
|
|
50220 ISO 2022 Japanese with no halfwidth Katakana |
254 |
|
|
50221 ISO 2022 Japanese with halfwidth Katakana |
255 |
|
|
50222 ISO 2022 Japanese JIS X 0201-1989 |
256 |
|
|
50225 ISO 2022 Korean |
257 |
|
|
50227 ISO 2022 Simplified Chinese |
258 |
|
|
50229 ISO 2022 Traditional Chinese |
259 |
|
|
50930 Japanese (Katakana) Extended |
260 |
|
|
50931 US/Canada and Japanese |
261 |
|
|
50933 Korean Extended and Korean |
262 |
|
|
50935 Simplified Chinese Extended and Simplified Chinese |
263 |
|
|
50936 Simplified Chinese |
264 |
|
|
50937 US/Canada and Traditional Chinese |
265 |
|
|
50939 Japanese (Latin) Extended and Japanese |
266 |
|
|
51932 EUC - Japanese |
267 |
|
|
51936 EUC - Simplified Chinese |
268 |
|
|
51949 EUC - Korean |
269 |
|
|
51950 EUC - Traditional Chinese |
270 |
|
|
52936 HZ-GB2312 Simplified Chinese |
271 |
|
|
54936 Windows XP: GB18030 Simplified Chinese (4 Byte) |
272 |
|
|
57002 ISCII Devanagari |
273 |
|
|
57003 ISCII Bengali |
274 |
|
|
57004 ISCII Tamil |
275 |
|
|
57005 ISCII Telugu |
276 |
|
|
57006 ISCII Assamese |
277 |
|
|
57007 ISCII Oriya |
278 |
|
|
57008 ISCII Kannada |
279 |
|
|
57009 ISCII Malayalam |
280 |
|
|
57010 ISCII Gujarati |
281 |
|
|
57011 ISCII Punjabi |
282 |
|
|
65000 Unicode UTF-7 |
283 |
|
|
65001 Unicode UTF-8 |
284 |
maya |
331 |
</pre> |
285 |
|
|
|
286 |
|
|
</BODY> |
287 |
|
|
</HTML> |