<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<HTML>
<HEAD>
<meta http-equiv="Content-Type" content="text/html; charset=Shift_JIS">
<TITLE>Unicode Settings</TITLE>
<META http-equiv="Content-Style-Type" content="text/css">
<link rel="stylesheet" href="../style.css" type="text/css">
</HEAD>
<BODY>

<h1>Unicode Settings</h1>

<p>
To enable UTF-8, choose "General" from the "Setup" menu of Tera Term Pro and change Language to "Japanese".
Next, choose "Terminal" from the "Setup" menu and, in the dialog that appears, select "UTF-8" for both "Kanji(receive)" and "Kanji(transmit)". Tera Term Pro does not need to be restarted.<br>
Specifying "UTF8" for the "/KT" and "/KR" command-line options sets the transmit and receive encodings, respectively, to UTF-8.
</p>
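<p>
As an illustration of the command-line form, the following Python sketch builds an argument list that passes both options in the "/KT=UTF8" and "/KR=UTF8" form. The executable name "ttermpro.exe", the host name, and the helper function are assumptions made for this example; only the two options and the "UTF8" value come from the text above.
</p>
<pre>
```python
def build_teraterm_args(host, exe="ttermpro.exe"):
    """Return an argument list that requests UTF-8 for transmit and receive.

    'exe' and 'host' are placeholders for this sketch (assumptions);
    adjust them to the actual installation and target.
    """
    return [exe, host, "/KT=UTF8", "/KR=UTF8"]


if __name__ == "__main__":
    # Prints the command line; the list could be handed to subprocess.Popen.
    print(" ".join(build_teraterm_args("example.com")))
```
</pre>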
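
<p>
Tera Term is not currently designed around Unicode internally, so character codes are converted in two stages, as shown below.
</p>

<pre>
UTF-8 &lt;-----&gt; Unicode(UTF-16LE) &lt;-----&gt; MBCS
        (1)                       (2)
</pre>

<p>
In stage (1), only UTF-8 sequences of up to 3 bytes are converted. Such sequences cover only U+0000 through U+FFFF, so characters that require a surrogate pair are not supported. Combining characters and decomposed forms are not supported either.<br>
In stage (2), a code page must be specified in order to convert between Unicode and MBCS (Multi-Byte Character Set). A code page is a number that identifies a character set as extended by Microsoft, and one is assigned to each language or region.<br>
Consequently, a localized version of Windows can handle only its own language. For example, on Japanese Windows, languages other than Japanese are displayed as garbled text. Likewise, Japanese cannot be handled on English Windows.
</p>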
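<p>
The two-stage conversion described above can be mimicked with Python's built-in codecs. This is only a sketch under stated assumptions, not Tera Term's actual code: the function name is hypothetical, and the Python codecs "utf-16-le" and "cp932" stand in for the internal Unicode form and for code page 932.
</p>
<pre>
```python
def utf8_to_mbcs(data, codepage="cp932"):
    """Convert UTF-8 bytes to MBCS bytes by way of UTF-16LE.

    Stage (1): UTF-8 to UTF-16LE. Stage (2): UTF-16LE to the code page.
    Returns None when the text cannot be represented in the code page.
    (Hypothetical helper; the codec names are assumptions of this sketch.)
    """
    utf16 = data.decode("utf-8").encode("utf-16-le")       # stage (1)
    try:
        return utf16.decode("utf-16-le").encode(codepage)  # stage (2)
    except UnicodeEncodeError:
        return None


if __name__ == "__main__":
    japanese = "\u65e5\u672c\u8a9e".encode("utf-8")   # "Japanese" in kanji
    korean = "\ud55c\uad6d\uc5b4".encode("utf-8")     # "Korean" in Hangul
    print(utf8_to_mbcs(japanese))   # representable in code page 932
    print(utf8_to_mbcs(korean))     # None: Hangul is not in code page 932

    # A character beyond U+FFFF needs 4 bytes in UTF-8 and a surrogate
    # pair (two 16-bit units, 4 bytes) in UTF-16.
    beyond_bmp = "\U00020000"
    print(len(beyond_bmp.encode("utf-8")), len(beyond_bmp.encode("utf-16-le")))
```
</pre>
<p>
The second call illustrates why text in another language is lost at stage (2): the code page simply has no slot for those characters.
</p>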

<p>
To handle a localized language through Unicode, the locale and code page must be set in the teraterm.ini file. A sample is shown below.
</p>

<pre>
----------------------
; Locale for Unicode
Locale = japanese

; CodePage for Unicode
CodePage = 932
----------------------
</pre>
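<p>
To show how these two keys might be read and checked outside Tera Term, here is a minimal Python sketch. The "[Tera Term]" section name, the helper names, and the mapping from a code page number to a Python codec name are assumptions of this example and are not part of the sample above.
</p>
<pre>
```python
import codecs
import configparser

# Sample settings; the section header is an assumption of this sketch.
SAMPLE = """\
[Tera Term]
; Locale for Unicode
Locale = japanese

; CodePage for Unicode
CodePage = 932
"""


def read_unicode_settings(ini_text, section="Tera Term"):
    """Return (locale, codepage) parsed from INI text."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    return parser.get(section, "Locale"), parser.getint(section, "CodePage")


def python_codec_for(codepage):
    """Map a Windows code page number to a Python codec name."""
    name = "utf-8" if codepage == 65001 else "cp%d" % codepage
    return codecs.lookup(name).name  # raises LookupError if unknown


if __name__ == "__main__":
    locale_name, codepage = read_unicode_settings(SAMPLE)
    print(locale_name, codepage, python_codec_for(codepage))
```
</pre>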

<p>
For the values that can be set for the locale and code page, see the following pages.<br>
<A HREF="http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vccore98/html/_crt_language_strings.asp">http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vccore98/html/_crt_language_strings.asp</A><br>
<A HREF="http://msdn.microsoft.com/library/default.asp?url=/library/en-us/intl/unicode_81rn.asp">http://msdn.microsoft.com/library/default.asp?url=/library/en-us/intl/unicode_81rn.asp</A>
</p>

<pre>
[Example of Windows XP Simplified Chinese]
-----------------------------------------
; Locale for Unicode
Locale = chs

; CodePage for Unicode
CodePage = 936
-----------------------------------------
</pre>

<pre>
[Example of Windows XP USA]
-----------------------------------------
; Locale for Unicode
Locale = american

; CodePage for Unicode
CodePage = 65001
-----------------------------------------
</pre>

<p>
Note: Mac OS X<br>
"UTF-8m" is an encoding setting intended for Mac OS X (HFS+). It is supported for receiving only.<br>
"UTF8m" can be specified for the "/KR" command-line option.
</p>
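<p>
As background for this setting: HFS+ stores file names in a decomposed form (a variant of Unicode NFD), so a voiced kana, for example, arrives as a base character followed by a combining mark. The Python sketch below shows such a sequence being recomposed with the standard "unicodedata" module. It illustrates the general idea only and is not Tera Term's implementation; the helper name is hypothetical.
</p>
<pre>
```python
import unicodedata


def compose(text):
    """Recombine base characters and combining marks (NFC normalization)."""
    return unicodedata.normalize("NFC", text)


if __name__ == "__main__":
    # HIRAGANA KA (U+304B) followed by the combining voiced sound mark
    # (U+3099), as a decomposed file name would carry it.
    decomposed = "\u304b\u3099"
    composed = compose(decomposed)
    print(len(decomposed), len(composed))   # 2 1
    print(composed == "\u304c")             # True: HIRAGANA GA
```
</pre>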

<p>
[NOTE] Language Strings for Locale
</p>
<pre>
Primary     Sublanguage             String
-----------+-----------------------+-------------------------------------------------------
Chinese     Chinese                 "chinese"
Chinese     Chinese (simplified)    "chinese-simplified" or "chs"
Chinese     Chinese (traditional)   "chinese-traditional" or "cht"
Czech       Czech                   "csy" or "czech"
Danish      Danish                  "dan" or "danish"
Dutch       Dutch (Belgian)         "belgian", "dutch-belgian", or "nlb"
Dutch       Dutch (default)         "dutch" or "nld"
English     English (Australian)    "australian", "ena", or "english-aus"
English     English (Canadian)      "canadian", "enc", or "english-can"
English     English (default)       "english"
English     English (New Zealand)   "english-nz" or "enz"
English     English (UK)            "eng", "english-uk", or "uk"
English     English (USA)           "american", "american english", "american-english", "english-american", "english-us", "english-usa", "enu", "us", or "usa"
Finnish     Finnish                 "fin" or "finnish"
French      French (Belgian)        "frb" or "french-belgian"
French      French (Canadian)       "frc" or "french-canadian"
French      French (default)        "fra" or "french"
French      French (Swiss)          "french-swiss" or "frs"
German      German (Austrian)       "dea" or "german-austrian"
German      German (default)        "deu" or "german"
German      German (Swiss)          "des", "german-swiss", or "swiss"
Greek       Greek                   "ell" or "greek"
Hungarian   Hungarian               "hun" or "hungarian"
Icelandic   Icelandic               "icelandic" or "isl"
Italian     Italian (default)       "ita" or "italian"
Italian     Italian (Swiss)         "italian-swiss" or "its"
Japanese    Japanese                "japanese" or "jpn"
Korean      Korean                  "kor" or "korean"
Norwegian   Norwegian (Bokmal)      "nor" or "norwegian-bokmal"
Norwegian   Norwegian (default)     "norwegian"
Norwegian   Norwegian (Nynorsk)     "non" or "norwegian-nynorsk"
Polish      Polish                  "plk" or "polish"
Portuguese  Portuguese (Brazil)     "portuguese-brazilian" or "ptb"
Portuguese  Portuguese (default)    "portuguese" or "ptg"
Russian     Russian (default)       "rus" or "russian"
Slovak      Slovak                  "sky" or "slovak"
Spanish     Spanish (default)       "esp" or "spanish"
Spanish     Spanish (Mexican)       "esm" or "spanish-mexican"
Spanish     Spanish (Modern)        "esn" or "spanish-modern"
Swedish     Swedish                 "sve" or "swedish"
Turkish     Turkish                 "trk" or "turkish"
</pre>

<p>
[NOTE] Code-Page Identifiers
</p>
<pre>
Identifier  Name
037         IBM EBCDIC - U.S./Canada
437         OEM - United States
500         IBM EBCDIC - International
708         Arabic - ASMO 708
709         Arabic - ASMO 449+, BCON V4
710         Arabic - Transparent Arabic
720         Arabic - Transparent ASMO
737         OEM - Greek (formerly 437G)
775         OEM - Baltic
850         OEM - Multilingual Latin I
852         OEM - Latin II
855         OEM - Cyrillic (primarily Russian)
857         OEM - Turkish
858         OEM - Multilingual Latin I + Euro symbol
860         OEM - Portuguese
861         OEM - Icelandic
862         OEM - Hebrew
863         OEM - Canadian-French
864         OEM - Arabic
865         OEM - Nordic
866         OEM - Russian
869         OEM - Modern Greek
870         IBM EBCDIC - Multilingual/ROECE (Latin-2)
874         ANSI/OEM - Thai
875         IBM EBCDIC - Modern Greek
932         ANSI/OEM - Japanese, Shift-JIS
936         ANSI/OEM - Simplified Chinese (PRC, Singapore)
949         ANSI/OEM - Korean (Unified Hangeul Code)
950         ANSI/OEM - Traditional Chinese (Taiwan; Hong Kong SAR, PRC)
1026        IBM EBCDIC - Turkish (Latin-5)
1047        IBM EBCDIC - Latin 1/Open System
1140        IBM EBCDIC - U.S./Canada (037 + Euro symbol)
1141        IBM EBCDIC - Germany (20273 + Euro symbol)
1142        IBM EBCDIC - Denmark/Norway (20277 + Euro symbol)
1143        IBM EBCDIC - Finland/Sweden (20278 + Euro symbol)
1144        IBM EBCDIC - Italy (20280 + Euro symbol)
1145        IBM EBCDIC - Latin America/Spain (20284 + Euro symbol)
1146        IBM EBCDIC - United Kingdom (20285 + Euro symbol)
1147        IBM EBCDIC - France (20297 + Euro symbol)
1148        IBM EBCDIC - International (500 + Euro symbol)
1149        IBM EBCDIC - Icelandic (20871 + Euro symbol)
1200        Unicode UCS-2 Little-Endian (BMP of ISO 10646)
1201        Unicode UCS-2 Big-Endian
1250        ANSI - Central European
1251        ANSI - Cyrillic
1252        ANSI - Latin I
1253        ANSI - Greek
1254        ANSI - Turkish
1255        ANSI - Hebrew
1256        ANSI - Arabic
1257        ANSI - Baltic
1258        ANSI/OEM - Vietnamese
1361        Korean (Johab)
10000       MAC - Roman
10001       MAC - Japanese
10002       MAC - Traditional Chinese (Big5)
10003       MAC - Korean
10004       MAC - Arabic
10005       MAC - Hebrew
10006       MAC - Greek I
10007       MAC - Cyrillic
10008       MAC - Simplified Chinese (GB 2312)
10010       MAC - Romania
10017       MAC - Ukraine
10021       MAC - Thai
10029       MAC - Latin II
10079       MAC - Icelandic
10081       MAC - Turkish
10082       MAC - Croatia
12000       Unicode UCS-4 Little-Endian
12001       Unicode UCS-4 Big-Endian
20000       CNS - Taiwan
20001       TCA - Taiwan
20002       Eten - Taiwan
20003       IBM5550 - Taiwan
20004       TeleText - Taiwan
20005       Wang - Taiwan
20105       IA5 IRV International Alphabet No. 5 (7-bit)
20106       IA5 German (7-bit)
20107       IA5 Swedish (7-bit)
20108       IA5 Norwegian (7-bit)
20127       US-ASCII (7-bit)
20261       T.61
20269       ISO 6937 Non-Spacing Accent
20273       IBM EBCDIC - Germany
20277       IBM EBCDIC - Denmark/Norway
20278       IBM EBCDIC - Finland/Sweden
20280       IBM EBCDIC - Italy
20284       IBM EBCDIC - Latin America/Spain
20285       IBM EBCDIC - United Kingdom
20290       IBM EBCDIC - Japanese Katakana Extended
20297       IBM EBCDIC - France
20420       IBM EBCDIC - Arabic
20423       IBM EBCDIC - Greek
20424       IBM EBCDIC - Hebrew
20833       IBM EBCDIC - Korean Extended
20838       IBM EBCDIC - Thai
20866       Russian - KOI8-R
20871       IBM EBCDIC - Icelandic
20880       IBM EBCDIC - Cyrillic (Russian)
20905       IBM EBCDIC - Turkish
20924       IBM EBCDIC - Latin-1/Open System (1047 + Euro symbol)
20932       JIS X 0208-1990 &amp; 0212-1990
20936       Simplified Chinese (GB2312)
21025       IBM EBCDIC - Cyrillic (Serbian, Bulgarian)
21027       Extended Alpha Lowercase
21866       Ukrainian (KOI8-U)
28591       ISO 8859-1 Latin I
28592       ISO 8859-2 Central Europe
28593       ISO 8859-3 Latin 3
28594       ISO 8859-4 Baltic
28595       ISO 8859-5 Cyrillic
28596       ISO 8859-6 Arabic
28597       ISO 8859-7 Greek
28598       ISO 8859-8 Hebrew
28599       ISO 8859-9 Latin 5
28605       ISO 8859-15 Latin 9
29001       Europa 3
38598       ISO 8859-8 Hebrew
50220       ISO 2022 Japanese with no halfwidth Katakana
50221       ISO 2022 Japanese with halfwidth Katakana
50222       ISO 2022 Japanese JIS X 0201-1989
50225       ISO 2022 Korean
50227       ISO 2022 Simplified Chinese
50229       ISO 2022 Traditional Chinese
50930       Japanese (Katakana) Extended
50931       US/Canada and Japanese
50933       Korean Extended and Korean
50935       Simplified Chinese Extended and Simplified Chinese
50936       Simplified Chinese
50937       US/Canada and Traditional Chinese
50939       Japanese (Latin) Extended and Japanese
51932       EUC - Japanese
51936       EUC - Simplified Chinese
51949       EUC - Korean
51950       EUC - Traditional Chinese
52936       HZ-GB2312 Simplified Chinese
54936       Windows XP: GB18030 Simplified Chinese (4 Byte)
57002       ISCII Devanagari
57003       ISCII Bengali
57004       ISCII Tamil
57005       ISCII Telugu
57006       ISCII Assamese
57007       ISCII Oriya
57008       ISCII Kannada
57009       ISCII Malayalam
57010       ISCII Gujarati
57011       ISCII Punjabi
65000       Unicode UTF-7
65001       Unicode UTF-8
</pre>

</BODY>
</HTML>