This file is indexed.

/usr/share/debian-reference/ch08.en.html is in debian-reference-en 2.53.

This file is owned by root:root, with mode 0o644.

The actual contents of the file can be viewed below.

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>Chapter 8. I18N and L10N</title>
    <link rel="stylesheet" type="text/css" href="debian-reference.css"/>
    <meta name="generator" content="DocBook XSL Stylesheets V1.78.1"/>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
    <link rel="home" href="index.en.html" title="Debian Reference"/>
    <link rel="up" href="index.en.html" title="Debian Reference"/>
    <link rel="prev" href="ch07.en.html" title="Chapter 7. The X Window System"/>
    <link rel="next" href="ch09.en.html" title="Chapter 9. System tips"/>
  </head>
  <body>
    <div class="navheader">
      <table width="100%" summary="Navigation header">
        <tr>
          <th colspan="3" align="center">Chapter 8. I18N and L10N</th>
        </tr>
        <tr>
          <td align="left"><a accesskey="p" href="ch07.en.html"><img src="images/prev.gif" alt="Prev"/></a> </td>
          <th width="60%" align="center"> </th>
          <td align="right"> <a accesskey="n" href="ch09.en.html"><img src="images/next.gif" alt="Next"/></a></td>
        </tr>
      </table>
      <hr/>
    </div>
    <div class="chapter">
      <div class="titlepage">
        <div>
          <div>
            <h1 class="title"><a id="_i18n_and_l10n"/>Chapter 8. I18N and L10N</h1>
          </div>
        </div>
      </div>
      <div class="toc">
        <p>
          <strong>Table of Contents</strong>
        </p>
        <dl class="toc">
          <dt>
            <span class="section">
              <a href="ch08.en.html#_the_keyboard_input">8.1. The keyboard input</a>
            </span>
          </dt>
          <dd>
            <dl>
              <dt>
                <span class="section">
                  <a href="ch08.en.html#_the_input_method_support_with_ibus">8.1.1. The input method support with IBus</a>
                </span>
              </dt>
              <dt>
                <span class="section">
                  <a href="ch08.en.html#_an_example_for_japanese">8.1.2. An example for Japanese</a>
                </span>
              </dt>
              <dt>
                <span class="section">
                  <a href="ch08.en.html#_disabling_the_input_method">8.1.3. Disabling the input method</a>
                </span>
              </dt>
            </dl>
          </dd>
          <dt>
            <span class="section">
              <a href="ch08.en.html#_the_display_output">8.2. The display output</a>
            </span>
          </dt>
          <dt>
            <span class="section">
              <a href="ch08.en.html#_the_locale">8.3. The locale</a>
            </span>
          </dt>
          <dd>
            <dl>
              <dt>
                <span class="section">
                  <a href="ch08.en.html#_basics_of_encoding">8.3.1. Basics of encoding</a>
                </span>
              </dt>
              <dt>
                <span class="section">
                  <a href="ch08.en.html#_rationale_for_utf_8_locale">8.3.2. Rationale for UTF-8 locale</a>
                </span>
              </dt>
              <dt>
                <span class="section">
                  <a href="ch08.en.html#_the_reconfiguration_of_the_locale">8.3.3. The reconfiguration of the locale</a>
                </span>
              </dt>
              <dt>
                <span class="section">
                  <a href="ch08.en.html#_the_value_of_the_literal_lang_literal_environment_variable">8.3.4. The value of the "<code class="literal">$LANG</code>" environment variable</a>
                </span>
              </dt>
              <dt>
                <span class="section">
                  <a href="ch08.en.html#_specific_locale_only_under_x_window">8.3.5. Specific locale only under X Window</a>
                </span>
              </dt>
              <dt>
                <span class="section">
                  <a href="ch08.en.html#_filename_encoding">8.3.6. Filename encoding</a>
                </span>
              </dt>
              <dt>
                <span class="section">
                  <a href="ch08.en.html#_localized_messages_and_translated_documentation">8.3.7. Localized messages and translated documentation</a>
                </span>
              </dt>
              <dt>
                <span class="section">
                  <a href="ch08.en.html#_effects_of_the_locale">8.3.8. Effects of the locale</a>
                </span>
              </dt>
            </dl>
          </dd>
        </dl>
      </div>
      <p><a class="ulink" href="http://en.wikipedia.org/wiki/Internationalization_and_localization">Multilingualization (M17N) or Native Language Support</a> for an application software is done in 2 steps.</p>
      <div class="itemizedlist">
        <ul class="itemizedlist">
          <li class="listitem">
            <p>
Internationalization (I18N): making software able to handle multiple locales.
</p>
          </li>
          <li class="listitem">
            <p>
Localization (L10N): making software handle a specific locale.
</p>
          </li>
        </ul>
      </div>
      <div class="tip" style="margin-left: 0.5in; margin-right: 0.5in;">
        <table border="0" summary="Tip">
          <tr>
            <td rowspan="2" align="center" valign="top">
              <img alt="[Tip]" src="images/tip.png"/>
            </td>
            <th align="left">Tip</th>
          </tr>
          <tr>
            <td align="left" valign="top">
              <p>There are 17, 18, or 10 letters between "m" and "n", "i" and "n", or "l" and "n" in multilingualization, internationalization, and localization which correspond to M17N, I18N, and L10N.</p>
            </td>
          </tr>
        </table>
      </div>
      <p>Modern software such as GNOME and KDE is multilingualized.  It is internationalized by making it handle <a class="ulink" href="http://en.wikipedia.org/wiki/UTF-8">UTF-8</a> data and localized by providing translated messages through the <span class="citerefentry"><span class="refentrytitle">gettext</span>(1)</span> infrastructure.  Translated messages may be provided as separate localization packages.  They can be selected simply by setting the pertinent environment variables to the appropriate locale.</p>
      <p>The simplest representation of text data is <span class="strong"><strong>ASCII</strong></span>, which is sufficient for English and defines only 128 characters (representable with 7 bits).  To support the many more characters needed for international use, many character encoding systems have been invented.  The modern and sensible encoding system is <span class="strong"><strong>UTF-8</strong></span>, which can handle practically all the characters known to humans (see <a class="xref" href="ch08.en.html#_basics_of_encoding" title="8.3.1. Basics of encoding">Section 8.3.1, “Basics of encoding”</a>).</p>
      <p>See <a class="ulink" href="http://www.debian.org/doc/manuals/intro-i18n/">Introduction to i18n</a> for details.</p>
      <p>The international hardware support is enabled with localized hardware configuration data.</p>
      <div class="section">
        <div class="titlepage">
          <div>
            <div>
              <h2 class="title"><a id="_the_keyboard_input"/>8.1. The keyboard input</h2>
            </div>
          </div>
        </div>
        <p>The Debian system can be configured to work with many international keyboard arrangements.</p>
        <div class="table">
          <a id="listofkeyboardreigurationmethods"/>
          <p class="title">
            <strong>Table 8.1. List of keyboard reconfiguration methods</strong>
          </p>
          <div class="table-contents">
            <table summary="List of keyboard reconfiguration methods" border="1">
              <colgroup>
                <col style="text-align: left"/>
                <col style="text-align: left"/>
              </colgroup>
              <thead>
                <tr>
                  <th style="text-align: left">
    environment
    </th>
                  <th style="text-align: left">
    command
    </th>
                </tr>
              </thead>
              <tbody>
                <tr>
                  <td style="text-align: left">
    Linux console
    </td>
                  <td style="text-align: left">
                <code class="literal">dpkg-reconfigure --priority=low console-data</code>
              </td>
                </tr>
                <tr>
                  <td style="text-align: left">
    X Window
    </td>
                  <td style="text-align: left">
                <code class="literal">dpkg-reconfigure --priority=low xserver-xorg</code>
              </td>
                </tr>
              </tbody>
            </table>
          </div>
        </div>
        <br class="table-break"/>
        <p>This supports keyboard input for accented characters of many European languages with its dead-key function. For Asian languages, you need more complicated <a class="ulink" href="http://en.wikipedia.org/wiki/Input_method">input method</a> support such as <a class="ulink" href="http://en.wikipedia.org/wiki/Intelligent_Input_Bus">IBus</a> discussed next.</p>
        <div class="section">
          <div class="titlepage">
            <div>
              <div>
                <h3 class="title"><a id="_the_input_method_support_with_ibus"/>8.1.1. The input method support with IBus</h3>
              </div>
            </div>
          </div>
          <p>Setup of multilingual input for the Debian system is simplified by using the <a class="ulink" href="http://en.wikipedia.org/wiki/Intelligent_Input_Bus">IBus</a> family of packages with the <code class="literal">im-config</code> package. The IBus packages are listed below.</p>
          <div class="table">
            <a id="listofinputmethosupportswithibus"/>
            <p class="title">
              <strong>Table 8.2. List of input method supports with IBus</strong>
            </p>
            <div class="table-contents">
              <table summary="List of input method supports with IBus" border="1">
                <colgroup>
                  <col style="text-align: left"/>
                  <col style="text-align: left"/>
                  <col style="text-align: left"/>
                  <col style="text-align: left"/>
                </colgroup>
                <thead>
                  <tr>
                    <th style="text-align: left">
    package
    </th>
                    <th style="text-align: left">
    popcon
    </th>
                    <th style="text-align: left">
    size
    </th>
                    <th style="text-align: left">
    supported locale
    </th>
                  </tr>
                </thead>
                <tbody>
                  <tr>
                    <td style="text-align: left">
                      <a class="ulink" href="http://packages.debian.org/sid/ibus">
    ibus
    </a>
                    </td>
                    <td style="text-align: left">
                      <a class="ulink" href="http://qa.debian.org/popcon.php?package=ibus">V:6, I:9</a>
                    </td>
                    <td style="text-align: left">
                      <a class="ulink" href="http://packages.qa.debian.org/i/ibus.html">1998</a>
                    </td>
                    <td style="text-align: left">
    input method framework using dbus
    </td>
                  </tr>
                  <tr>
                    <td style="text-align: left">
                      <a class="ulink" href="http://packages.debian.org/sid/ibus-mozc">
    ibus-mozc
    </a>
                    </td>
                    <td style="text-align: left">
                      <a class="ulink" href="http://qa.debian.org/popcon.php?package=ibus-mozc">V:1, I:1</a>
                    </td>
                    <td style="text-align: left">
                      <a class="ulink" href="http://packages.qa.debian.org/i/ibus-mozc.html">886</a>
                    </td>
                    <td style="text-align: left">
    Japanese
    </td>
                  </tr>
                  <tr>
                    <td style="text-align: left">
                      <a class="ulink" href="http://packages.debian.org/sid/ibus-anthy">
    ibus-anthy
    </a>
                    </td>
                    <td style="text-align: left">
                      <a class="ulink" href="http://qa.debian.org/popcon.php?package=ibus-anthy">V:1, I:3</a>
                    </td>
                    <td style="text-align: left">
                      <a class="ulink" href="http://packages.qa.debian.org/i/ibus-anthy.html">719</a>
                    </td>
                    <td style="text-align: left">
    Japanese
    </td>
                  </tr>
                  <tr>
                    <td style="text-align: left">
                      <a class="ulink" href="http://packages.debian.org/sid/ibus-skk">
    ibus-skk
    </a>
                    </td>
                    <td style="text-align: left">
                      <a class="ulink" href="http://qa.debian.org/popcon.php?package=ibus-skk">V:0, I:0</a>
                    </td>
                    <td style="text-align: left">
                      <a class="ulink" href="http://packages.qa.debian.org/i/ibus-skk.html">230</a>
                    </td>
                    <td style="text-align: left">
    Japanese
    </td>
                  </tr>
                  <tr>
                    <td style="text-align: left">
                      <a class="ulink" href="http://packages.debian.org/sid/ibus-pinyin">
    ibus-pinyin
    </a>
                    </td>
                    <td style="text-align: left">
                      <a class="ulink" href="http://qa.debian.org/popcon.php?package=ibus-pinyin">V:1, I:2</a>
                    </td>
                    <td style="text-align: left">
                      <a class="ulink" href="http://packages.qa.debian.org/i/ibus-pinyin.html">1437</a>
                    </td>
                    <td style="text-align: left">
    Chinese (for zh_CN)
    </td>
                  </tr>
                  <tr>
                    <td style="text-align: left">
                      <a class="ulink" href="http://packages.debian.org/sid/ibus-chewing">
    ibus-chewing
    </a>
                    </td>
                    <td style="text-align: left">
                      <a class="ulink" href="http://qa.debian.org/popcon.php?package=ibus-chewing">V:0, I:0</a>
                    </td>
                    <td style="text-align: left">
                      <a class="ulink" href="http://packages.qa.debian.org/i/ibus-chewing.html">213</a>
                    </td>
                    <td style="text-align: left">
    Chinese (for zh_TW)
    </td>
                  </tr>
                  <tr>
                    <td style="text-align: left">
                      <a class="ulink" href="http://packages.debian.org/sid/ibus-hangul">
    ibus-hangul
    </a>
                    </td>
                    <td style="text-align: left">
                      <a class="ulink" href="http://qa.debian.org/popcon.php?package=ibus-hangul">V:0, I:0</a>
                    </td>
                    <td style="text-align: left">
                      <a class="ulink" href="http://packages.qa.debian.org/i/ibus-hangul.html">292</a>
                    </td>
                    <td style="text-align: left">
    Korean
    </td>
                  </tr>
                  <tr>
                    <td style="text-align: left">
                      <a class="ulink" href="http://packages.debian.org/sid/ibus-table">
    ibus-table
    </a>
                    </td>
                    <td style="text-align: left">
                      <a class="ulink" href="http://qa.debian.org/popcon.php?package=ibus-table">V:1, I:2</a>
                    </td>
                    <td style="text-align: left">
                      <a class="ulink" href="http://packages.qa.debian.org/i/ibus-table.html">706</a>
                    </td>
                    <td style="text-align: left">
    table engine for IBus
    </td>
                  </tr>
                  <tr>
                    <td style="text-align: left">
                      <a class="ulink" href="http://packages.debian.org/sid/ibus-table-thai">
    ibus-table-thai
    </a>
                    </td>
                    <td style="text-align: left">
                      <a class="ulink" href="http://qa.debian.org/popcon.php?package=ibus-table-thai">I:0</a>
                    </td>
                    <td style="text-align: left">
                      <a class="ulink" href="http://packages.qa.debian.org/i/ibus-table-thai.html">143</a>
                    </td>
                    <td style="text-align: left">
    Thai
    </td>
                  </tr>
                  <tr>
                    <td style="text-align: left">
                      <a class="ulink" href="http://packages.debian.org/sid/ibus-unikey">
    ibus-unikey
    </a>
                    </td>
                    <td style="text-align: left">
                      <a class="ulink" href="http://qa.debian.org/popcon.php?package=ibus-unikey">V:0, I:0</a>
                    </td>
                    <td style="text-align: left">
                      <a class="ulink" href="http://packages.qa.debian.org/i/ibus-unikey.html">276</a>
                    </td>
                    <td style="text-align: left">
    Vietnamese
    </td>
                  </tr>
                  <tr>
                    <td style="text-align: left">
                      <a class="ulink" href="http://packages.debian.org/sid/ibus-m17n">
    ibus-m17n
    </a>
                    </td>
                    <td style="text-align: left">
                      <a class="ulink" href="http://qa.debian.org/popcon.php?package=ibus-m17n">V:0, I:0</a>
                    </td>
                    <td style="text-align: left">
                      <a class="ulink" href="http://packages.qa.debian.org/i/ibus-m17n.html">163</a>
                    </td>
                    <td style="text-align: left">
    Multilingual: Indic, Arabic and others
    </td>
                  </tr>
                </tbody>
              </table>
            </div>
          </div>
          <br class="table-break"/>
          <p>The kinput2 method and other locale-dependent classic Asian <a class="ulink" href="http://en.wikipedia.org/wiki/Input_method">input methods</a> still exist but are not recommended for the modern UTF-8 X environment.  The <a class="ulink" href="http://en.wikipedia.org/wiki/Smart_Common_Input_Method">SCIM</a> and <a class="ulink" href="http://en.wikipedia.org/wiki/Uim">uim</a> tool chains are a slightly older approach to international input methods for the modern UTF-8 X environment.</p>
        </div>
        <div class="section">
          <div class="titlepage">
            <div>
              <div>
                <h3 class="title"><a id="_an_example_for_japanese"/>8.1.2. An example for Japanese</h3>
              </div>
            </div>
          </div>
          <p>I find it very useful to run the Japanese input method under an English environment ("<code class="literal">en_US.UTF-8</code>").  Here is how I did this with IBus for GNOME3:</p>
          <div class="orderedlist">
            <ol class="orderedlist">
              <li class="listitem">
                <p>
Install the Japanese input tool package <code class="literal">ibus-anthy</code> with its recommended packages such as <code class="literal">im-config</code>.
</p>
              </li>
              <li class="listitem">
                <p>
Execute "<code class="literal">im-config</code>" from the user's shell and select "<code class="literal">ibus</code>" as the input method.
</p>
              </li>
              <li class="listitem">
                <p>
Select "Settings" → "Keyboard" → "Input Sources" → click "<code class="literal">+</code>" in "Input Sources" → "Japanese" → "Japanese (anthy)" and click "Add".
</p>
              </li>
              <li class="listitem">
                <p>
Select "Japanese" and click "Add" to support the Japanese layout keyboard without character conversion. (You may choose as many input sources as you like.)
</p>
              </li>
              <li class="listitem">
                <p>
Log out and log in to the user's account again.
</p>
              </li>
              <li class="listitem">
                <p>
Verify the setting with "<code class="literal">im-config</code>".
</p>
              </li>
              <li class="listitem">
                <p>
Set up the input source by right-clicking the GUI toolbar icon.
</p>
              </li>
              <li class="listitem">
                <p>
Switch among the installed input sources with SUPER-SPACE. (SUPER is normally the Windows key.)
</p>
              </li>
            </ol>
          </div>
          <p>Please note the following.</p>
          <div class="itemizedlist">
            <ul class="itemizedlist">
              <li class="listitem">
                <p><span class="citerefentry"><span class="refentrytitle">im-config</span>(8)</span> behaves differently depending on whether the command is executed as root.
</p>
              </li>
              <li class="listitem">
                <p><span class="citerefentry"><span class="refentrytitle">im-config</span>(8)</span> enables the best input method on the system by default without any user action.
</p>
              </li>
              <li class="listitem">
                <p>
The GUI menu entry for <span class="citerefentry"><span class="refentrytitle">im-config</span>(8)</span> is disabled by default to prevent cluttering.
</p>
              </li>
            </ul>
          </div>
        </div>
        <div class="section">
          <div class="titlepage">
            <div>
              <div>
                <h3 class="title"><a id="_disabling_the_input_method"/>8.1.3. Disabling the input method</h3>
              </div>
            </div>
          </div>
          <p>If you wish to input without going through XIM, set the "<code class="literal">$XMODIFIERS</code>" value to "none" when starting a program. This may be the case if you use the Japanese input infrastructure <code class="literal">egg</code> on <span class="citerefentry"><span class="refentrytitle">emacs</span>(1)</span>. From the shell, execute the following.</p>
          <pre class="screen">$ XMODIFIERS=none emacs</pre>
          <p>In order to adjust the command executed by the Debian menu, place a customized configuration in "<code class="literal">/etc/menu/</code>" following the method described in "<code class="literal">/usr/share/doc/menu/html</code>".</p>
        </div>
      </div>
      <div class="section">
        <div class="titlepage">
          <div>
            <div>
              <h2 class="title"><a id="_the_display_output"/>8.2. The display output</h2>
            </div>
          </div>
        </div>
        <p>The Linux console can display only a limited set of characters.  (You need to use a special terminal program such as <span class="citerefentry"><span class="refentrytitle">jfbterm</span>(1)</span> to display non-European languages on the non-X console.)</p>
        <p>X Window can display any character in UTF-8 as long as the required font data exists. (The encoding of the original font data is taken care of by the X Window System and is transparent to the user.)</p>
      </div>
      <div class="section">
        <div class="titlepage">
          <div>
            <div>
              <h2 class="title"><a id="_the_locale"/>8.3. The locale</h2>
            </div>
          </div>
        </div>
        <p>The following focuses on the locale for applications run under the X Window environment started from <span class="citerefentry"><span class="refentrytitle">gdm3</span>(1)</span>.</p>
        <div class="section">
          <div class="titlepage">
            <div>
              <div>
                <h3 class="title"><a id="_basics_of_encoding"/>8.3.1. Basics of encoding</h3>
              </div>
            </div>
          </div>
          <p>The environment variable "<code class="literal">LANG=xx_YY.ZZZZ</code>" sets the locale to language code "<code class="literal">xx</code>", country code "<code class="literal">YY</code>", and encoding "<code class="literal">ZZZZ</code>" (see <a class="xref" href="ch01.en.html#_the_literal_lang_literal_variable" title="1.5.2. The &quot;$LANG&quot; variable">Section 1.5.2, “The "<code class="literal">$LANG</code>" variable”</a>).</p>
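          <p>As a minimal sketch of how the three parts fit together, the following POSIX shell fragment splits a locale name with parameter expansion. The variable names and the sample value "<code class="literal">fr_FR.UTF-8</code>" are illustrative assumptions, and the fragment assumes the name contains both an underscore and a dot.</p>

```shell
# Hypothetical sample value; substitute "$LANG" to inspect your own setting.
locale_name="fr_FR.UTF-8"

lang_region=${locale_name%%.*}   # drop the encoding suffix: fr_FR
encoding=${locale_name#*.}       # keep what follows the first dot: UTF-8
language=${lang_region%%_*}      # part before the underscore: fr
country=${lang_region#*_}        # part after the underscore: FR

printf 'language=%s country=%s encoding=%s\n' "$language" "$country" "$encoding"
```

          <p>Names without an explicit encoding, such as "<code class="literal">C</code>", would need extra handling that this sketch leaves out.</p>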
          <p>The current Debian system normally sets the locale as "<code class="literal">LANG=xx_YY.UTF-8</code>".  This uses the <a class="ulink" href="http://en.wikipedia.org/wiki/UTF-8">UTF-8</a> encoding with the <a class="ulink" href="http://en.wikipedia.org/wiki/Unicode">Unicode</a> character set. <a class="ulink" href="http://en.wikipedia.org/wiki/UTF-8">UTF-8</a> is a variable-length multibyte encoding: it uses 1 byte for each ASCII character and 2 to 4 bytes for each other character. <a class="ulink" href="http://en.wikipedia.org/wiki/ASCII">ASCII</a> data, which consists only of 7-bit codes, is therefore always valid UTF-8 data with 1 byte per character.</p>
          <p>Earlier Debian systems used to set the locale as "<code class="literal">LANG=C</code>" or "<code class="literal">LANG=xx_YY</code>" (without "<code class="literal">.UTF-8</code>").</p>
          <div class="itemizedlist">
            <ul class="itemizedlist">
              <li class="listitem">
                <p>
The <a class="ulink" href="http://en.wikipedia.org/wiki/ASCII">ASCII</a> character set is used for "<code class="literal">LANG=C</code>" or "<code class="literal">LANG=POSIX</code>".
</p>
              </li>
              <li class="listitem">
                <p>
The traditional encoding system in Unix is used for "<code class="literal">LANG=xx_YY</code>".
</p>
              </li>
            </ul>
          </div>
          <p>The actual traditional encoding system used for "<code class="literal">LANG=xx_YY</code>" can be identified by checking "<code class="literal">/usr/share/i18n/SUPPORTED</code>".  For example, "<code class="literal">en_US</code>" uses the "<code class="literal">ISO-8859-1</code>" encoding and "<code class="literal">fr_FR@euro</code>" uses the "<code class="literal">ISO-8859-15</code>" encoding.</p>
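          <p>The lookup can be sketched as a small shell function. The function name <code class="literal">lookup_encoding</code> is hypothetical, and the sketch assumes each line of the file holds a locale name followed by its encoding, separated by whitespace.</p>

```shell
# lookup_encoding NAME [FILE]: print the encoding listed for locale NAME.
# FILE defaults to the path named in the text; the two-column layout
# (locale name, then encoding) is an assumption of this sketch.
lookup_encoding() {
    awk -v name="$1" '$1 == name { print $2 }' "${2:-/usr/share/i18n/SUPPORTED}"
}

# Query the real file only when it is readable on this system.
if [ -r /usr/share/i18n/SUPPORTED ]; then
    lookup_encoding en_US
    lookup_encoding fr_FR@euro
fi
```

          <p>Matching on the whole first field rather than on a prefix keeps "<code class="literal">en_US</code>" from also matching "<code class="literal">en_US.UTF-8</code>".</p>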
          <div class="tip" style="margin-left: 0.5in; margin-right: 0.5in;">
            <table border="0" summary="Tip">
              <tr>
                <td rowspan="2" align="center" valign="top">
                  <img alt="[Tip]" src="images/tip.png"/>
                </td>
                <th align="left">Tip</th>
              </tr>
              <tr>
                <td align="left" valign="top">
                  <p>For the meaning of encoding values, see <a class="xref" href="ch11.en.html#list-of-encoding-values" title="Table 11.2. List of encoding values and their usage">Table 11.2, “List of encoding values and their usage”</a>.</p>
                </td>
              </tr>
            </table>
          </div>
        </div>
        <div class="section">
          <div class="titlepage">
            <div>
              <div>
                <h3 class="title"><a id="_rationale_for_utf_8_locale"/>8.3.2. Rationale for UTF-8 locale</h3>
              </div>
            </div>
          </div>
          <p>The <a class="ulink" href="http://en.wikipedia.org/wiki/UTF-8">UTF-8</a> encoding is the modern and sensible text encoding system for I18N and enables to represent <a class="ulink" href="http://en.wikipedia.org/wiki/Unicode">Unicode</a> characters, i.e., practically all characters known to human. <span class="strong"><strong>UTF</strong></span> stands for Unicode Transformation Format (UTF) schemes.</p>
          <p>I recommend to use <a class="ulink" href="http://en.wikipedia.org/wiki/UTF-8">UTF-8</a> locale for your desktop, e.g.,  "<code class="literal">LANG=en_US.UTF-8</code>".  The first part of the locale determines messages presented by applications.  For example, <span class="citerefentry"><span class="refentrytitle">gedit</span>(1)</span> (text editor for the GNOME Desktop) under "<code class="literal">LANG=fr_FR.UTF-8</code>" locale can display and edit Chinese character text data while presenting menus in French, as long as required fonts and input methods are installed.</p>
          <p>I also recommend setting the locale using only the "<code class="literal">$LANG</code>" environment variable. I do not see much benefit in setting a complicated combination of "<code class="literal">LC_*</code>" variables (see <span class="citerefentry"><span class="refentrytitle">locale</span>(1)</span>) under a UTF-8 locale.</p>
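          <p>One reason a single "<code class="literal">$LANG</code>" is usually enough is the lookup order that POSIX defines for each locale category: "<code class="literal">LC_ALL</code>" first, then the category's own "<code class="literal">LC_*</code>" variable, then "<code class="literal">$LANG</code>".  The following is only a minimal shell sketch of that order.  The function name <code class="literal">effective_locale</code> is hypothetical and not part of any package, and the final "<code class="literal">POSIX</code>" fallback is an assumption about the behavior when nothing is set.</p>

```shell
# effective_locale CATEGORY
# Print the locale name that applies to CATEGORY (e.g. LC_TIME),
# following the POSIX precedence: LC_ALL, then CATEGORY, then LANG.
# "effective_locale" is a hypothetical helper name for this sketch.
effective_locale() {
    case $1 in
        *[!A-Z_]*|'') return 2 ;;        # accept only names made of A-Z and "_"
    esac
    eval "category_value=\${$1:-}"       # read the variable named by $1
    if [ -n "${LC_ALL:-}" ]; then
        echo "$LC_ALL"
    elif [ -n "$category_value" ]; then
        echo "$category_value"
    elif [ -n "${LANG:-}" ]; then
        echo "$LANG"
    else
        echo "POSIX"                     # assumed fallback when nothing is set
    fi
}

# Show which locale currently governs date and time formatting.
effective_locale LC_TIME
```

          <p>If only "<code class="literal">$LANG</code>" is set, every category resolves to the same value, which is what keeps the configuration simple.</p>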
          <p>Even plain English text may contain non-ASCII characters; for example, left and right quotation marks are not available in ASCII.</p>
          <pre class="screen">“double quoted text”
‘single quoted text’</pre>
          <p>When <a class="ulink" href="http://en.wikipedia.org/wiki/ASCII">ASCII</a> plain text data is converted to <a class="ulink" href="http://en.wikipedia.org/wiki/UTF-8">UTF-8</a>, it has exactly the same content and size as the original ASCII data.  So you lose nothing by deploying a UTF-8 locale.</p>
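          <p>The size claim can be checked with a short shell sketch, assuming <span class="citerefentry"><span class="refentrytitle">iconv</span>(1)</span> is available.  The sample string is hypothetical; any pure-ASCII text should behave the same way.  For contrast, the sketch also counts the bytes of one left double quotation mark in UTF-8.</p>

```shell
# Hypothetical sample string; any pure-ASCII text behaves the same way.
ascii_text='plain ASCII text'

# Byte count of the original ASCII data.
ascii_bytes=$(printf '%s' "$ascii_text" | wc -c)

# Byte count after converting the same data to UTF-8.
utf8_bytes=$(printf '%s' "$ascii_text" | iconv -f ASCII -t UTF-8 | wc -c)

# A left double quotation mark (U+201C) is not ASCII; in UTF-8 it is
# the three bytes E2 80 9C, written here as octal escapes.
quote_bytes=$(printf '\342\200\234' | wc -c)

echo "ASCII: $ascii_bytes bytes, UTF-8: $utf8_bytes bytes, quote: $quote_bytes bytes"
```

          <p>The first two counts are equal because UTF-8 encodes every ASCII character as the same single byte.</p>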
          <p>Some programs consume more memory after adding I18N support.  This is because they are coded to use <a class="ulink" href="http://en.wikipedia.org/wiki/UTF-32/UCS-4">UTF-32(UCS4)</a> internally to support Unicode for speed optimization, and so consume 4 bytes for each ASCII character regardless of the locale selected.  Again, you lose nothing by deploying a UTF-8 locale.</p>
          <p>Older vendor-specific non-UTF-8 encoding systems tend to have minor but annoying differences in some characters, such as graphic ones, for many countries.  The deployment of UTF-8 by modern OSs has practically resolved these conflicting encoding issues.</p>
        </div>
        <div class="section">
          <div class="titlepage">
            <div>
              <div>
                <h3 class="title"><a id="_the_reconfiguration_of_the_locale"/>8.3.3. The reconfiguration of the locale</h3>
              </div>
            </div>
          </div>
          <p>In order for the system to access a particular locale, the locale data must be compiled from the locale database. (The Debian system does <span class="strong"><strong>not</strong></span> come with all available locales pre-compiled unless you installed the <code class="literal">locales-all</code> package.) The full list of supported locales available for compiling is in "<code class="literal">/usr/share/i18n/SUPPORTED</code>", which gives all the proper locale names.  The following command lists all the UTF-8 locales already compiled to the binary form.</p>
          <pre class="screen">$ locale -a | grep utf8</pre>
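          <p>To check for one particular locale rather than scan the whole list, something like the following sketch may help.  The function name <code class="literal">is_locale_compiled</code> is hypothetical.  It assumes that "<code class="literal">locale -a</code>" prints the codeset in the normalized form seen above ("<code class="literal">utf8</code>"), so both sides are lowercased and stripped of hyphens before comparing.</p>

```shell
# is_locale_compiled NAME
# Succeed if NAME is listed by "locale -a". Names are compared in
# lowercase and with hyphens removed, because "locale -a" prints
# "en_US.utf8" for the locale usually written "en_US.UTF-8".
# "is_locale_compiled" is a hypothetical helper name for this sketch.
is_locale_compiled() {
    wanted=$(printf '%s\n' "$1" | tr -d '-' | tr 'A-Z' 'a-z')
    locale -a | tr -d '-' | tr 'A-Z' 'a-z' | grep -qxF "$wanted"
}

if is_locale_compiled en_US.UTF-8; then
    echo "en_US.UTF-8 is ready to use"
else
    echo "en_US.UTF-8 must be compiled first"
fi
```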
          <p>The following command reconfigures the <code class="literal">locales</code> package.</p>
          <pre class="screen"># dpkg-reconfigure locales</pre>
          <p>This process involves 3 steps.</p>
          <div class="orderedlist">
            <ol class="orderedlist">
              <li class="listitem">
                <p>
Update the list of available locales
</p>
              </li>
              <li class="listitem">
                <p>
Compile them into the binary form
</p>
              </li>
              <li class="listitem">
                <p>
Set the system wide default locale value in the "<code class="literal">/etc/default/locale</code>" for use by PAM (see <a class="xref" href="ch04.en.html#_pam_and_nss" title="4.5. PAM and NSS">Section 4.5, “PAM and NSS”</a>)
</p>
              </li>
            </ol>
          </div>
          <p>The list of available locales should include "<code class="literal">en_US.UTF-8</code>" and all the languages of interest with "<code class="literal">UTF-8</code>".</p>
          <p>The recommended default locale is "<code class="literal">en_US.UTF-8</code>" for US English.  For other languages, please make sure to choose a locale with "<code class="literal">UTF-8</code>".  Any one of these settings can handle any international characters.</p>
          <div class="note" style="margin-left: 0.5in; margin-right: 0.5in;">
            <table border="0" summary="Note">
              <tr>
                <td rowspan="2" align="center" valign="top">
                  <img alt="[Note]" src="images/note.png"/>
                </td>
                <th align="left">Note</th>
              </tr>
              <tr>
                <td align="left" valign="top">
                  <p>Although setting the locale to "<code class="literal">C</code>" gives US English messages, it handles only ASCII characters.</p>
                </td>
              </tr>
            </table>
          </div>
        </div>
        <div class="section">
          <div class="titlepage">
            <div>
              <div>
                <h3 class="title"><a id="_the_value_of_the_literal_lang_literal_environment_variable"/>8.3.4. The value of the "<code class="literal">$LANG</code>" environment variable</h3>
              </div>
            </div>
          </div>
          <p>The value of the "<code class="literal">$LANG</code>" environment variable is set and changed by many applications.</p>
          <div class="itemizedlist">
            <ul class="itemizedlist">
              <li class="listitem">
                <p>
Set initially by the PAM mechanism of <span class="citerefentry"><span class="refentrytitle">login</span>(1)</span> for the local Linux console programs
</p>
              </li>
              <li class="listitem">
                <p>
Set initially by the PAM mechanism of the display manager for all X programs
</p>
              </li>
              <li class="listitem">
                <p>
Set initially by the PAM mechanism of <span class="citerefentry"><span class="refentrytitle">ssh</span>(1)</span> for the remote console programs
</p>
              </li>
              <li class="listitem">
                <p>
Changed by some display managers such as <span class="citerefentry"><span class="refentrytitle">gdm3</span>(1)</span> for all X programs
</p>
              </li>
              <li class="listitem">
                <p>
Changed by the X session startup code via "<code class="literal">~/.xsessionrc</code>" for all X programs (<code class="literal">lenny</code> feature)
</p>
              </li>
              <li class="listitem">
                <p>
Changed by the shell startup code, e.g. "<code class="literal">~/.bashrc</code>", for all console programs
</p>
              </li>
            </ul>
          </div>
          <div class="tip" style="margin-left: 0.5in; margin-right: 0.5in;">
            <table border="0" summary="Tip">
              <tr>
                <td rowspan="2" align="center" valign="top">
                  <img alt="[Tip]" src="images/tip.png"/>
                </td>
                <th align="left">Tip</th>
              </tr>
              <tr>
                <td align="left" valign="top">
                  <p>It is a good idea to set the system-wide default locale to "<code class="literal">en_US.UTF-8</code>" for maximum compatibility.</p>
                </td>
              </tr>
            </table>
          </div>
        </div>
        <div class="section">
          <div class="titlepage">
            <div>
              <div>
                <h3 class="title"><a id="_specific_locale_only_under_x_window"/>8.3.5. Specific locale only under X Window</h3>
              </div>
            </div>
          </div>
          <p>You can choose a specific locale only under X Window, irrespective of your system-wide default locale, using PAM customization (see <a class="xref" href="ch04.en.html#_pam_and_nss" title="4.5. PAM and NSS">Section 4.5, “PAM and NSS”</a>) as follows.</p>
          <p>This environment should provide you with your best desktop experience with stability.  You have access to the functioning character terminal with readable messages even when the X Window System is not working.  This becomes essential for languages which use non-roman characters such as Chinese, Japanese, and Korean.</p>
          <div class="note" style="margin-left: 0.5in; margin-right: 0.5in;">
            <table border="0" summary="Note">
              <tr>
                <td rowspan="2" align="center" valign="top">
                  <img alt="[Note]" src="images/note.png"/>
                </td>
                <th align="left">Note</th>
              </tr>
              <tr>
                <td align="left" valign="top">
                  <p>Improvements to the X session manager packages may provide other ways to do this, but please read the following as the generic and basic method of setting the locale.  For <span class="citerefentry"><span class="refentrytitle">gdm3</span>(1)</span>, you can select the locale of the X session via its menu.</p>
                </td>
              </tr>
            </table>
          </div>
          <p>The following line defines the file location of the language environment in the PAM configuration file, such as "<code class="literal">/etc/pam.d/gdm3</code>".</p>
          <pre class="screen">auth    required        pam_env.so read_env=1 envfile=/etc/default/locale</pre>
          <p>Change this to the following.</p>
          <pre class="screen">auth    required        pam_env.so read_env=1 envfile=/etc/default/locale-x</pre>
          <p>For Japanese, create a "<code class="literal">/etc/default/locale-x</code>" file with "<code class="literal">-rw-r--r-- 1 root root</code>" permissions containing the following.</p>
          <pre class="screen">LANG="ja_JP.UTF-8"</pre>
          <p>Keep the default "<code class="literal">/etc/default/locale</code>" file for other programs as follows.</p>
          <pre class="screen">LANG="en_US.UTF-8"</pre>
          <p>This is the most generic technique to customize the locale, and it localizes the menu selection dialog of <span class="citerefentry"><span class="refentrytitle">gdm3</span>(1)</span> itself.</p>
          <p>Alternatively for this case, you may simply change the locale using the "<code class="literal">~/.xsessionrc</code>" file.</p>
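          <p>As a minimal sketch of that alternative, the "<code class="literal">~/.xsessionrc</code>" file could contain something like the following.  It assumes that the "<code class="literal">ja_JP.UTF-8</code>" locale has already been compiled and that the file is read by the X session startup code as plain shell, so the portable two-step form of export is used.</p>

```shell
# Example content for ~/.xsessionrc (sketch).
# Assumes the ja_JP.UTF-8 locale has already been compiled.
LANG=ja_JP.UTF-8
export LANG
```

          <p>Unlike the PAM change, this affects only the user who owns the file and does not localize the display manager's own dialogs.</p>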
        </div>
        <div class="section">
          <div class="titlepage">
            <div>
              <div>
                <h3 class="title"><a id="_filename_encoding"/>8.3.6. Filename encoding</h3>
              </div>
            </div>
          </div>
          <p>For cross-platform data exchanges (see <a class="xref" href="ch10.en.html#_removable_storage_device" title="10.1.7. Removable storage device">Section 10.1.7, “Removable storage device”</a>), you may need to mount some filesystems with particular encodings.  For example, <span class="citerefentry"><span class="refentrytitle">mount</span>(8)</span> for the <a class="ulink" href="http://en.wikipedia.org/wiki/File_Allocation_Table">vfat filesystem</a> assumes <a class="ulink" href="http://en.wikipedia.org/wiki/Code_page_437">CP437</a> if used without options.  You need to provide
an explicit mount option to use <a class="ulink" href="http://en.wikipedia.org/wiki/UTF-8">UTF-8</a> or <a class="ulink" href="http://en.wikipedia.org/wiki/Code_page_932">CP932</a> for filenames.</p>
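          <p>The following sketch shows what such an explicit option might look like, using the "<code class="literal">utf8</code>" and "<code class="literal">codepage</code>" options documented for FAT filesystems in <span class="citerefentry"><span class="refentrytitle">mount</span>(8)</span>.  The device name "<code class="literal">/dev/sdX1</code>" and the mount point "<code class="literal">/mnt</code>" are placeholders, and the command is only printed, not run, since mounting requires root privileges.</p>

```shell
# Hypothetical device and mount point; adjust them to your system.
device=/dev/sdX1
mount_point=/mnt

# "utf8" asks the vfat driver to present file names in UTF-8.
mount_options='utf8'
# For Japanese short (8.3) names, the FAT codepage can be set as well:
# mount_options='utf8,codepage=932'

# Build and print the command instead of running it (it needs root).
mount_command="mount -t vfat -o $mount_options $device $mount_point"
echo "$mount_command"
```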
          <div class="note" style="margin-left: 0.5in; margin-right: 0.5in;">
            <table border="0" summary="Note">
              <tr>
                <td rowspan="2" align="center" valign="top">
                  <img alt="[Note]" src="images/note.png"/>
                </td>
                <th align="left">Note</th>
              </tr>
              <tr>
                <td align="left" valign="top">
                  <p>When auto-mounting a hot-pluggable USB memory stick under a modern desktop environment such as GNOME, you may provide such a mount option by right-clicking the icon on the desktop, clicking the "Drive" tab, clicking to expand "Setting", and entering "utf8" in "Mount options:".  The next time this memory stick is mounted, mounting with UTF-8 is enabled.</p>
                </td>
              </tr>
            </table>
          </div>
          <div class="note" style="margin-left: 0.5in; margin-right: 0.5in;">
            <table border="0" summary="Note">
              <tr>
                <td rowspan="2" align="center" valign="top">
                  <img alt="[Note]" src="images/note.png"/>
                </td>
                <th align="left">Note</th>
              </tr>
              <tr>
                <td align="left" valign="top">
                  <p>If you are upgrading a system or moving disk drives from an older non-UTF-8 system, file names with non-ASCII characters may be encoded in historic and deprecated encodings such as <a class="ulink" href="http://en.wikipedia.org/wiki/ISO/IEC_8859-1">ISO-8859-1</a> or <a class="ulink" href="http://en.wikipedia.org/wiki/Extended_Unix_Code">eucJP</a>.  Please use text conversion tools to convert them to <a class="ulink" href="http://en.wikipedia.org/wiki/UTF-8">UTF-8</a>. See <a class="xref" href="ch11.en.html#_text_data_conversion_tools" title="11.1. Text data conversion tools">Section 11.1, “Text data conversion tools”</a>.</p>
                </td>
              </tr>
            </table>
          </div>
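          <p>To illustrate what such a conversion does to a file name, here is a small sketch that assumes <span class="citerefentry"><span class="refentrytitle">iconv</span>(1)</span> is available.  The name "café" is a made-up example.  The sketch only converts the name as a string and does not rename any file.</p>

```shell
# The name "café" as stored on an old ISO-8859-1 system:
# the final byte 0xE9 (octal 351) is "é" in ISO-8859-1.
old_name=$(printf 'caf\351')

# Re-encode the name to UTF-8; "é" becomes the two bytes C3 A9.
new_name=$(printf '%s' "$old_name" | iconv -f ISO-8859-1 -t UTF-8)

printf '%s\n' "$new_name"
```

          <p>The source encoding must be known in advance, because a legacy-encoded name carries no marker that identifies its encoding.</p>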
          <p><a class="ulink" href="http://en.wikipedia.org/wiki/Samba_(software)">Samba</a> uses Unicode for newer clients (Windows NT, 200x, XP) but uses <a class="ulink" href="http://en.wikipedia.org/wiki/Code_page_850">CP850</a> for older clients (DOS and Windows 9x/Me) by default.  This default for older clients can be changed using "<code class="literal">dos charset</code>" in the "<code class="literal">/etc/samba/smb.conf</code>" file, e.g., to <a class="ulink" href="http://en.wikipedia.org/wiki/Code_page_932">CP932</a> for Japanese.</p>
        </div>
        <div class="section">
          <div class="titlepage">
            <div>
              <div>
                <h3 class="title"><a id="_localized_messages_and_translated_documentation"/>8.3.7. Localized messages and translated documentation</h3>
              </div>
            </div>
          </div>
          <p>Translations exist for many of the text messages and documents that are displayed in the Debian system, such as error messages, standard program output, menus, and manual pages.  The <a class="ulink" href="http://en.wikipedia.org/wiki/Gettext">GNU gettext(1) command tool chain</a> is used as the backend tool for most translation activities.</p>
          <p>The <span class="citerefentry"><span class="refentrytitle">aptitude</span>(8)</span> listing under "Tasks" → "Localization" provides an extensive list of useful binary packages which add localized messages to applications and provide translated documentation.</p>
          <p>For example, you can obtain localized manpages by installing the <code class="literal">manpages-&lt;LANG&gt;</code> package. To read the Italian-language manpage for &lt;programname&gt; from "<code class="literal">/usr/share/man/it/</code>", execute the following.</p>
          <pre class="screen">LANG=it_IT.UTF-8 man &lt;programname&gt;</pre>
        </div>
        <div class="section">
          <div class="titlepage">
            <div>
              <div>
                <h3 class="title"><a id="_effects_of_the_locale"/>8.3.8. Effects of the locale</h3>
              </div>
            </div>
          </div>
          <p>The sort order of characters with <span class="citerefentry"><span class="refentrytitle">sort</span>(1)</span> is affected by the language choice of the locale. Spanish and English locales sort differently.</p>
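          <p>The effect is easy to observe with a few sample words.  The sketch below sorts the same input under the "<code class="literal">C</code>" locale and under "<code class="literal">en_US.UTF-8</code>".  Only the "<code class="literal">C</code>" result ("<code class="literal">A B a b</code>") is predictable here; the second result depends on the collation rules of the locale and on whether that locale has been compiled on your system.</p>

```shell
# Sort four sample words under the C locale, which compares raw byte
# values, so all uppercase ASCII letters come before lowercase ones.
c_sorted=$(printf '%s\n' b A a B | LC_ALL=C sort | paste -s -d ' ' -)
echo "C locale: $c_sorted"

# The same input under a language locale. The result depends on the
# locale's collation rules and on whether the locale has been compiled.
lang_sorted=$(printf '%s\n' b A a B | LC_ALL=en_US.UTF-8 sort | paste -s -d ' ' -)
echo "en_US.UTF-8: $lang_sorted"
```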
          <p>The date format of <span class="citerefentry"><span class="refentrytitle">ls</span>(1)</span> is affected by the locale.  The date formats of "<code class="literal">LANG=C ls -l</code>" and "<code class="literal">LANG=en_US.UTF-8 ls -l</code>" are different (see <a class="xref" href="ch09.en.html#_customized_display_of_time_and_date" title="9.2.5. Customized display of time and date">Section 9.2.5, “Customized display of time and date”</a>).</p>
          <p>Number punctuation differs between locales.  For example, in the English locale, one thousand point one is displayed as "<code class="literal">1,000.1</code>" while in the German locale, it is displayed as "<code class="literal">1.000,1</code>".  You may see this difference in spreadsheet programs.</p>
        </div>
      </div>
    </div>
    <div class="navfooter">
      <hr/>
      <table width="100%" summary="Navigation footer">
        <tr>
          <td align="left"><a accesskey="p" href="ch07.en.html"><img src="images/prev.gif" alt="Prev"/></a> </td>
          <td align="center"> </td>
          <td align="right"> <a accesskey="n" href="ch09.en.html"><img src="images/next.gif" alt="Next"/></a></td>
        </tr>
        <tr>
          <td align="left" valign="top">Chapter 7. The X Window System </td>
          <td align="center">
            <a accesskey="h" href="index.en.html">
              <img src="images/home.gif" alt="Home"/>
            </a>
          </td>
          <td align="right" valign="top"> Chapter 9. System tips</td>
        </tr>
      </table>
    </div>
  </body>
</html>