This file is indexed.

/usr/share/help/gl/gnome-devel-demos/strings.py.page is in gnome-devel-docs 3.28.0-1.

This file is owned by root:root, with mode 0o644.

The actual contents of the file can be viewed below.

  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
<?xml version="1.0" encoding="utf-8"?>
<page xmlns="http://projectmallard.org/1.0/" xmlns:its="http://www.w3.org/2005/11/its" xmlns:e="http://projectmallard.org/experimental/" type="guide" style="task" id="strings.py" xml:lang="gl">

<info>
  <title type="text">Strings (Python)</title>
  <link type="guide" xref="beginner.py#theory"/>
  <link type="next" xref="label.py"/>
  <revision version="0.1" date="2012-06-16" status="draft"/>

  <desc>An explanation of how to deal with strings in Python and GTK+.</desc>
  <credit type="author copyright">
    <name>Sebastian Pölsterl</name>
    <email its:translate="no">sebp@k-d-w.org</email>
    <years>2011</years>
  </credit>
  <credit type="editor">
    <name>Marta Maria Casetti</name>
    <email its:translate="no">mmcasetti@gmail.com</email>
    <years>2012</years>
  </credit>

    <mal:credit xmlns:mal="http://projectmallard.org/1.0/" type="translator copyright">
      <mal:name>Fran Dieguez</mal:name>
      <mal:email>frandieguez@gnome.org</mal:email>
      <mal:years>2012-2013.</mal:years>
    </mal:credit>
  </info>

<title>Strings</title>

<links type="section"/>

<note style="warning"><p>GNOME strongly encourages the use of Python 3 for writing applications!</p></note>

<section id="python-2">
<title>Strings in Python 2</title>

<p>Python 2 comes with two different kinds of objects that can be used to represent strings, <code>str</code> and <code>unicode</code>. Instances of <code>unicode</code> are used to express Unicode strings, whereas instances of the <code>str</code> type are byte representations (the encoded string). Under the hood, Python represents Unicode strings as either 16- or 32-bit integers, depending on how the Python interpreter was compiled.</p>

<code><![CDATA[
>>> unicode_string = u"Fu\u00dfb\u00e4lle"
>>> print unicode_string]]>
Fußbälle
</code>

<p>Unicode strings can be converted to 8-bit strings with <code>unicode.encode()</code>. Python’s 8-bit strings have a <code>str.decode()</code> method that interprets the string using the given encoding (that is, it is the inverse of the <code>unicode.encode()</code>):</p>

<code><![CDATA[
>>> type(unicode_string)
<type 'unicode'>
>>> unicode_string.encode("utf-8")
'Fu\xc3\x9fb\xc3\xa4lle'
>>> utf8_string = unicode_string.encode("utf-8")
>>> type(utf8_string)
<type 'str'>
>>> unicode_string == utf8_string.decode("utf-8")
True]]></code>

<p>Unfortunately, Python 2.x allows you to mix <code>unicode</code> and <code>str</code> if the 8-bit string happened to contain only 7-bit (ASCII) bytes, but would get <sys>UnicodeDecodeError</sys> if it contained non-ASCII values.</p>

</section>

<section id="python-3">
<title>Cadeas en Python 3</title>

<p>Since Python 3.0, all strings are stored as Unicode in an instance of the <code>str</code> type. Encoded strings on the other hand are represented as binary data in the form of instances of the bytes type. Conceptually, <code>str</code> refers to text, whereas bytes refers to data. Use <code>encode()</code> to go from <code>str</code> to <code>bytes</code>, and <code>decode()</code> to go from <code>bytes</code> to <code>str</code>.</p>

<p>In addition, it is no longer possible to mix Unicode strings with encoded strings, because it will result in a <code>TypeError</code>:</p>

<code><![CDATA[
>>> text = "Fu\u00dfb\u00e4lle"
>>> data = b" sind rund"
>>> text + data
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: Can't convert 'bytes' object to str implicitly
>>> text + data.decode("utf-8")
'Fußbälle sind rund'
>>> text.encode("utf-8") + data
b'Fu\xc3\x9fb\xc3\xa4lle sind rund']]></code>

</section>

<section id="gtk">
<title>Unicode en GTK+</title>

<p>GTK+ uses UTF-8 encoded strings for all text. This means that if you call a method that returns a string you will always obtain an instance of the <code>str</code> type. The same applies to methods that expect one or more strings as parameter, they must be UTF-8 encoded. However, for convenience PyGObject will automatically convert any unicode instance to str if supplied as argument:</p>

<code><![CDATA[
>>> from gi.repository import Gtk
>>> label = Gtk.Label()
>>> unicode_string = u"Fu\u00dfb\u00e4lle"
>>> label.set_text(unicode_string)
>>> txt = label.get_text()
>>> type(txt)
<type 'str'>]]></code>

<p>Aínda máis:</p>

<code><![CDATA[
>>> txt == unicode_string]]></code>

<p>would return <code>False</code>, with the warning <code>__main__:1: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal</code> (<code>Gtk.Label.get_text()</code> will always return a <code>str</code> instance; therefore, <code>txt</code> and <code>unicode_string</code> are not equal).</p>

<p>This is especially important if you want to internationalize your program using <link href="http://docs.python.org/library/gettext.html"><code>gettext</code></link>. You have to make sure that <code>gettext</code> will return UTF-8 encoded 8-bit strings for all languages.</p>

<p>In general it is recommended to not use <code>unicode</code> objects in GTK+ applications at all, and only use UTF-8 encoded <code>str</code> objects since GTK+ does not fully integrate with <code>unicode</code> objects.</p>

<p>String encoding is more consistent in Python 3.x because PyGObject will automatically encode/decode to/from UTF-8 if you pass a string to a method or a method returns a string. Strings, or text, will always be represented as instances of <code>str</code> only.</p>

</section>

<section id="references">
<title>References</title>

<p><link href="http://python-gtk-3-tutorial.readthedocs.org/en/latest/unicode.html">How To Deal With Strings - The Python GTK+ 3 Tutorial</link></p>

</section>

</page>