<html> <head> <title>GNOME Speech</title> </head> <body> <h1>Section
1: Introduction</h1> <h2>1.1 Overview</h2>
<p>
	GNOME Speech aims to be a general interface to various
	text-to-speech engines for the GNOME desktop.  It allows the
	simple speaking of text, as well as control over various
	speech parameters such as speech pitch, rate, and volume.  It
	uses ORBit2 and Bonobo to facilitate the location and
	activation of, and communication with, the various speech
	drivers.
</p>
<h2>1.2 Justification for GNOME Speech</h2>
<p>
	There are many different text-to-speech hardware and software
	products currently available.  Some text-to-speech synthesizers
	are software libraries against which an application can link and
	whose functions it calls to produce speech.  Some text-to-speech
	engines are hardware devices, to which commands and text to be
	spoken are sent over a serial, USB, or parallel port.  Still
	others are applications to which text and commands can be piped.
	In addition, although there are standard markup languages for
	embedding commands that change speech parameters within text, not
	all engines support the same languages, and some support no
	markup language at all.
</p>
<p>
	It is for these reasons that a standard API for communicating
	with various text-to-speech engines is needed.  This is where
	GNOME Speech becomes useful.  It hides the differences in the
	implementations, APIs, and markup used by the various engines
	by defining an API that accommodates all the standard features
	of most speech engines, and some of the more obscure features
	supported by some engines.  GNOME Speech driver
	implementations proxy the standard API, which is defined in
	IDL, to the various commands and markup language of a
	particular engine.
</p>
<p>
	This drastically reduces the development time required for
	applications that want to produce speech with a wide variety of
	engines.  The application developer no longer needs to focus on the
	internals of individual speech engines, but can instead concentrate
	on the core purpose of the application and interface with multiple
	engines through the single GNOME Speech API.  Other operating
	systems, including Microsoft Windows and Mac OS, provide a speech
	API that in many cases supports both text-to-speech and voice
	recognition.  GNOME Speech aims to eventually provide a similar
	speech API for the GNOME desktop.  The initial version of GNOME
	Speech supports only text-to-speech, but work is currently underway
	to define a new GNOME Speech API that will support both
	text-to-speech and voice recognition (see section 5).
</p>
<h2>1.3 Sample Uses of GNOME Speech</h2>
<p>
	GNOME Speech was originally designed to meet the requirements of
	the Gnopernicus project, which aims to provide a full-featured
	screen reader for GNOME.  Gnopernicus, which falls under the
	general umbrella of the GNOME Accessibility Project, provides
	speech and Braille feedback to blind and low-vision users about
	the current applications and windows on the screen.  GNOME Speech
	could also be used in any
	number of other accessibility-related contexts, including
	assistive technologies which highlight and speak on-screen
	text for users with learning disabilities, and augmentative
	communication aids.
</p>
<h2>1.4 What Speech Engines Are Currently Supported?</h2>
<p>
Source code for GNOME Speech drivers supporting the following engines
is currently provided in CVS:
</p>
<table>
<tr>
<th>Engine Name</th>
<th>Platforms Supported</th>
<th>Comments</th>
</tr>
<tr>
<td>eSpeak</td>
<td>Linux (other platforms?)</td>
</tr>
<tr>
<td>Festival</td>
<td>Linux/Solaris</td>
</tr>
<tr>
<td>FreeTTS</td>
<td>Linux/Solaris</td>
<td>Requires at least J2SDK 1.4.1 and java-access-bridge in order to
build the driver</td>
</tr>
<tr>
<td>Speech Dispatcher</td>
<td>Linux</td>
</tr>
<tr>
<td>IBM ViaVoice TTS</td>
<td>Linux Only</td>
<td>No longer available on the web.</td>
</tr>
<tr>
<td>Eloquence</td>
<td>Linux/Solaris</td>
</tr>
<tr>
<td>DECTalk Software</td>
<td>Linux Only</td>
<td>$50 download from
<a href = "http://www.fonix.com">Fonix</a>
</td>
</tr>
<tr>
<td>Cepstral</td>
<td>Linux/Solaris</td>
<td>Available as a $29 download from
<a href = "http://www.cepstral.com">Cepstral</a>
</td>
</tr>
</table>
<h1>Section 2: Overview</h1>
<h2>2.1 Prerequisites</h2>
<p>
This paper assumes at least a minimal understanding of the Glib object
system, Bonobo, Bonobo-activation, and ORBit2.  A list of useful
resources for learning about these technologies and their applications
follows:
</p>
<ul>
<li>
<a href = "http://developer.gnome.org/doc/API/2.0/glib/index.html">
Glib API Reference Manual
</a>
</li>
<li>
<a href = "http://developer.gnome.org/doc/API/2.0/gobject/index.html">
GObject API Reference Manual
</a>
</li>
<li>
<a href = "http://developer.gnome.org/doc/API/2.0/bonobo-activation/index.html">
Bonobo Activation API Reference Manual
</a>
</li>
<li>
<a href = "http://developer.gnome.org/doc/API/2.0/libbonobo/index.html">
Libbonobo API Reference Manual
</a>
</li>
<li>
<a href = "http://www.dunkelhain.de/docs/>
Short Gobject/Glib tutorial
</a>
</li>
<li>
<a href = "http://www.gtk.org/tutorial/">
GTK+ 2.0 Tutorial (includes some information about Glib)
</a>
</li>
<li>
<a href = "http://www.106.ibm.com/developerworks/webservices/library/co-bnbo1.html">
Great three-part Bonobo tutorial written by Bonobo's lead developer, Michael
Meeks, for IBM Developer Works
</a>
</li>
</ul>
<h2>2.2 The role of Bonobo</h2>
<p>
GNOME Speech has the following design requirements:
</p>
<ul>
<li>Clients should be able to get a list of installed drivers</li>
<li>Clients should be able to get some amount of information about
supported features of installed drivers</li>
<li>Driver implementations should be object-oriented so as to facilitate code
re-use</li>
<li>It should be possible to write drivers in any
language</li>
</ul>
<p>
For these reasons, the combination of Bonobo and Bonobo-activation was
chosen as the IPC and object framework for GNOME Speech.
</p>
<h2>2.3 Querying for information about installed drivers</h2>
<p>
GNOME Speech drivers are standard Bonobo servers, so the standard
Bonobo-activation calls are used to query for information about
currently installed GNOME Speech drivers.  Querying for support of the
interface named GNOME_Speech_SynthesisDriver will return the list of
all GNOME Speech drivers which are installed on the system.  An
application can also query for the interface named
GNOME_Speech_SpeechCallback to get a list of GNOME Speech drivers
which are capable of providing speech callback information.
</p>
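<p>
As a concrete illustration, the following minimal sketch (not taken from
the GNOME Speech sources) lists the installed drivers using
bonobo_activation_query.  The repository-id string, including its version
suffix, is an assumption and should be checked against the installed
GNOME Speech IDL.
</p>
<pre>
/* List the installed GNOME Speech drivers via bonobo-activation. */
#include &lt;libbonobo.h&gt;

int
main (int argc, char **argv)
{
	Bonobo_ServerInfoList *servers;
	CORBA_Environment ev;
	guint i;

	if (!bonobo_init (&argc, argv))
		g_error ("Could not initialize Bonobo");

	CORBA_exception_init (&ev);

	/* Ask bonobo-activation for every server advertising the
	   SynthesisDriver interface (the repo-id version is assumed). */
	servers = bonobo_activation_query (
		"repo_ids.has ('IDL:GNOME/Speech/SynthesisDriver:0.3')",
		NULL, &ev);

	if (!BONOBO_EX (&ev) && servers != NULL) {
		for (i = 0; i &lt; servers->_length; i++)
			g_print ("Found driver: %s\n", servers->_buffer[i].iid);
		CORBA_free (servers);
	}

	CORBA_exception_free (&ev);
	return 0;
}
</pre>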
<h1>Section 3: Implementing a GNOME Speech Driver</h1>
<h2>3.1 Checklist and general considerations</h2>
<p>
Some things to consider before implementing a GNOME Speech driver:
</p>
<ul>
<li>
Is the engine for which the driver is to be written proprietary?  If
so, is it possible to write an Open Source GNOME Speech Driver if
desired?
</li>
<li>
Does the engine provide speech callbacks (i.e., does it provide
status information about current speech progress)?
</li>
<li>
Does the engine require a multi-threaded client?  If so, how will this
be integrated into the Glib main loop?
</li>
<li>
Does the engine provide its own audio output?  If not, how will you get
the audio to the soundcard?  (Note that it is more difficult to provide
accurate callback information for engines that do not produce their own
audio output.)
</li>
</ul>
<h2>3.2 Interfaces and Data Structures</h2>
<p>
At a minimum, a GNOME Speech driver must support two interfaces, the
SynthesisDriver and Speaker interfaces.
</p>
<h3>3.2.1 The SynthesisDriver Interface</h3>
<p>
The SynthesisDriver interface provides basic information about the
text-to-speech engine and the GNOME Speech driver, and allows creation
of Speaker objects (instances of the text-to-speech engine).  The
interface is defined as follows:
</p>
<pre>
interface SynthesisDriver : Bonobo::Unknown {
	readonly attribute string driverName;
	readonly attribute string synthesizerName;
	readonly attribute string driverVersion;
	readonly attribute string synthesizerVersion;

	boolean driverInit ();
	boolean isInitialized ();

	VoiceInfoList getVoices (in VoiceInfo voice_spec);
	VoiceInfoList getAllVoices ();

	Speaker createSpeaker (in VoiceInfo voice_spec);
};
</pre>
<p>
The VoiceInfo structure allows a client to specify information about a
voice, such as its name, language, or gender.  The client can then
perform queries of the driver to determine what voices it supports by
filling in members of the VoiceInfo structure.  The getVoices function
should return all voices supported by the driver which meet all the
requirements specified in the VoiceInfo structure passed to it. The
getAllVoices function should return the VoiceInfo structures for all
voices supported by the driver.
</p>
<p>
The createSpeaker function should return a Speaker object.  This object
is created using the first voice that meets the requirements specified
in the provided VoiceInfo structure.
</p>
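<p>
The following sketch shows how a client might obtain a Speaker once it
holds an activated SynthesisDriver reference.  The C stub names follow
the standard ORBit2 mapping of the IDL above; the
gnome-speech/gnome-speech.h header name is an assumption.
</p>
<pre>
#include &lt;libbonobo.h&gt;
#include &lt;gnome-speech/gnome-speech.h&gt;

/* Create a Speaker using the first voice the driver offers. */
static GNOME_Speech_Speaker
create_first_speaker (GNOME_Speech_SynthesisDriver driver,
		      CORBA_Environment *ev)
{
	GNOME_Speech_VoiceInfoList *voices;
	GNOME_Speech_Speaker speaker = CORBA_OBJECT_NIL;

	/* The driver must be initialized before anything else is called. */
	if (!GNOME_Speech_SynthesisDriver_driverInit (driver, ev))
		return CORBA_OBJECT_NIL;

	voices = GNOME_Speech_SynthesisDriver_getAllVoices (driver, ev);
	if (!BONOBO_EX (ev) && voices != NULL && voices->_length > 0) {
		/* createSpeaker uses the first voice matching the spec;
		   here we simply hand it the first available voice. */
		speaker = GNOME_Speech_SynthesisDriver_createSpeaker (
			driver, voices->_buffer, ev);
	}
	if (voices != NULL)
		CORBA_free (voices);

	return speaker;
}
</pre>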
<h3>3.2.2 The Speaker Interface</h3>
<p>
A GNOME Speech driver's implementation of the Speaker interface is the
part of the driver which actually controls the text-to-speech
engine. The interface is defined as follows:
</p>
<pre>
interface Speaker : Bonobo::Unknown {

	ParameterList getSupportedParameters ();
	string getParameterValueDescription (in string name,
	in double value);
	double getParameterValue (in string name);
	boolean setParameterValue (in string name, in double value);
    
	long say (in string text);
	boolean stop ();
	boolean isSpeaking ();
	void wait ();
    
	boolean registerSpeechCallback (in SpeechCallback callback);
};
</pre>
<p>
A ParameterList is a sequence of Parameter structures.  The Parameter
structure is defined as follows:
</p>
<pre>
struct Parameter {
	string name;
	double min;
	double current;
	double max;
	boolean enumerated;
};
</pre>
<p>
Every parameter has a unique name, and a minimum, current, and maximum
value.  These basic parameters allow for setting parameters with
numeric values such as speaking rate in words per minute, or the
baseline pitch of the voice in Hz.  The getParameterValue function returns the
current value of the parameter whose name is specified, and the
setParameterValue function sets the current value of the parameter
whose name is specified.  (Note that if the new value is out of range,
setParameterValue should return FALSE).
</p>
<p>
GNOME Speech also defines a mechanism of describing parameters which
are not necessarily numeric. The getParameterValue and
setParameterValue functions are still used to get and set the values
of these enumerated parameters.  However, the getValueDescription
function can be used to retrieve a text description of the various
values within the parameter's range.
</p>
<p>
While standard names for parameters are not strictly enforced, some
recommendations are listed here:
</p>
<table>
<tr>
<th>Parameter Name</th>
<th>Description</th>
</tr>
<tr>
<td>rate</td>
<td>Speaking Rate in Words Per Minute</td>
</tr>
<tr>
<td>pitch</td>
<td>Baseline Speaking Pitch in Hz.</td> 
</tr>
<tr>
<td>volume</td>
<td>Speaking Volume (recommended range is 0 - 100)</td>
</tr>
</table>
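<p>
A short sketch of working with parameters follows, again assuming the
ORBit2 C stubs generated from the Speaker IDL above; the "rate" name is
simply the recommendation from the table.
</p>
<pre>
#include &lt;libbonobo.h&gt;
#include &lt;gnome-speech/gnome-speech.h&gt;

/* Print the supported parameters and set the speaking rate. */
static void
set_speaking_rate (GNOME_Speech_Speaker speaker, double wpm,
		   CORBA_Environment *ev)
{
	GNOME_Speech_ParameterList *params;
	guint i;

	params = GNOME_Speech_Speaker_getSupportedParameters (speaker, ev);
	if (!BONOBO_EX (ev) && params != NULL) {
		for (i = 0; i &lt; params->_length; i++)
			g_print ("%s: min %g, current %g, max %g\n",
				 params->_buffer[i].name,
				 params->_buffer[i].min,
				 params->_buffer[i].current,
				 params->_buffer[i].max);
		CORBA_free (params);
	}

	/* setParameterValue returns FALSE if the value is out of range. */
	if (!GNOME_Speech_Speaker_setParameterValue (speaker, "rate", wpm, ev))
		g_warning ("Could not set the speaking rate to %g wpm", wpm);
}
</pre>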
<p>
The say function causes the driver to speak the specified text.  The
driver should return a unique long identifying the particular string,
to be used for future reference when handling speech callbacks.  The
driver should return immediately rather than waiting until speech has
finished.
</p>
<p>
The stop function stops speech immediately and flushes anything in the
text-to-speech engine's queue. The isSpeaking function returns true if
the engine is currently speaking and false if not. The wait method
returns only after any current speech has finished.
</p>
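<p>
For example, a client could speak a string synchronously as in the
following sketch (with the same assumptions about the generated stubs
and header as above):
</p>
<pre>
#include &lt;libbonobo.h&gt;
#include &lt;gnome-speech/gnome-speech.h&gt;

/* Queue a string and block until the engine has finished speaking it. */
static void
say_and_wait (GNOME_Speech_Speaker speaker, const char *text,
	      CORBA_Environment *ev)
{
	CORBA_long id;

	/* say returns immediately with an id usable in callbacks. */
	id = GNOME_Speech_Speaker_say (speaker, text, ev);
	if (BONOBO_EX (ev))
		return;
	g_print ("Queued string %ld\n", (long) id);

	/* wait returns only once any current speech has finished. */
	GNOME_Speech_Speaker_wait (speaker, ev);
}
</pre>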
<h3>3.2.3 SpeechCallback</h3>
<p>
The SpeechCallback interface is actually not implemented by the GNOME
Speech driver, but rather by the GNOME Speech client.  This is the
interface that GNOME Speech drivers use to communicate information
about speech progress to their clients.  The SpeechCallback interface
defines only one function, notify, which takes the key identifying the
string, the type of the callback, and possibly a text offset.  GNOME
Speech defines three types of callbacks: speech started, speech
finished, and index.  If a callback of type index is received, the key
identifies the particular string being spoken, and the offset
indicates the offset of the last character that has been spoken.
</p>
<h2>3.3 Supporting speech callbacks</h2>
<p>
Support for speech callbacks can be the most difficult part of a GNOME
Speech driver to implement.  The following are some suggestions to
make providing speech callbacks easier.
</p>
<p>
If the engine for which the driver is written does not support speech
callbacks, the driver implementer should at least do the following:
</p>
<ul>
<li>Ensure that the GNOME Speech driver's .server file indicates that
the driver does not support the GNOME_Speech_SpeechCallback
interface.</li>
<li>Ensure that the speaker's implementation of the registerSpeechCallback
function returns FALSE.
</li>
</ul>
<p>
To provide support for callbacks, a driver's implementation of the
Speaker interface must provide at least the following:
</p>
<ul>
<li>Implement a callback listener that listens for the engine-specific
callbacks.</li>
</ul>
<h1>Section 4: Implementing a GNOME Speech Client</h1>
<h2>4.1 Proper setup</h2>
<p>
An application wanting to produce speech using GNOME Speech should
first obtain a list of GNOME Speech drivers which are installed on the
system.  If no callbacks are desired, then the application need only
request a list of Bonobo servers implementing the
GNOME_Speech_SynthesisDriver interface.  If callbacks are required,
then the application should request a list of Bonobo servers that
implement GNOME_Speech_SpeechCallback.  Bonobo-activation is used to
obtain this list.
</p>
<p>
	Once the application has a list of available speech drivers,
	it uses Bonobo-activation to activate one of them.  The object
	that is returned by the bonobo_activation_activate call is an
	object which implements the GNOME_Speech_SynthesisDriver
	interface.
</p>
<p>
Before calling any functions on the object, the application should
call the driverInit function.  This function returns true if the
driver was successfully initialized, false otherwise.  If the
driverInit function returns false, then the application should not
attempt to call any other functions on the object.
</p>
<p>
The application can read the driverName, driverVersion, synthesizerName,
and synthesizerVersion attributes to determine the name and version of
the GNOME Speech driver and the underlying text-to-speech engine.
</p>
<p>
The application can call createSpeaker, which creates and returns an
object implementing the GNOME_Speech_Speaker interface.  This
interface can be used to speak text and set various speech
characteristics such as speaking rate and pitch.
</p>
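<p>
Putting the above together, a minimal client might look like the
following sketch.  The repository-id string and the
gnome-speech/gnome-speech.h header name are assumptions, and error
handling is kept to a minimum.
</p>
<pre>
#include &lt;libbonobo.h&gt;
#include &lt;gnome-speech/gnome-speech.h&gt;

int
main (int argc, char **argv)
{
	GNOME_Speech_SynthesisDriver driver;
	CORBA_Environment ev;
	CORBA_char *name;

	if (!bonobo_init (&argc, argv))
		g_error ("Could not initialize Bonobo");

	CORBA_exception_init (&ev);

	/* Activate the first installed driver that satisfies the query. */
	driver = bonobo_activation_activate (
		"repo_ids.has ('IDL:GNOME/Speech/SynthesisDriver:0.3')",
		NULL, 0, NULL, &ev);
	if (BONOBO_EX (&ev) || driver == CORBA_OBJECT_NIL)
		g_error ("No GNOME Speech driver could be activated");

	/* Initialize the driver before calling anything else on it. */
	if (GNOME_Speech_SynthesisDriver_driverInit (driver, &ev)) {
		name = GNOME_Speech_SynthesisDriver__get_driverName (driver, &ev);
		if (!BONOBO_EX (&ev)) {
			g_print ("Activated driver: %s\n", name);
			CORBA_free (name);
		}
		/* A Speaker would be created here, as shown in section 3.2.1. */
	}

	bonobo_object_release_unref (driver, &ev);
	CORBA_exception_free (&ev);
	return 0;
}
</pre>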
<h2>4.2 Handling Speech Callbacks</h2>
<p>
In order for an application to receive notifications about speech
progress and status, it must contain an object that implements the
GNOME_Speech_SpeechCallback interface.  Once a speaker is created, the
application should register its callback object with the speaker
using the registerSpeechCallback function.
</p>
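<p>
A sketch of the registration step follows.  It assumes the client has
already implemented a BonoboObject subclass whose servant provides the
SpeechCallback interface (the GObject boilerplate for that object is
omitted here), and that the stubs follow the standard ORBit2 C mapping.
</p>
<pre>
#include &lt;libbonobo.h&gt;
#include &lt;gnome-speech/gnome-speech.h&gt;

/* Register a client-side SpeechCallback object with a Speaker.
   "callback_object" is the client's BonoboObject implementing the
   GNOME_Speech_SpeechCallback interface. */
static void
attach_callback (GNOME_Speech_Speaker speaker,
		 BonoboObject *callback_object,
		 CORBA_Environment *ev)
{
	if (!GNOME_Speech_Speaker_registerSpeechCallback (
		    speaker, BONOBO_OBJREF (callback_object), ev))
		g_warning ("The driver refused the callback registration");
}
</pre>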
<h1>Section 5: The Future of GNOME Speech</h1> <h2>5.1 GNOME Speech 1.0</h2>
<p>
Work is underway to totally rewrite the GNOME Speech API in
preparation for a GNOME Speech 1.0 release.  The major improvements
planned for 1.0 include:
</p>
<ul>
<li>API heavily influenced by the Java Speech API</li>
<li>API for speech recognition will be included</li>
<li>A markup language for marking up text with information about speech
characteristics will be supported.</li> 
</ul>
<h2>5.2 D-Bus and KDE interoperability</h2>
<p>
Work is also underway on a prototype in which D-Bus replaces Bonobo as
the underlying IPC mechanism.  This would better facilitate
interoperability with KDE.
</p>
</body>
</html>