/usr/share/doc/python-gamera.toolkits.ocr/html/ocr.html is in python-gamera.toolkits.ocr 1.2.2-3.

<?xml version="1.0" encoding="utf-8" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="generator" content="Docutils 0.13.1: http://docutils.sourceforge.net/" />
<title>OCR</title>
<link rel="stylesheet" href="default.css" type="text/css" />
</head>
<body>
<div class="document" id="ocr">
<h1 class="title">OCR</h1>

<p><strong>Last modified</strong>:</p>
<div class="contents topic" id="contents">
<p class="topic-title first">Contents</p>
<ul class="simple">
<li><a class="reference internal" href="#segmentation" id="id1">segmentation</a><ul>
<li><a class="reference internal" href="#bbox-mcmill" id="id2"><tt class="docutils literal">bbox_mcmill</tt></a></li>
</ul>
</li>
</ul>
</div>
<div class="section" id="segmentation">
<h1><a class="toc-backref" href="#id1">segmentation</a></h1>
<div class="section" id="bbox-mcmill">
<h2><a class="toc-backref" href="#id2"><tt class="docutils literal">bbox_mcmill</tt></a></h2>
<p>[object] <strong>bbox_mcmill</strong> ([object <em>glyphs</em>], float <em>section_search_size</em> = 1.00, float <em>noise_mltplk</em> = 1.00, float <em>large_mltplk</em> = 20.00, float <em>stdev_mltplk</em> = 5.00)</p>
<table class="docutils field-list" frame="void" rules="none">
<col class="field-name" />
<col class="field-body" />
<tbody valign="top">
<tr class="field"><th class="field-name">Operates on:</th><td class="field-body"><tt class="docutils literal">Image</tt> [OneBit]</td>
</tr>
<tr class="field"><th class="field-name">Returns:</th><td class="field-body">[object]</td>
</tr>
<tr class="field"><th class="field-name">Category:</th><td class="field-body">OCR/segmentation</td>
</tr>
<tr class="field"><th class="field-name">Defined in:</th><td class="field-body">bbox_merging_mcmillan.py</td>
</tr>
<tr class="field"><th class="field-name">Author:</th><td class="field-body">Robert Butz, Karl MacMillan</td>
</tr>
</tbody>
</table>
<p>Returns the text lines of the image as connected components.
The segmentation method is adapted from MacMillan's segmentation method
in roman_text.py and allows finer control over the segmentation through
its parameters.</p>
<p>Options:</p>
<blockquote>
<dl class="docutils">
<dt><em>glyphs</em></dt>
<dd>This list can be built from the result of <tt class="docutils literal">cc_analysis</tt>. By default, this
parameter is empty, which causes the function to call
<tt class="docutils literal">cc_analysis</tt> itself.</dd>
<dt><em>section_search_size</em></dt>
<dd>This optional parameter scales the calculated avg_glyph_size by
multiplying it by the given value (default = 1).</dd>
<dt><em>noise_mltplk</em></dt>
<dd>This optional parameter adjusts the noise recognition
independently of the calculated avg_glyph_size (default = 1).
Values greater than 1 let the noise removal detect larger noise
(but possibly also small glyphs). Choose smaller values to avoid
classifying small glyphs as noise.</dd>
<dt><em>large_mltplk</em></dt>
<dd>Analogous to noise_mltplk, this parameter controls the
recognition of very large CCs relative to the avg_glyph_size
(default = 20). Higher values lead to better acceptance of
above-average CCs. This is useful, for example, for large capital
initials at the beginning of paragraphs, as seen in Bibles.</dd>
<dt><em>stdev_mltplk</em></dt>
<dd>This parameter affects the line-finding algorithm by excluding
abnormally tall glyphs (default = 5). The standard deviation is
calculated and multiplied by this parameter.</dd>
</dl>
</blockquote>
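<p>The following standalone sketch illustrates how the multipliers described
above might scale size limits derived from the glyphs. It does not use Gamera
and is not the plugin's actual code: the function name
<tt class="docutils literal">sketch_thresholds</tt>, the constant
<tt class="docutils literal">NOISE_BASE_FRACTION</tt>, the use of glyph heights
as the size measure, and the exact formulas are all assumptions made for
illustration only.</p>
<pre class="literal-block">
from statistics import mean, pstdev

# Hypothetical base fraction for noise; the plugin's real value is not documented here.
NOISE_BASE_FRACTION = 0.25


def sketch_thresholds(heights, section_search_size=1.0, noise_mltplk=1.0,
                      large_mltplk=20.0, stdev_mltplk=5.0):
    """Return illustrative size limits computed from a list of glyph heights."""
    raw_avg = mean(heights)
    # section_search_size scales the calculated average glyph size.
    avg_glyph_size = raw_avg * section_search_size
    return {
        "avg_glyph_size": avg_glyph_size,
        # Assumed: the noise limit uses the unscaled average, so it can be
        # tuned independently of section_search_size.
        "noise_limit": raw_avg * NOISE_BASE_FRACTION * noise_mltplk,
        # Assumed: components above this limit count as "very large".
        "large_limit": avg_glyph_size * large_mltplk,
        # Assumed: glyphs taller than mean + k * stdev are left out of line finding.
        "tall_limit": raw_avg + stdev_mltplk * pstdev(heights),
    }


if __name__ == "__main__":
    limits = sketch_thresholds([8, 12], noise_mltplk=2.0)
    for name in sorted(limits):
        print(name, limits[name])
</pre>
<p>In this sketch, raising <em>noise_mltplk</em> raises the noise limit, and
raising <em>large_mltplk</em> or <em>stdev_mltplk</em> makes the limits for
large and tall components more permissive, which mirrors the qualitative
behaviour described in the options above.</p>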
</div>
</div>
</div>
<div class="footer">
<hr class="footer" />
<span class="raw-html"><div style="text-align:right;">For contact information, see <a href="http://gamera.informatik.hsnr.de/contact.html">http://gamera.informatik.hsnr.de/contact.html</a></div></span>
</div>
</body>
</html>