<?xml version="1.0" encoding="utf-8" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="generator" content="Docutils 0.13.1: http://docutils.sourceforge.net/" />
<title>OCR</title>
<link rel="stylesheet" href="default.css" type="text/css" />
</head>
<body>
<div class="document" id="ocr">
<h1 class="title">OCR</h1>
<p><strong>Last modified</strong>:</p>
<div class="contents topic" id="contents">
<p class="topic-title first">Contents</p>
<ul class="simple">
<li><a class="reference internal" href="#segmentation" id="id1">segmentation</a><ul>
<li><a class="reference internal" href="#bbox-mcmill" id="id2"><tt class="docutils literal">bbox_mcmill</tt></a></li>
</ul>
</li>
</ul>
</div>
<div class="section" id="segmentation">
<h1><a class="toc-backref" href="#id1">segmentation</a></h1>
<div class="section" id="bbox-mcmill">
<h2><a class="toc-backref" href="#id2"><tt class="docutils literal">bbox_mcmill</tt></a></h2>
<p>[object] <strong>bbox_mcmill</strong> ([object <em>glyphs</em>], float <em>section_search_size</em> = 1.00, float <em>noise_mltplk</em> = 1.00, float <em>large_mltplk</em> = 20.00, float <em>stdev_mltplk</em> = 5.00)</p>
<table class="docutils field-list" frame="void" rules="none">
<col class="field-name" />
<col class="field-body" />
<tbody valign="top">
<tr class="field"><th class="field-name">Operates on:</th><td class="field-body"><tt class="docutils literal">Image</tt> [OneBit]</td>
</tr>
<tr class="field"><th class="field-name">Returns:</th><td class="field-body">[object]</td>
</tr>
<tr class="field"><th class="field-name">Category:</th><td class="field-body">OCR/segmentation</td>
</tr>
<tr class="field"><th class="field-name">Defined in:</th><td class="field-body">bbox_merging_mcmillan.py</td>
</tr>
<tr class="field"><th class="field-name">Author:</th><td class="field-body">Robert Butz, Karl MacMillan</td>
</tr>
</tbody>
</table>
<p>Returns the text lines of the image as connected components.
The segmentation method is adapted from MacMillan's segmentation method
in roman_text.py. Its parameters allow the segmentation to be tuned
to the individual document.</p>
<p>Options:</p>
<blockquote>
<dl class="docutils">
<dt><em>glyphs</em></dt>
<dd>This list can be built with <tt class="docutils literal">cc_analysis</tt>. By default, this
parameter is empty, which causes the function to call
<tt class="docutils literal">cc_analysis</tt> itself.</dd>
<dt><em>section_search_size</em></dt>
<dd>This optional parameter scales the calculated avg_glyph_size by
multiplying it by the given value (default=1).</dd>
<dt><em>noise_mltplk</em></dt>
<dd>This optional parameter adjusts the noise_recognition rate
independently of the calculated avg_glyph_size (default=1).
Values greater than 1 let the noise_removal detect larger noise
(but possibly real glyphs as well). Choose smaller values to avoid
classifying small glyphs as noise.</dd>
<dt><em>large_mltplk</em></dt>
<dd>Analogous to noise_mltplk, this parameter controls the
recognition of very large ccs relative to the avg_glyph_size
(default=20). Higher values make the function accept larger
above-average ccs. This is useful, for example, for large capital
initials at the beginning of paragraphs, as seen in bibles.</dd>
<dt><em>stdev_mltplk</em></dt>
<dd>This parameter affects the line finding algorithm by excluding
abnormally tall glyphs (default=5). The standard deviation is
calculated and multiplied by this parameter.</dd>
</dl>
</blockquote>
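<p>The following is only a rough sketch of how multipliers of this kind
can act as size thresholds. It is not the toolkit's implementation, and
it does not use Gamera. The function <tt class="docutils literal">classify</tt>, the constant
<tt class="docutils literal">NOISE_BASE</tt>, the use of the median as avg_glyph_size, and the exact
threshold formulas are all assumptions made for illustration. Each
connected component is represented by a plain (width, height) tuple.</p>
<pre class="literal-block">
```python
from statistics import mean, median, pstdev

# Hypothetical base fraction (not taken from the toolkit): with
# noise_mltplk = 1, components no larger than this fraction of the
# average glyph size are treated as noise.
NOISE_BASE = 0.25


def classify(boxes, noise_mltplk=1.0, large_mltplk=20.0, stdev_mltplk=5.0):
    """Label each (width, height) box as 'noise', 'large', 'tall' or 'glyph'.

    Illustrative only: the real bbox_mcmill thresholds may differ.
    """
    sizes = [max(w, h) for w, h in boxes]
    heights = [h for _, h in boxes]

    # Assumption: the median stands in for the calculated avg_glyph_size.
    avg_glyph_size = median(sizes)

    noise_limit = NOISE_BASE * noise_mltplk * avg_glyph_size
    large_limit = large_mltplk * avg_glyph_size
    # Assumption: "abnormally tall" means more than stdev_mltplk standard
    # deviations above the mean height.
    tall_limit = mean(heights) + stdev_mltplk * pstdev(heights)

    labels = []
    for (w, h), size in zip(boxes, sizes):
        if noise_limit >= size:
            labels.append("noise")
        elif size > large_limit:
            labels.append("large")
        elif h > tall_limit:
            labels.append("tall")
        else:
            labels.append("glyph")
    return labels
```
</pre>
<p>Under these assumptions, eight boxes of size (10, 12) together with a
speck of (2, 2) and an initial of (40, 60) yield one noise component and
nine glyphs at the default settings. Lowering <em>large_mltplk</em> to 4 marks
the initial as large, lowering <em>stdev_mltplk</em> to 2 marks it as tall, and
raising <em>noise_mltplk</em> to 4 starts to swallow the regular glyphs as
noise.</p>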
</div>
</div>
</div>
<div class="footer">
<hr class="footer" />
<span class="raw-html"><div style="text-align:right;">For contact information, see <a href="http://gamera.informatik.hsnr.de/contact.html">http://gamera.informatik.hsnr.de/contact.html</a></div></span>
</div>
</body>
</html>