<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>discretizer – Discretization algorithms — Pebl v1.0.1 documentation</title>
<link rel="stylesheet" href="_static/default.css" type="text/css" />
<link rel="stylesheet" href="_static/pygments.css" type="text/css" />
<script type="text/javascript">
var DOCUMENTATION_OPTIONS = {
URL_ROOT: '',
VERSION: '1.0.1',
COLLAPSE_INDEX: false,
FILE_SUFFIX: '.html',
HAS_SOURCE: true
};
</script>
<script type="text/javascript" src="_static/jquery.js"></script>
<script type="text/javascript" src="_static/underscore.js"></script>
<script type="text/javascript" src="_static/doctools.js"></script>
<link rel="top" title="Pebl v1.0.1 documentation" href="index.html" />
<link rel="up" title="API Reference" href="apiref.html" />
<link rel="next" title="evaluator – Network evaluators" href="evaluator.html" />
<link rel="prev" title="data – Pebl Dataset" href="data.html" />
</head>
<body>
<div class="related">
<h3>Navigation</h3>
<ul>
<li class="right" style="margin-right: 10px">
<a href="genindex.html" title="General Index"
accesskey="I">index</a></li>
<li class="right" >
<a href="py-modindex.html" title="Python Module Index"
>modules</a> |</li>
<li class="right" >
<a href="evaluator.html" title="evaluator – Network evaluators"
accesskey="N">next</a> |</li>
<li class="right" >
<a href="data.html" title="data – Pebl Dataset"
accesskey="P">previous</a> |</li>
<li><a href="index.html">Pebl v1.0.1 documentation</a> »</li>
<li><a href="apiref.html" accesskey="U">API Reference</a> »</li>
</ul>
</div>
<div class="document">
<div class="documentwrapper">
<div class="bodywrapper">
<div class="body">
<div class="section" id="module-pebl.discretizer">
<span id="discretizer-discretization-algorithms"></span><h1><tt class="xref py py-mod docutils literal"><span class="pre">discretizer</span></tt> – Discretization algorithms<a class="headerlink" href="#module-pebl.discretizer" title="Permalink to this headline">¶</a></h1>
<p>Pebl currently includes a single discretization implementation, though more
may be added. Discretization and other data pre-processing steps can have a
large impact on the final results.</p>
<dl class="function">
<dt id="pebl.discretizer.maximum_entropy_discretize">
<tt class="descclassname">pebl.discretizer.</tt><tt class="descname">maximum_entropy_discretize</tt><big>(</big><em>indata</em>, <em>includevars=None</em>, <em>excludevars=</em><span class="optional">[</span><span class="optional">]</span>, <em>numbins=3</em><big>)</big><a class="headerlink" href="#pebl.discretizer.maximum_entropy_discretize" title="Permalink to this definition">¶</a></dt>
<dd><p>Performs a maximum-entropy discretization of the data, in place.</p>
<p>Requirements for this implementation:</p>
<blockquote>
<div><blockquote>
<div><ol class="arabic simple">
<li>Try to make all bins equal-sized (this maximizes the entropy).</li>
<li>If datum x==y in the original dataset, then disc(x)==disc(y).
For example, all datapoints with value 3.245 discretize to the same bin,
even if this violates requirement 1.</li>
<li>Number of bins reflects only the non-missing data.</li>
</ol>
</div></blockquote>
<p>Example:</p>
<blockquote>
<div><p>input: [3,7,4,4,4,5]<br />
output: [0,1,0,0,0,1]</p>
<p>Note that all 4s discretize to 0, which makes bin sizes unequal.</p>
</div></blockquote>
<p>Example:</p>
<blockquote>
<div><p>input: [1,2,3,4,2,1,2,3,1,x,x,x]<br />
output: [0,1,2,2,1,0,1,2,0,0,0,0]</p>
<p>Note that the missing data (‘x’) is placed in the same bin as the value 0.0, which is bin 0 here.</p>
</div></blockquote>
</div></blockquote>
</dd></dl>
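<p>The requirements and examples above can be illustrated with a minimal standalone
sketch using NumPy. This is not Pebl's own code: the function name
<tt class="docutils literal"><span class="pre">max_entropy_bins</span></tt>, its
<tt class="docutils literal"><span class="pre">missing</span></tt> mask argument, and the choice to
treat missing entries as 0.0 are assumptions made for illustration. The first example
is run with two bins, which is inferred from its output containing only bins 0 and 1.</p>

```python
import numpy as np


def max_entropy_bins(values, missing=None, numbins=3):
    """Hypothetical sketch of equal-frequency binning; not Pebl's implementation."""
    values = np.asarray(values, dtype=float)
    if missing is None:
        missing = np.zeros(values.shape, dtype=bool)
    missing = np.asarray(missing, dtype=bool)

    # Requirement 3: bin edges are computed from the non-missing data only.
    observed = np.sort(values[~missing])
    if observed.size == 0:
        return np.zeros(values.shape, dtype=int)

    # Requirement 1: aim for roughly the same number of points in every bin.
    binsize = max(1, observed.size // numbins)
    last = observed.size - 1
    # Upper edge of every bin except the last one.
    edges = [observed[min(binsize * b - 1, last)] for b in range(1, numbins)]

    # Assumption: missing entries are treated as 0.0 before binning.
    filled = np.where(missing, 0.0, values)

    # Requirement 2: the bin depends only on the value, so equal values
    # always share a bin, even when that makes bin sizes unequal.
    return np.searchsorted(edges, filled, side="left")


first = max_entropy_bins([3, 7, 4, 4, 4, 5], numbins=2)
second = max_entropy_bins(
    [1, 2, 3, 4, 2, 1, 2, 3, 1, 0, 0, 0],
    missing=[False] * 9 + [True] * 3,
    numbins=3,
)
print(first.tolist())
print(second.tolist())
```

<p>Unlike <tt class="docutils literal"><span class="pre">maximum_entropy_discretize</span></tt>, which
modifies the dataset in place, this sketch returns a new array of bin indices and
leaves its input untouched.</p>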
</div>
</div>
</div>
</div>
<div class="sphinxsidebar">
<div class="sphinxsidebarwrapper">
<h4>Previous topic</h4>
<p class="topless"><a href="data.html"
title="previous chapter"><tt class="docutils literal docutils literal docutils literal"><span class="pre">data</span></tt> – Pebl Dataset</a></p>
<h4>Next topic</h4>
<p class="topless"><a href="evaluator.html"
title="next chapter"><tt class="docutils literal"><span class="pre">evaluator</span></tt> – Network evaluators</a></p>
<h3>This Page</h3>
<ul class="this-page-menu">
<li><a href="_sources/discretizer.txt"
rel="nofollow">Show Source</a></li>
</ul>
<div id="searchbox" style="display: none">
<h3>Quick search</h3>
<form class="search" action="search.html" method="get">
<input type="text" name="q" />
<input type="submit" value="Go" />
<input type="hidden" name="check_keywords" value="yes" />
<input type="hidden" name="area" value="default" />
</form>
<p class="searchtip" style="font-size: 90%">
Enter search terms or a module, class or function name.
</p>
</div>
<script type="text/javascript">$('#searchbox').show(0);</script>
</div>
</div>
<div class="clearer"></div>
</div>
<div class="footer">
© Copyright 2008, Abhik Shah.
Last updated on Dec 31, 2011.
Created using <a href="http://sphinx.pocoo.org/">Sphinx</a> 1.0.8.
</div>
</body>
</html>