This file is indexed.

/usr/share/doc/ganeti/html/design-linuxha.html is in ganeti-doc 2.16.0~rc2-1build1.

This file is owned by root:root, with mode 0o644.

The actual contents of the file can be viewed below.

  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

<html xmlns="http://www.w3.org/1999/xhtml" lang="en">
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    <title>Linux HA integration &#8212; Ganeti 2.16.0~rc2 documentation</title>
    <link rel="stylesheet" href="_static/style.css" type="text/css" />
    <link rel="stylesheet" href="_static/pygments.css" type="text/css" />
    <script type="text/javascript">
      var DOCUMENTATION_OPTIONS = {
        URL_ROOT:    './',
        VERSION:     '2.16.0~rc2',
        COLLAPSE_INDEX: false,
        FILE_SUFFIX: '.html',
        HAS_SOURCE:  true,
        SOURCELINK_SUFFIX: '.txt'
      };
    </script>
    <script type="text/javascript" src="_static/jquery.js"></script>
    <script type="text/javascript" src="_static/underscore.js"></script>
    <script type="text/javascript" src="_static/doctools.js"></script>
    <link rel="search" title="Search" href="search.html" />
    <link rel="next" title="Submitting jobs from logical units" href="design-lu-generated-jobs.html" />
    <link rel="prev" title="Improving location awareness of Ganeti" href="design-location.html" /> 
  </head>
  <body>
    <div class="related" role="navigation" aria-label="related navigation">
      <h3>Navigation</h3>
      <ul>
        <li class="right" style="margin-right: 10px">
          <a href="design-lu-generated-jobs.html" title="Submitting jobs from logical units"
             accesskey="N">next</a></li>
        <li class="right" >
          <a href="design-location.html" title="Improving location awareness of Ganeti"
             accesskey="P">previous</a> |</li>
        <li class="nav-item nav-item-0"><a href="index.html">Ganeti 2.16.0~rc2 documentation</a> &#187;</li> 
      </ul>
    </div>  

    <div class="document">
      <div class="documentwrapper">
        <div class="bodywrapper">
          <div class="body" role="main">
            
  <div class="section" id="linux-ha-integration">
<h1><a class="toc-backref" href="#id2">Linux HA integration</a><a class="headerlink" href="#linux-ha-integration" title="Permalink to this headline"></a></h1>
<table class="docutils field-list" frame="void" rules="none">
<col class="field-name" />
<col class="field-body" />
<tbody valign="top">
<tr class="field-odd field"><th class="field-name">Created:</th><td class="field-body">2013-Oct-24</td>
</tr>
<tr class="field-even field"><th class="field-name">Status:</th><td class="field-body">Implemented</td>
</tr>
<tr class="field-odd field"><th class="field-name">Ganeti-Version:</th><td class="field-body">2.7.0</td>
</tr>
</tbody>
</table>
<div class="contents topic" id="contents">
<p class="topic-title first">Contents</p>
<ul class="simple">
<li><a class="reference internal" href="#linux-ha-integration" id="id2">Linux HA integration</a><ul>
<li><a class="reference internal" href="#current-state-and-shortcomings" id="id3">Current state and shortcomings</a></li>
<li><a class="reference internal" href="#proposed-changes" id="id4">Proposed changes</a><ul>
<li><a class="reference internal" href="#ganeti-ocf-agents" id="id5">Ganeti OCF agents</a></li>
<li><a class="reference internal" href="#master-role-agent" id="id6">Master role agent</a><ul>
<li><a class="reference internal" href="#future-improvements" id="id7">Future improvements</a></li>
</ul>
</li>
<li><a class="reference internal" href="#node-role-agent" id="id8">Node role agent</a><ul>
<li><a class="reference internal" href="#id1" id="id9">Future improvements</a></li>
</ul>
</li>
<li><a class="reference internal" href="#risks" id="id10">Risks</a></li>
<li><a class="reference internal" href="#code-status" id="id11">Code status</a></li>
</ul>
</li>
<li><a class="reference internal" href="#future-work" id="id12">Future work</a></li>
</ul>
</li>
</ul>
</div>
<p>This is a design document detailing the integration of Ganeti and Linux HA.</p>
<div class="section" id="current-state-and-shortcomings">
<h2><a class="toc-backref" href="#id3">Current state and shortcomings</a><a class="headerlink" href="#current-state-and-shortcomings" title="Permalink to this headline"></a></h2>
<p>Ganeti doesn’t currently support any self-healing or self-monitoring.</p>
<p>We are now working on trying to improve the situation in this regard:</p>
<ul class="simple">
<li>The <a class="reference internal" href="design-autorepair.html"><span class="doc">autorepair system</span></a> will take care
of self repairing a cluster in the presence of offline nodes.</li>
<li>The <a class="reference internal" href="design-monitoring-agent.html"><span class="doc">monitoring agent</span></a> will take care
of exporting data to monitoring.</li>
</ul>
<p>What is still missing is a way to self-detect “obvious” failures rapidly
and to:</p>
<ul class="simple">
<li>Maintain the master role active.</li>
<li>Offline resource that are obviously faulty so that the autorepair
system can perform its work.</li>
</ul>
</div>
<div class="section" id="proposed-changes">
<h2><a class="toc-backref" href="#id4">Proposed changes</a><a class="headerlink" href="#proposed-changes" title="Permalink to this headline"></a></h2>
<p>Linux-HA provides software that can be used to provide high availability
of services through automatic failover of resources. In particular
Pacemaker can be used together with Heartbeat or Corosync to make sure a
resource is kept active on a self-monitoring cluster.</p>
<div class="section" id="ganeti-ocf-agents">
<h3><a class="toc-backref" href="#id5">Ganeti OCF agents</a><a class="headerlink" href="#ganeti-ocf-agents" title="Permalink to this headline"></a></h3>
<p>The Ganeti agents will be slightly special in the HA world. The
following will apply:</p>
<ul class="simple">
<li>The agents will be able to be configured cluster-wise by tags (which
will be read on the nodes via ssconf_cluster_tags) and locally by
files on the filesystem that will allow them to “simulate” a
particular condition (eg. simulate a failure even if none is
detected).</li>
<li>The agents will be able to run in “full” or “partial” mode: in
“partial” mode they will always succeed, and thus never fail a
resource as long as a node is online, is running the linux HA software
and is responding to the network. In “full” mode they will also check
resources like the cluster master ip or master daemon, and act if they
are missing</li>
</ul>
<p>Note that for what Ganeti does OCF agents are needed: simply relying on
the LSB scripts will not work for the Ganeti service.</p>
</div>
<div class="section" id="master-role-agent">
<h3><a class="toc-backref" href="#id6">Master role agent</a><a class="headerlink" href="#master-role-agent" title="Permalink to this headline"></a></h3>
<p>This agent will manage the Ganeti master role. It needs to be configured
as a sticky resource (you don’t want to flap the master role around, do
you?) that is active on only one node. You can require quorum or fencing
to protect your cluster from multiple masters.</p>
<p>The agent will implement a stateless resource that considers itself
“started” only the master node, “stopped” on all master candidates and
in error mode for all other nodes.</p>
<p>Note that if not all your nodes are master candidates this resource
might have problems:</p>
<ul class="simple">
<li>if all nodes are configured to run the resource, heartbeat may decide
to “fence” (aka stonith) all your non-master-candidate nodes if told
to do so. This might not be what you want.</li>
<li>if only master candidates are configured as nodes for the resource,
beware of promotions and demotions, as nothing will update
automatically pacemaker should a change happen at the Ganeti level.</li>
</ul>
<p>Other solutions, such as reporting the resource just as “stopped” on non
master candidates as well might mean that pacemaker would choose the
“wrong” node to promote to master, which is also a bad idea.</p>
<div class="section" id="future-improvements">
<h4><a class="toc-backref" href="#id7">Future improvements</a><a class="headerlink" href="#future-improvements" title="Permalink to this headline"></a></h4>
<ul class="simple">
<li>Ability to work better with non-master-candidate nodes</li>
<li>Stateful resource that can “safely” transfer the master role between
online nodes (with queue drain and such)</li>
<li>Implement “full” mode, with detection of the cluster IP and the master
node daemon.</li>
</ul>
</div>
</div>
<div class="section" id="node-role-agent">
<h3><a class="toc-backref" href="#id8">Node role agent</a><a class="headerlink" href="#node-role-agent" title="Permalink to this headline"></a></h3>
<p>This agent will manage the Ganeti node role. It needs to be configured
as a cloned resource that is active on all nodes.</p>
<p>In partial mode it will always return success (and thus trigger a
failure only upon an HA level or network failure). Full mode, which
initially will not be implemented, could also check for the node daemon
being unresponsive or other local conditions (TBD).</p>
<p>When a failure happens the HA notification system will trigger on all
other nodes, including the master. The master will then be able to
offline the node. Any other work to restore instance availability should
then be done by the autorepair system.</p>
<p>The following cluster tags are supported:</p>
<ul class="simple">
<li><code class="docutils literal"><span class="pre">ocf:node-offline:use-powercycle</span></code>: Try to powercycle a node using
<code class="docutils literal"><span class="pre">gnt-node</span> <span class="pre">powercycle</span></code> when offlining.</li>
<li><code class="docutils literal"><span class="pre">ocf:node-offline:use-poweroff</span></code>: Try to power off a node using
<code class="docutils literal"><span class="pre">gnt-node</span> <span class="pre">power</span> <span class="pre">off</span></code> when offlining (requires OOB support).</li>
</ul>
<div class="section" id="id1">
<h4><a class="toc-backref" href="#id9">Future improvements</a><a class="headerlink" href="#id1" title="Permalink to this headline"></a></h4>
<ul class="simple">
<li>Handle draining differently than offlining</li>
<li>Handle different modes of “stopping” the service</li>
<li>Implement “full” mode</li>
</ul>
</div>
</div>
<div class="section" id="risks">
<h3><a class="toc-backref" href="#id10">Risks</a><a class="headerlink" href="#risks" title="Permalink to this headline"></a></h3>
<p>Running Ganeti with Pacemaker increases the risk of stability for your
Ganeti Cluster. Events like:</p>
<ul class="simple">
<li>stopping heartbeat or corosync on a node</li>
<li>corosync or heartbeat being killed for any reason</li>
<li>temporary failure in a node’s networking</li>
</ul>
<p>will trigger potentially dangerous operations such as node offlining or
master role failover. Moreover if the autorepair system will be working
they will be able to also trigger instance failovers or migrations, and
disk replaces.</p>
<p>Also note that operations like: master-failover, or manual node-modify
might interact badly with this setup depending on the way your HA system
is configured (see below).</p>
<p>This of course is an inherent problem with any Linux-HA installation,
but is probably more visible with Ganeti given that our resources tend
to be more heavyweight than many others managed in HA clusters (eg. an
IP address).</p>
</div>
<div class="section" id="code-status">
<h3><a class="toc-backref" href="#id11">Code status</a><a class="headerlink" href="#code-status" title="Permalink to this headline"></a></h3>
<p>This code is heavily experimental, and Linux-HA is a very complex
subsystem. <em>We might not be able to help you</em> if you decide to run this
code: please make sure you understand fully high availability on your
production machines. Ganeti only ships this code as an example but it
might need customization or complex configurations on your side for it
to run properly.</p>
<p><em>Ganeti does not automate HA configuration for your cluster</em>. You need
to do this job by hand. Good luck, don’t get it wrong.</p>
</div>
</div>
<div class="section" id="future-work">
<h2><a class="toc-backref" href="#id12">Future work</a><a class="headerlink" href="#future-work" title="Permalink to this headline"></a></h2>
<ul class="simple">
<li>Integrate the agents better with the ganeti monitoring</li>
<li>Add hooks for managing HA at node add/remove/modify/master-failover
operations</li>
<li>Provide a stonith system through Ganeti’s OOB system</li>
<li>Provide an OOB system that does “shunning” of offline nodes, for
emulating a real OOB, at least on all nodes</li>
</ul>
</div>
</div>


          </div>
        </div>
      </div>
      <div class="sphinxsidebar" role="navigation" aria-label="main navigation">
        <div class="sphinxsidebarwrapper">
  <h3><a href="index.html">Table Of Contents</a></h3>
  <ul>
<li><a class="reference internal" href="#">Linux HA integration</a><ul>
<li><a class="reference internal" href="#current-state-and-shortcomings">Current state and shortcomings</a></li>
<li><a class="reference internal" href="#proposed-changes">Proposed changes</a><ul>
<li><a class="reference internal" href="#ganeti-ocf-agents">Ganeti OCF agents</a></li>
<li><a class="reference internal" href="#master-role-agent">Master role agent</a><ul>
<li><a class="reference internal" href="#future-improvements">Future improvements</a></li>
</ul>
</li>
<li><a class="reference internal" href="#node-role-agent">Node role agent</a><ul>
<li><a class="reference internal" href="#id1">Future improvements</a></li>
</ul>
</li>
<li><a class="reference internal" href="#risks">Risks</a></li>
<li><a class="reference internal" href="#code-status">Code status</a></li>
</ul>
</li>
<li><a class="reference internal" href="#future-work">Future work</a></li>
</ul>
</li>
</ul>

  <h4>Previous topic</h4>
  <p class="topless"><a href="design-location.html"
                        title="previous chapter">Improving location awareness of Ganeti</a></p>
  <h4>Next topic</h4>
  <p class="topless"><a href="design-lu-generated-jobs.html"
                        title="next chapter">Submitting jobs from logical units</a></p>
  <div role="note" aria-label="source link">
    <h3>This Page</h3>
    <ul class="this-page-menu">
      <li><a href="_sources/design-linuxha.rst.txt"
            rel="nofollow">Show Source</a></li>
    </ul>
   </div>
<div id="searchbox" style="display: none" role="search">
  <h3>Quick search</h3>
    <form class="search" action="search.html" method="get">
      <div><input type="text" name="q" /></div>
      <div><input type="submit" value="Go" /></div>
      <input type="hidden" name="check_keywords" value="yes" />
      <input type="hidden" name="area" value="default" />
    </form>
</div>
<script type="text/javascript">$('#searchbox').show(0);</script>
        </div>
      </div>
      <div class="clearer"></div>
    </div>
    <div class="related" role="navigation" aria-label="related navigation">
      <h3>Navigation</h3>
      <ul>
        <li class="right" style="margin-right: 10px">
          <a href="design-lu-generated-jobs.html" title="Submitting jobs from logical units"
             >next</a></li>
        <li class="right" >
          <a href="design-location.html" title="Improving location awareness of Ganeti"
             >previous</a> |</li>
        <li class="nav-item nav-item-0"><a href="index.html">Ganeti 2.16.0~rc2 documentation</a> &#187;</li> 
      </ul>
    </div>
    <div class="footer" role="contentinfo">
        &#169; Copyright 2018, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015 Google Inc..
      Created using <a href="http://sphinx-doc.org/">Sphinx</a> 1.6.7.
    </div>
  </body>
</html>