<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>7. Typical resource requirements — CheMPS2 1.8.3 (2016-11-15) documentation</title>
<link rel="stylesheet" href="_static/classic.css" type="text/css" />
<link rel="stylesheet" href="_static/pygments.css" type="text/css" />
<script type="text/javascript">
var DOCUMENTATION_OPTIONS = {
URL_ROOT: './',
VERSION: '1.8.3 (2016-11-15)',
COLLAPSE_INDEX: false,
FILE_SUFFIX: '.html',
HAS_SOURCE: true
};
</script>
<script type="text/javascript" src="/usr/share/javascript/jquery/jquery.js"></script>
<script type="text/javascript" src="/usr/share/javascript/underscore/underscore.js"></script>
<script type="text/javascript" src="_static/doctools.js"></script>
<script type="text/javascript" src="/usr/share/javascript/mathjax/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script>
<link rel="index" title="Index" href="genindex.html" />
<link rel="search" title="Search" href="search.html" />
<link rel="top" title="CheMPS2 1.8.3 (2016-11-15) documentation" href="index.html" />
<link rel="next" title="8. DMRG-SCF" href="dmrgscf.html" />
<link rel="prev" title="6. DMRG calculations" href="inoutput.html" />
</head>
<body role="document">
<div class="related" role="navigation" aria-label="related navigation">
<h3>Navigation</h3>
<ul>
<li class="right" style="margin-right: 10px">
<a href="genindex.html" title="General Index"
accesskey="I">index</a></li>
<li class="right" >
<a href="dmrgscf.html" title="8. DMRG-SCF"
accesskey="N">next</a> |</li>
<li class="right" >
<a href="inoutput.html" title="6. DMRG calculations"
accesskey="P">previous</a> |</li>
<li class="nav-item nav-item-0"><a href="index.html">CheMPS2 1.8.3 (2016-11-15) documentation</a> »</li>
</ul>
</div>
<div class="document">
<div class="documentwrapper">
<div class="bodywrapper">
<div class="body" role="main">
<span class="target" id="index-0"></span><span class="target" id="index-1"></span><span class="target" id="index-2"></span><div class="section" id="typical-resource-requirements">
<span id="index-3"></span><h1>7. Typical resource requirements<a class="headerlink" href="#typical-resource-requirements" title="Permalink to this headline">¶</a></h1>
<p>In this section, typical resource requirements for DMRG calculations are discussed. With <span class="math">\(L\)</span> spatial orbitals and <span class="math">\(D\)</span> virtual basis states, the algorithm has a theoretical scaling per sweep of</p>
<ul class="simple">
<li><span class="math">\(\mathcal{O}(L^4D^2 + L^3D^3)\)</span> in CPU time</li>
<li><span class="math">\(\mathcal{O}(L^2D^2)\)</span> in memory</li>
<li><span class="math">\(\mathcal{O}(L^3D^2)\)</span> in disk</li>
</ul>
<p>Note that these scalings do not take into account the block-sparsity and information compression that result from exploiting symmetry.</p>
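<p>To get a feeling for what these scalings imply, the following minimal Python sketch evaluates the three expressions above with all prefactors set to one (an assumption, since the true prefactors are not given here) and compares two values of <span class="math">\(D\)</span>. The function names <code class="docutils literal"><span class="pre">sweep_cost</span></code> and <code class="docutils literal"><span class="pre">cost_ratio</span></code> are illustrative and are not part of CheMPS2. Because symmetry is ignored, the ratios are rough estimates rather than predictions of measured timings.</p>

```python
def sweep_cost(L, D):
    """Theoretical per-sweep scaling terms, with all prefactors assumed to be 1."""
    return {
        "cpu": L**4 * D**2 + L**3 * D**3,
        "memory": L**2 * D**2,
        "disk": L**3 * D**2,
    }


def cost_ratio(L, D_old, D_new):
    """Ratio of each scaling term when D changes from D_old to D_new at fixed L."""
    old = sweep_cost(L, D_old)
    new = sweep_cost(L, D_new)
    return {key: new[key] / old[key] for key in old}


# Example: 28 orbitals, doubling the number of virtual basis states.
ratios = cost_ratio(L=28, D_old=1000, D_new=2000)
for key, value in ratios.items():
    print(f"{key}: x{value:.2f}")
```

<p>For <span class="math">\(L = 28\)</span>, doubling <span class="math">\(D\)</span> from 1000 to 2000 multiplies the memory and disk estimates by 4, while the CPU estimate grows by a factor between 4 and 8 because both the <span class="math">\(D^2\)</span> and <span class="math">\(D^3\)</span> terms contribute.</p>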
<div class="section" id="scaling-with-system-size">
<h2>7.1. Scaling with system size<a class="headerlink" href="#scaling-with-system-size" title="Permalink to this headline">¶</a></h2>
<p>Ref. <a class="reference internal" href="#timing1" id="id1">[TIMING1]</a> reports CPU time measurements for polyenes of increasing length, which demonstrate how CheMPS2 scales with <span class="math">\(L\)</span>. The geometries of all-trans polyenes <span class="math">\(\text{C}_n\text{H}_{n+2}\)</span> were optimized at the B3LYP/6-31G** level of theory for <span class="math">\(n=12\)</span>, 14, 16, 18, 20, 22 and 24. The <span class="math">\(\sigma\)</span>-orbitals were kept frozen at the RHF/6-31G level of theory. The <span class="math">\(\pi\)</span>-orbitals in the 6-31G basis were localized by means of the Edmiston-Ruedenberg localization procedure. The localized <span class="math">\(\pi\)</span>-orbitals belong to the <span class="math">\(\mathsf{A''}\)</span> irreducible representation of the <span class="math">\(\mathsf{C_s}\)</span> point group, and were ordered according to the one-dimensional topology of the polyene. For all polyenes, the average CPU time per DMRG sweep (in seconds) was determined with snapshot <a class="reference external" href="https://github.com/SebWouters/CheMPS2/commit/d520e9e5af1c16621537f2bb51f3ae6f398c3ab8">d520e9e5af1c16621537f2bb51f3ae6f398c3ab8</a> from the <a class="reference external" href="https://github.com/sebwouters/chemps2">CheMPS2 GitHub repository</a> on a single Intel Xeon Sandy Bridge (E5-2670) core at 2.6 GHz. For the two values of <span class="math">\(D\)</span> shown in the figure below, the energies are converged to <span class="math">\(\mu E_h\)</span> accuracy due to the one-dimensional topology of the localized and ordered <span class="math">\(\pi\)</span>-orbitals. Due to the imposed <span class="math">\(\mathsf{SU(2)} \otimes \mathsf{U(1)} \otimes \mathsf{C_s}\)</span> symmetry, all tensors become block-sparse, which causes the observed scaling to be below <span class="math">\(\mathcal{O}(L^4)\)</span>.</p>
<img alt="_images/polyene_scaling.png" src="_images/polyene_scaling.png" />
</div>
<div class="section" id="n2-cc-pvdz">
<h2>7.2. N2/cc-pVDZ<a class="headerlink" href="#n2-cc-pvdz" title="Permalink to this headline">¶</a></h2>
<p>The nitrogen dimer in the cc-pVDZ basis has an active space of 14 electrons in 28 orbitals. The exploited point group in the calculations was <span class="math">\(\mathsf{D_{2h}}\)</span>, and the targeted state was <span class="math">\(\mathsf{X^1\Sigma_g^+}\)</span> at the equilibrium bond length of 2.118 a.u. This system was first studied with DMRG in Ref. <a class="reference internal" href="#nitrogen" id="id2">[NITROGEN]</a>. The listed CheMPS2 timings are wall times per sweep (in seconds) on 16 Intel Xeon Sandy Bridge (E5-2670) cores at 2.6 GHz. The calculation was performed with snapshot <a class="reference external" href="https://github.com/SebWouters/CheMPS2/commit/045393b439821c81d800328c0b4b8b1732da47f8">045393b439821c81d800328c0b4b8b1732da47f8</a> from the <a class="reference external" href="https://github.com/sebwouters/chemps2">CheMPS2 GitHub repository</a>. The orbitals were reordered with <code class="docutils literal"><span class="pre">void</span> <span class="pre">CheMPS2::Problem::SetupReorderD2h()</span></code>. The residual norm tolerance for the Davidson algorithm was set to <span class="math">\(10^{-5}\)</span>. OpenMP parallelization on a single node was used, and the calculation needed approximately 6 GB of memory.</p>
<blockquote>
<div><table border="1" class="docutils">
<colgroup>
<col width="29%" />
<col width="26%" />
<col width="21%" />
<col width="24%" />
</colgroup>
<thead valign="bottom">
<tr class="row-odd"><th class="head"><span class="math">\(D_{\mathsf{SU(2)}}\)</span></th>
<th class="head">Wall time per sweep (s)</th>
<th class="head"><span class="math">\(w_D^{disc}\)</span></th>
<th class="head"><span class="math">\(E_D\)</span> (Hartree)</th>
</tr>
</thead>
<tbody valign="top">
<tr class="row-even"><td>1000</td>
<td>48</td>
<td>9.8027e-07</td>
<td>-109.28209711</td>
</tr>
<tr class="row-odd"><td>1500</td>
<td>113</td>
<td>3.9381e-07</td>
<td>-109.28214593</td>
</tr>
<tr class="row-even"><td>2000</td>
<td>219</td>
<td>1.8910e-07</td>
<td>-109.28216077</td>
</tr>
<tr class="row-odd"><td>2500</td>
<td>371</td>
<td>1.0083e-07</td>
<td>-109.28216667</td>
</tr>
</tbody>
</table>
</div></blockquote>
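<p>As a rough check of how the wall time in the table above grows with <span class="math">\(D_{\mathsf{SU(2)}}\)</span>, the following Python sketch computes a local exponent <span class="math">\(p\)</span> in <span class="math">\(t \propto D^p\)</span> from each pair of consecutive rows. The helper name <code class="docutils literal"><span class="pre">local_exponents</span></code> is illustrative. With only four data points, measured as wall times on 16 cores, the result should be read as indicative only.</p>

```python
import math

# Rows of the N2/cc-pVDZ table above: (D, wall time per sweep in seconds).
ROWS = [(1000, 48), (1500, 113), (2000, 219), (2500, 371)]


def local_exponents(rows):
    """Return the exponent p in t ~ D**p for each pair of consecutive rows."""
    exponents = []
    for (d1, t1), (d2, t2) in zip(rows, rows[1:]):
        exponents.append(math.log(t2 / t1) / math.log(d2 / d1))
    return exponents


exponents = local_exponents(ROWS)
for (d1, _), (d2, _), p in zip(ROWS, ROWS[1:], exponents):
    print(f"D = {d1} to {d2}: exponent {p:.2f}")
```

<p>All three local exponents lie between 2 and 3, which is compatible with a mix of the <span class="math">\(D^2\)</span> and <span class="math">\(D^3\)</span> terms of the theoretical CPU scaling, although block-sparsity and parallel overhead also influence the measured wall times.</p>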
</div>
<div class="section" id="h2o-roos-ano-dz">
<span id="label-water-roos-ano-dz"></span><h2>7.3. H2O/Roos’ ANO DZ<a class="headerlink" href="#h2o-roos-ano-dz" title="Permalink to this headline">¶</a></h2>
<p>Water in Roos’ ANO DZ basis has an active space of 10 electrons in 41 orbitals. The exploited point group in the calculations was <span class="math">\(\mathsf{C_{2v}}\)</span>, and the targeted state was <span class="math">\(\mathsf{^1A_1}\)</span> at the equilibrium geometry, with O at (0, 0, 0) and H at (±0.790689766, 0, 0.612217330) Angstrom. This system was first studied with DMRG in Ref. <a class="reference internal" href="#water" id="id4">[WATER]</a>. The listed CheMPS2 timings are wall times per sweep (in seconds) on 20 Intel Xeon Ivy Bridge (E5-2670 v2) cores at 2.5 GHz. The calculation was performed with snapshot <a class="reference external" href="https://github.com/SebWouters/CheMPS2/commit/045393b439821c81d800328c0b4b8b1732da47f8">045393b439821c81d800328c0b4b8b1732da47f8</a> from the <a class="reference external" href="https://github.com/sebwouters/chemps2">CheMPS2 GitHub repository</a>. The residual norm tolerance for the Davidson algorithm was set to <span class="math">\(10^{-5}\)</span>. OpenMP parallelization on a single node was used, and the calculation needed approximately 64 GB of memory.</p>
<blockquote>
<div><table border="1" class="docutils">
<colgroup>
<col width="29%" />
<col width="26%" />
<col width="21%" />
<col width="24%" />
</colgroup>
<thead valign="bottom">
<tr class="row-odd"><th class="head"><span class="math">\(D_{\mathsf{SU(2)}}\)</span></th>
<th class="head">Wall time per sweep (s)</th>
<th class="head"><span class="math">\(w_D^{disc}\)</span></th>
<th class="head"><span class="math">\(E_D\)</span> (Hartree)</th>
</tr>
</thead>
<tbody valign="top">
<tr class="row-even"><td>1000</td>
<td>401</td>
<td>8.7950e-08</td>
<td>-76.31468302</td>
</tr>
<tr class="row-odd"><td>2000</td>
<td>2111</td>
<td>1.1366e-08</td>
<td>-76.31471044</td>
</tr>
<tr class="row-even"><td>3000</td>
<td>5686</td>
<td>2.9114e-09</td>
<td>-76.31471342</td>
</tr>
<tr class="row-odd"><td>4000</td>
<td>10958</td>
<td>6.8011e-10</td>
<td>-76.31471402</td>
</tr>
</tbody>
</table>
</div></blockquote>
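<p>A common practice in the DMRG literature, which is not described in this section, is to extrapolate the energy linearly in the discarded weight towards <span class="math">\(w_D^{disc} = 0\)</span>. The Python sketch below applies an ordinary least-squares line to the four rows of the table above. It assumes that a linear model is adequate for these points, and the function name <code class="docutils literal"><span class="pre">extrapolate_energy</span></code> is illustrative and not part of CheMPS2.</p>

```python
# Rows of the H2O table above: (discarded weight, energy in Hartree).
ROWS = [
    (8.7950e-08, -76.31468302),
    (1.1366e-08, -76.31471044),
    (2.9114e-09, -76.31471342),
    (6.8011e-10, -76.31471402),
]


def extrapolate_energy(rows):
    """Least-squares line E = E0 + slope * w; returns (E0, slope)."""
    weights = [w for w, _ in rows]
    # Shift energies by the lowest value to limit round-off in the sums.
    reference = min(e for _, e in rows)
    shifted = [e - reference for _, e in rows]
    n = len(rows)
    mean_w = sum(weights) / n
    mean_e = sum(shifted) / n
    sxx = sum((w - mean_w) ** 2 for w in weights)
    sxy = sum((w - mean_w) * (e - mean_e) for w, e in zip(weights, shifted))
    slope = sxy / sxx
    intercept = mean_e - slope * mean_w
    return reference + intercept, slope


e0, slope = extrapolate_energy(ROWS)
print(f"Extrapolated energy at zero discarded weight: {e0:.8f} Hartree")
```

<p>The extrapolated value lies slightly below the lowest energy in the table, as expected for a variational method whose energy decreases with increasing <span class="math">\(D_{\mathsf{SU(2)}}\)</span>. It is an estimate under the stated linear assumption, not a result reported by the CheMPS2 authors.</p>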
</div>
<div class="section" id="hybrid-parallelization">
<h2>7.4. Hybrid parallelization<a class="headerlink" href="#hybrid-parallelization" title="Permalink to this headline">¶</a></h2>
<p>CheMPS2 contains a hybrid MPI and OpenMP parallelization for mixed distributed and shared memory architectures. In Ref. <a class="reference internal" href="#timing2" id="id7">[TIMING2]</a> this hybrid parallelization is illustrated for H2O in Roos’ ANO DZ basis, the system studied <a class="reference internal" href="#label-water-roos-ano-dz"><span class="std std-ref">above</span></a>. The speedups achieved with snapshot <a class="reference external" href="https://github.com/SebWouters/CheMPS2/commit/045393b439821c81d800328c0b4b8b1732da47f8">045393b439821c81d800328c0b4b8b1732da47f8</a> of the <a class="reference external" href="https://github.com/sebwouters/chemps2">CheMPS2 GitHub repository</a> and <a class="reference external" href="https://github.com/sanshar/block/releases/tag/v1.1-alpha">Block version 1.1-alpha</a> are shown in the figures below. All calculations were performed with reduced virtual dimension <span class="math">\(D=1000\)</span>. RHF orbitals were used, ordered per irreducible representation of <span class="math">\(\mathsf{C_{2v}}\)</span> according to their single-particle energy, and the irreducible representations were ordered as <span class="math">\(\{ \mathsf{A1},~\mathsf{A2},~\mathsf{B1},~\text{and}~\mathsf{B2} \}\)</span>. The residual norm tolerance for the Davidson algorithm was set to <span class="math">\(10^{-4}\)</span>. Note that in Block the square of this parameter needs to be passed. Each node has two Intel Xeon Sandy Bridge E5-2670 processors (16 cores in total at 2.6 GHz) and 64 GB of memory. The nodes are connected with FDR InfiniBand. The renormalized operators were stored on GPFS in order to achieve high disk bandwidths. Both codes and all libraries they depend on were compiled with the Intel MPI compiler version 2015.1.133. The Intel Math Kernel Library version 11.2.1.133 was used for BLAS and LAPACK routines.</p>
<img alt="_images/single_node_h2o.png" src="_images/single_node_h2o.png" />
<p>Figure above: Comparison of pure MPI and OpenMP speedups on a single node. Wall times per sweep are indicated for 16 cores (in seconds).</p>
<img alt="_images/multi_node_h2o.png" src="_images/multi_node_h2o.png" />
<p>Figure above: Illustration of the hybrid parallelization of CheMPS2. For 16 cores or fewer, one MPI process with several OpenMP threads is used. For 32 cores or more, several MPI processes, each with 16 OpenMP threads, are used. Wall times per sweep are indicated (in seconds).</p>
<table class="docutils citation" frame="void" id="timing1" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#id1">[TIMING1]</a></td><td>S. Wouters and D. Van Neck, <em>European Physical Journal D</em> <strong>68</strong>, 272 (2014), doi: <a class="reference external" href="http://dx.doi.org/10.1140/epjd/e2014-50500-1">10.1140/epjd/e2014-50500-1</a></td></tr>
</tbody>
</table>
<table class="docutils citation" frame="void" id="timing2" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#id7">[TIMING2]</a></td><td>S. Wouters, V. Van Speybroeck and D. Van Neck, <em>Journal of Chemical Physics</em> <strong>145</strong>, 054120 (2016), doi: <a class="reference external" href="http://dx.doi.org/10.1063/1.4959817">10.1063/1.4959817</a></td></tr>
</tbody>
</table>
<table class="docutils citation" frame="void" id="nitrogen" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#id2">[NITROGEN]</a></td><td>G.K.-L. Chan, M. Kallay and J. Gauss, <em>Journal of Chemical Physics</em> <strong>121</strong>, 6110 (2004), doi: <a class="reference external" href="http://dx.doi.org/10.1063/1.1783212">10.1063/1.1783212</a></td></tr>
</tbody>
</table>
<table class="docutils citation" frame="void" id="water" rules="none">
<colgroup><col class="label" /><col /></colgroup>
<tbody valign="top">
<tr><td class="label"><a class="fn-backref" href="#id4">[WATER]</a></td><td>G.K.-L. Chan and M. Head-Gordon, <em>Journal of Chemical Physics</em> <strong>118</strong>, 8551 (2003), doi: <a class="reference external" href="http://dx.doi.org/10.1063/1.1574318">10.1063/1.1574318</a></td></tr>
</tbody>
</table>
</div>
</div>
</div>
</div>
</div>
<div class="sphinxsidebar" role="navigation" aria-label="main navigation">
<div class="sphinxsidebarwrapper">
<p class="logo"><a href="index.html">
<img class="logo" src="_static/CheMPS2logo.png" alt="Logo"/>
</a></p>
<h3><a href="index.html">Table Of Contents</a></h3>
<ul>
<li><a class="reference internal" href="#">7. Typical resource requirements</a><ul>
<li><a class="reference internal" href="#scaling-with-system-size">7.1. Scaling with system size</a></li>
<li><a class="reference internal" href="#n2-cc-pvdz">7.2. N2/cc-pVDZ</a></li>
<li><a class="reference internal" href="#h2o-roos-ano-dz">7.3. H2O/Roos’ ANO DZ</a></li>
<li><a class="reference internal" href="#hybrid-parallelization">7.4. Hybrid parallelization</a></li>
</ul>
</li>
</ul>
<h4>Previous topic</h4>
<p class="topless"><a href="inoutput.html"
title="previous chapter">6. DMRG calculations</a></p>
<h4>Next topic</h4>
<p class="topless"><a href="dmrgscf.html"
title="next chapter">8. DMRG-SCF</a></p>
<div role="note" aria-label="source link">
<h3>This Page</h3>
<ul class="this-page-menu">
<li><a href="_sources/resources.txt"
rel="nofollow">Show Source</a></li>
</ul>
</div>
<div id="searchbox" style="display: none" role="search">
<h3>Quick search</h3>
<form class="search" action="search.html" method="get">
<div><input type="text" name="q" /></div>
<div><input type="submit" value="Go" /></div>
<input type="hidden" name="check_keywords" value="yes" />
<input type="hidden" name="area" value="default" />
</form>
</div>
<script type="text/javascript">$('#searchbox').show(0);</script>
</div>
</div>
<div class="clearer"></div>
</div>
<div class="related" role="navigation" aria-label="related navigation">
<h3>Navigation</h3>
<ul>
<li class="right" style="margin-right: 10px">
<a href="genindex.html" title="General Index"
>index</a></li>
<li class="right" >
<a href="dmrgscf.html" title="8. DMRG-SCF"
>next</a> |</li>
<li class="right" >
<a href="inoutput.html" title="6. DMRG calculations"
>previous</a> |</li>
<li class="nav-item nav-item-0"><a href="index.html">CheMPS2 1.8.3 (2016-11-15) documentation</a> »</li>
</ul>
</div>
<div class="footer" role="contentinfo">
© Copyright 2013-2016, Sebastian Wouters.
Created using <a href="http://sphinx-doc.org/">Sphinx</a> 1.4.9.
</div>
</body>
</html>