
<html><head><meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"><title>Cedar Backup 2 Software Manual</title><link rel="stylesheet" type="text/css" href="styles.css"><meta name="generator" content="DocBook XSL Stylesheets V1.78.1"></head><body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF"><div class="book"><div class="titlepage"><div><div><h1 class="title"><a name="cedar"></a>Cedar Backup 2 Software Manual</h1></div><div><div class="authorgroup"><div class="author"><h3 class="author"><span class="firstname">Kenneth J.</span> <span class="surname">Pronovici</span></h3></div></div></div><div><p class="copyright">Copyright © 2005-2008,2013-2015 Kenneth J. Pronovici</p></div><div><div class="legalnotice"><a name="idp53898400"></a><p>
            This work is free; you can redistribute it and/or modify it under
            the terms of the GNU General Public License (the "GPL"), Version 2,
            as published by the Free Software Foundation.
         </p><p>
            For the purposes of the GPL, the "preferred form of modification"
            for this work is the original Docbook XML text files.  If you
            choose to distribute this work in a compiled form (i.e. if you
            distribute HTML, PDF or Postscript documents based on the original
            Docbook XML text files), you must also consider image files to be
            "source code" if those images are required in order to construct a
            complete and readable compiled version of the work.
         </p><p>
            This work is distributed in the hope that it will be useful,
            but WITHOUT ANY WARRANTY; without even the implied warranty of
            MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
         </p><p>
            Copies of the GNU General Public License are available from
            the Free Software Foundation website,
            <code class="systemitem">http://www.gnu.org/</code>.
            You may also write to the Free Software Foundation, Inc., 
            51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
         </p></div></div></div><hr></div><div class="toc"><p><b>Table of Contents</b></p><dl class="toc"><dt><span class="preface"><a href="#cedar-preface">Preface</a></span></dt><dd><dl><dt><span class="sect1"><a href="#cedar-preface-purpose">Purpose</a></span></dt><dt><span class="sect1"><a href="#cedar-preface-audience">Audience</a></span></dt><dt><span class="sect1"><a href="#cedar-preface-conventions">Conventions Used in This Book</a></span></dt><dd><dl><dt><span class="sect2"><a href="#cedar-preface-conventions-typo">Typographic Conventions</a></span></dt><dt><span class="sect2"><a href="#cedar-preface-conventions-icons">Icons</a></span></dt></dl></dd><dt><span class="sect1"><a href="#cedar-preface-organization">Organization of This Manual</a></span></dt><dt><span class="sect1"><a href="#cedar-preface-acknowlege">Acknowledgments</a></span></dt></dl></dd><dt><span class="chapter"><a href="#cedar-intro">1. Introduction</a></span></dt><dd><dl><dt><span class="sect1"><a href="#cedar-intro-whatis">What is Cedar Backup?</a></span></dt><dt><span class="sect1"><a href="#cedar-intro-migrating">Migrating from Version 2 to Version 3</a></span></dt><dt><span class="sect1"><a href="#cedar-intro-support">How to Get Support</a></span></dt><dt><span class="sect1"><a href="#cedar-intro-history">History</a></span></dt></dl></dd><dt><span class="chapter"><a href="#cedar-basic">2. 
Basic Concepts</a></span></dt><dd><dl><dt><span class="sect1"><a href="#cedar-basic-general">General Architecture</a></span></dt><dt><span class="sect1"><a href="#cedar-basic-datarecovery">Data Recovery</a></span></dt><dt><span class="sect1"><a href="#cedar-basic-pools">Cedar Backup Pools</a></span></dt><dt><span class="sect1"><a href="#cedar-basic-process">The Backup Process</a></span></dt><dd><dl><dt><span class="sect2"><a href="#cedar-basic-process-collect">The Collect Action</a></span></dt><dt><span class="sect2"><a href="#cedar-basic-process-stage">The Stage Action</a></span></dt><dt><span class="sect2"><a href="#cedar-basic-process-store">The Store Action</a></span></dt><dt><span class="sect2"><a href="#cedar-basic-process-purge">The Purge Action</a></span></dt><dt><span class="sect2"><a href="#cedar-basic-process-all">The All Action</a></span></dt><dt><span class="sect2"><a href="#cedar-basic-process-validate">The Validate Action</a></span></dt><dt><span class="sect2"><a href="#cedar-basic-process-initialize">The Initialize Action</a></span></dt><dt><span class="sect2"><a href="#cedar-basic-process-rebuild">The Rebuild Action</a></span></dt></dl></dd><dt><span class="sect1"><a href="#cedar-basic-coordinate">Coordination between Master and Clients</a></span></dt><dt><span class="sect1"><a href="#cedar-basic-managedbackups">Managed Backups</a></span></dt><dt><span class="sect1"><a href="#cedar-basic-mediadevice">Media and Device Types</a></span></dt><dt><span class="sect1"><a href="#cedar-basic-incremental">Incremental Backups</a></span></dt><dt><span class="sect1"><a href="#cedar-basic-extensions">Extensions</a></span></dt></dl></dd><dt><span class="chapter"><a href="#cedar-install">3. 
Installation</a></span></dt><dd><dl><dt><span class="sect1"><a href="#cedar-install-background">Background</a></span></dt><dt><span class="sect1"><a href="#cedar-install-debian">Installing on a Debian System</a></span></dt><dt><span class="sect1"><a href="#cedar-install-source">Installing from Source</a></span></dt><dd><dl><dt><span class="sect2"><a href="#cedar-install-source-deps">Installing Dependencies</a></span></dt><dt><span class="sect2"><a href="#cedar-install-source-package">Installing the Source Package</a></span></dt></dl></dd></dl></dd><dt><span class="chapter"><a href="#cedar-commandline">4. Command Line Tools</a></span></dt><dd><dl><dt><span class="sect1"><a href="#cedar-commandline-overview">Overview</a></span></dt><dt><span class="sect1"><a href="#cedar-commandline-cback">The <span class="command"><strong>cback</strong></span> command</a></span></dt><dd><dl><dt><span class="sect2"><a href="#cedar-commandline-cback-intro">Introduction</a></span></dt><dt><span class="sect2"><a href="#cedar-commandline-cback-syntax">Syntax</a></span></dt><dt><span class="sect2"><a href="#cedar-commandline-cback-options">Switches</a></span></dt><dt><span class="sect2"><a href="#cedar-commandline-cback-actions">Actions</a></span></dt></dl></dd><dt><span class="sect1"><a href="#cedar-commandline-sync">The <span class="command"><strong>cback-amazons3-sync</strong></span> command</a></span></dt><dd><dl><dt><span class="sect2"><a href="#cedar-commandline-sync-intro">Introduction</a></span></dt><dt><span class="sect2"><a href="#cedar-commandline-sync-syntax">Syntax</a></span></dt><dt><span class="sect2"><a href="#cedar-commandline-sync-options">Switches</a></span></dt></dl></dd><dt><span class="sect1"><a href="#cedar-commandline-cbackspan">The <span class="command"><strong>cback-span</strong></span> command</a></span></dt><dd><dl><dt><span class="sect2"><a href="#cedar-commandline-cbackspan-intro">Introduction</a></span></dt><dt><span class="sect2"><a 
href="#cedar-commandline-cbackspan-syntax">Syntax</a></span></dt><dt><span class="sect2"><a href="#cedar-commandline-cbackspan-options">Switches</a></span></dt><dt><span class="sect2"><a href="#cedar-commandline-cbackspan-using">Using <span class="command"><strong>cback-span</strong></span></a></span></dt><dt><span class="sect2"><a href="#cedar-commandline-cbackspan-sample">Sample run</a></span></dt></dl></dd></dl></dd><dt><span class="chapter"><a href="#cedar-config">5. Configuration</a></span></dt><dd><dl><dt><span class="sect1"><a href="#cedar-config-overview">Overview</a></span></dt><dt><span class="sect1"><a href="#cedar-config-configfile">Configuration File Format</a></span></dt><dd><dl><dt><span class="sect2"><a href="#cedar-config-configfile-sample">Sample Configuration File</a></span></dt><dt><span class="sect2"><a href="#cedar-config-configfile-reference">Reference Configuration</a></span></dt><dt><span class="sect2"><a href="#cedar-config-configfile-options">Options Configuration</a></span></dt><dt><span class="sect2"><a href="#cedar-config-configfile-peers">Peers Configuration</a></span></dt><dt><span class="sect2"><a href="#cedar-config-configfile-collect">Collect Configuration</a></span></dt><dt><span class="sect2"><a href="#cedar-config-configfile-stage">Stage Configuration</a></span></dt><dt><span class="sect2"><a href="#cedar-config-configfile-store">Store Configuration</a></span></dt><dt><span class="sect2"><a href="#cedar-config-configfile-purge">Purge Configuration</a></span></dt><dt><span class="sect2"><a href="#cedar-config-configfile-extensions">Extensions Configuration</a></span></dt></dl></dd><dt><span class="sect1"><a href="#cedar-config-poolofone">Setting up a Pool of One</a></span></dt><dd><dl><dt><span class="sect2"><a href="#idp60200128">Step 1: Decide when you will run your backup.</a></span></dt><dt><span class="sect2"><a href="#idp60205360">Step 2: Make sure email works.</a></span></dt><dt><span class="sect2"><a 
href="#idp60208656">Step 3: Configure your writer device.</a></span></dt><dt><span class="sect2"><a href="#idp60216656">Step 4: Configure your backup user.</a></span></dt><dt><span class="sect2"><a href="#idp60221824">Step 5: Create your backup tree.</a></span></dt><dt><span class="sect2"><a href="#idp60231984">Step 6: Create the Cedar Backup configuration file.</a></span></dt><dt><span class="sect2"><a href="#idp60238336">Step 7: Validate the Cedar Backup configuration file.</a></span></dt><dt><span class="sect2"><a href="#idp60242016">Step 8: Test your backup.</a></span></dt><dt><span class="sect2"><a href="#idp60248384">Step 9: Modify the backup cron jobs.</a></span></dt></dl></dd><dt><span class="sect1"><a href="#cedar-config-client">Setting up a Client Peer Node</a></span></dt><dd><dl><dt><span class="sect2"><a href="#idp60264304">Step 1: Decide when you will run your backup.</a></span></dt><dt><span class="sect2"><a href="#idp60269536">Step 2: Make sure email works.</a></span></dt><dt><span class="sect2"><a href="#idp59575344">Step 3: Configure the master in your backup pool.</a></span></dt><dt><span class="sect2"><a href="#idp59579776">Step 4: Configure your backup user.</a></span></dt><dt><span class="sect2"><a href="#idp60301904">Step 5: Create your backup tree.</a></span></dt><dt><span class="sect2"><a href="#idp60310656">Step 6: Create the Cedar Backup configuration file.</a></span></dt><dt><span class="sect2"><a href="#idp60317008">Step 7: Validate the Cedar Backup configuration file.</a></span></dt><dt><span class="sect2"><a href="#idp60321184">Step 8: Test your backup.</a></span></dt><dt><span class="sect2"><a href="#idp60324128">Step 9: Modify the backup cron jobs.</a></span></dt></dl></dd><dt><span class="sect1"><a href="#cedar-config-master">Setting up a Master Peer Node</a></span></dt><dd><dl><dt><span class="sect2"><a href="#idp60341344">Step 1: Decide when you will run your backup.</a></span></dt><dt><span class="sect2"><a 
href="#idp60347152">Step 2: Make sure email works.</a></span></dt><dt><span class="sect2"><a href="#idp60350016">Step 3: Configure your writer device.</a></span></dt><dt><span class="sect2"><a href="#idp60358016">Step 4: Configure your backup user.</a></span></dt><dt><span class="sect2"><a href="#idp60372064">Step 5: Create your backup tree.</a></span></dt><dt><span class="sect2"><a href="#idp60381504">Step 6: Create the Cedar Backup configuration file.</a></span></dt><dt><span class="sect2"><a href="#idp60390640">Step 7: Validate the Cedar Backup configuration file.</a></span></dt><dt><span class="sect2"><a href="#idp60394880">Step 8: Test connectivity to client machines.</a></span></dt><dt><span class="sect2"><a href="#idp60399696">Step 9: Test your backup.</a></span></dt><dt><span class="sect2"><a href="#idp60407776">Step 10: Modify the backup cron jobs.</a></span></dt></dl></dd><dt><span class="sect1"><a href="#cedar-config-writer">Configuring your Writer Device</a></span></dt><dd><dl><dt><span class="sect2"><a href="#idp60419328">Device Types</a></span></dt><dt><span class="sect2"><a href="#idp60421696">Devices identified by device name</a></span></dt><dt><span class="sect2"><a href="#idp60422768">Devices identified by SCSI id</a></span></dt><dt><span class="sect2"><a href="#idp60431648">Linux Notes</a></span></dt><dt><span class="sect2"><a href="#idp60437936">Finding your Linux CD Writer</a></span></dt><dt><span class="sect2"><a href="#idp60448992">Mac OS X Notes</a></span></dt></dl></dd><dt><span class="sect1"><a href="#cedar-config-blanking">Optimized Blanking Strategy</a></span></dt></dl></dd><dt><span class="chapter"><a href="#cedar-extensions">6. 
Official Extensions</a></span></dt><dd><dl><dt><span class="sect1"><a href="#cedar-extensions-sysinfo">System Information Extension</a></span></dt><dt><span class="sect1"><a href="#cedar-extensions-amazons3">Amazon S3 Extension</a></span></dt><dt><span class="sect1"><a href="#cedar-extensions-subversion">Subversion Extension</a></span></dt><dt><span class="sect1"><a href="#cedar-extensions-mysql">MySQL Extension</a></span></dt><dt><span class="sect1"><a href="#cedar-extensions-postgresql">PostgreSQL Extension</a></span></dt><dt><span class="sect1"><a href="#cedar-extensions-mbox">Mbox Extension</a></span></dt><dt><span class="sect1"><a href="#cedar-extensions-encrypt">Encrypt Extension</a></span></dt><dt><span class="sect1"><a href="#cedar-extensions-split">Split Extension</a></span></dt><dt><span class="sect1"><a href="#cedar-extensions-capacity">Capacity Extension</a></span></dt></dl></dd><dt><span class="appendix"><a href="#cedar-extenspec">A. Extension Architecture Interface</a></span></dt><dt><span class="appendix"><a href="#cedar-depends">B. Dependencies</a></span></dt><dt><span class="appendix"><a href="#cedar-recovering">C. 
Data Recovery</a></span></dt><dd><dl><dt><span class="sect1"><a href="#cedar-recovering-finding">Finding your Data</a></span></dt><dt><span class="sect1"><a href="#cedar-recovering-filesystem">Recovering Filesystem Data</a></span></dt><dd><dl><dt><span class="sect2"><a href="#cedar-recovering-filesystem-full">Full Restore</a></span></dt><dt><span class="sect2"><a href="#cedar-recovering-filesystem-partial">Partial Restore</a></span></dt></dl></dd><dt><span class="sect1"><a href="#cedar-recovering-mysql">Recovering MySQL Data</a></span></dt><dt><span class="sect1"><a href="#cedar-recovering-subversion">Recovering Subversion Data</a></span></dt><dt><span class="sect1"><a href="#cedar-recovering-mbox">Recovering Mailbox Data</a></span></dt><dt><span class="sect1"><a href="#cedar-recovering-split">Recovering Data split by the Split Extension</a></span></dt></dl></dd><dt><span class="appendix"><a href="#cedar-securingssh">D. Securing Password-less SSH Connections</a></span></dt><dt><span class="appendix"><a href="#cedar-copyright">E. 
Copyright</a></span></dt></dl></div><div class="preface"><div class="titlepage"><div><div><h1 class="title"><a name="cedar-preface"></a>Preface</h1></div></div></div><div class="toc"><p><b>Table of Contents</b></p><dl class="toc"><dt><span class="sect1"><a href="#cedar-preface-purpose">Purpose</a></span></dt><dt><span class="sect1"><a href="#cedar-preface-audience">Audience</a></span></dt><dt><span class="sect1"><a href="#cedar-preface-conventions">Conventions Used in This Book</a></span></dt><dd><dl><dt><span class="sect2"><a href="#cedar-preface-conventions-typo">Typographic Conventions</a></span></dt><dt><span class="sect2"><a href="#cedar-preface-conventions-icons">Icons</a></span></dt></dl></dd><dt><span class="sect1"><a href="#cedar-preface-organization">Organization of This Manual</a></span></dt><dt><span class="sect1"><a href="#cedar-preface-acknowlege">Acknowledgments</a></span></dt></dl></div><div class="sect1"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="cedar-preface-purpose"></a>Purpose</h2></div></div></div><p>
         This software manual has been written to document version 2 of
         Cedar Backup, originally released in early 2005.
      </p></div><div class="sect1"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="cedar-preface-audience"></a>Audience</h2></div></div></div><p>
         This manual has been written for computer-literate administrators who
         need to use and configure Cedar Backup on their Linux or UNIX-like
         system.  The examples in this manual assume the reader is relatively
         comfortable with UNIX and command-line interfaces.
      </p></div><div class="sect1"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="cedar-preface-conventions"></a>Conventions Used in This Book</h2></div></div></div><p>
         This section covers the various conventions used in this manual.
      </p><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="cedar-preface-conventions-typo"></a>Typographic Conventions</h3></div></div></div><div class="variablelist"><dl class="variablelist"><dt><span class="term"><code class="filename">Term</code></span></dt><dd><p>Used for the first use of important terms.</p></dd><dt><span class="term"><span class="command"><strong>Command</strong></span></span></dt><dd><p>Used for commands, command output, and switches.</p></dd><dt><span class="term"><em class="replaceable"><code>Replaceable</code></em></span></dt><dd><p>Used for replaceable items in code and text.</p></dd><dt><span class="term"><code class="filename">Filenames</code></span></dt><dd><p>Used for file and directory names.</p></dd></dl></div></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="cedar-preface-conventions-icons"></a>Icons</h3></div></div></div><div class="note" style="margin-left: 0.5in; margin-right: 0.5in;"><h3 class="title">Note</h3><p>
               This icon designates a note relating to the surrounding text.
            </p></div><div class="tip" style="margin-left: 0.5in; margin-right: 0.5in;"><h3 class="title">Tip</h3><p>
               This icon designates a helpful tip relating to the surrounding text.
            </p></div><div class="warning" style="margin-left: 0.5in; margin-right: 0.5in;"><h3 class="title">Warning</h3><p>
               This icon designates a warning relating to the surrounding text.
            </p></div></div></div><div class="sect1"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="cedar-preface-organization"></a>Organization of This Manual</h2></div></div></div><div class="variablelist"><dl class="variablelist"><dt><span class="term"><a class="xref" href="#cedar-intro" title="Chapter 1. Introduction">Chapter 1, <i>Introduction</i></a></span></dt><dd><p>
                  Provides some general history about Cedar Backup, what
                  needs it is intended to meet, how to get support, and how to
                  migrate from version 2 to version 3.
               </p></dd><dt><span class="term"><a class="xref" href="#cedar-basic" title="Chapter 2. Basic Concepts">Chapter 2, <i>Basic Concepts</i></a></span></dt><dd><p>
                  Discusses the basic concepts of a Cedar Backup infrastructure,
                  and specifies terms used throughout the rest of the manual.
               </p></dd><dt><span class="term"><a class="xref" href="#cedar-install" title="Chapter 3. Installation">Chapter 3, <i>Installation</i></a></span></dt><dd><p>
                  Explains how to install the Cedar Backup package either from
                  the Python source distribution or from the Debian package.
               </p></dd><dt><span class="term"><a class="xref" href="#cedar-commandline" title="Chapter 4. Command Line Tools">Chapter 4, <i>Command Line Tools</i></a></span></dt><dd><p>
                  Discusses the various Cedar Backup command-line tools,
                  including the primary <span class="command"><strong>cback</strong></span> command.
               </p></dd><dt><span class="term"><a class="xref" href="#cedar-config" title="Chapter 5. Configuration">Chapter 5, <i>Configuration</i></a></span></dt><dd><p>
                  Provides detailed information about how to configure Cedar
                  Backup.
               </p></dd><dt><span class="term"><a class="xref" href="#cedar-extensions" title="Chapter 6. Official Extensions">Chapter 6, <i>Official Extensions</i></a></span></dt><dd><p>
                  Describes each of the officially-supported Cedar Backup
                  extensions.
               </p></dd><dt><span class="term"><a class="xref" href="#cedar-extenspec" title="Appendix A. Extension Architecture Interface">Appendix A, <i>Extension Architecture Interface</i></a></span></dt><dd><p>
                  Specifies the Cedar Backup extension architecture interface,
                  through which third party developers can write extensions to
                  Cedar Backup.
               </p></dd><dt><span class="term"><a class="xref" href="#cedar-depends" title="Appendix B. Dependencies">Appendix B, <i>Dependencies</i></a></span></dt><dd><p>
                  Provides some additional information about the packages which
                  Cedar Backup relies on, including information about how to
                  find documentation and packages on non-Debian systems.
               </p></dd><dt><span class="term"><a class="xref" href="#cedar-recovering" title="Appendix C. Data Recovery">Appendix C, <i>Data Recovery</i></a></span></dt><dd><p>
                  Cedar Backup provides no facility for restoring backups,
                  assuming the administrator can handle this infrequent task.
                  This appendix provides some notes for administrators to work
                  from.
               </p></dd><dt><span class="term"><a class="xref" href="#cedar-securingssh" title="Appendix D. Securing Password-less SSH Connections">Appendix D, <i>Securing Password-less SSH Connections</i></a></span></dt><dd><p>
                  Password-less SSH connections are a necessary evil when
                  remote backup processes need to execute without human
                  interaction.  This appendix describes some ways that you can
                  reduce the risk to your backup pool should your master
                  machine be compromised.
               </p></dd></dl></div></div><div class="sect1"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="cedar-preface-acknowlege"></a>Acknowledgments</h2></div></div></div><p>
         The structure of this manual and some of the basic boilerplate have been
         taken from the book <a class="ulink" href="http://svnbook.red-bean.com/" target="_top">Version
         Control with Subversion</a>.  Thanks to the authors (and
         O'Reilly) for making this excellent reference available under a free
         and open license.
      </p></div></div><div class="chapter"><div class="titlepage"><div><div><h1 class="title"><a name="cedar-intro"></a>Chapter 1. Introduction</h1></div></div></div><div class="toc"><p><b>Table of Contents</b></p><dl class="toc"><dt><span class="sect1"><a href="#cedar-intro-whatis">What is Cedar Backup?</a></span></dt><dt><span class="sect1"><a href="#cedar-intro-migrating">Migrating from Version 2 to Version 3</a></span></dt><dt><span class="sect1"><a href="#cedar-intro-support">How to Get Support</a></span></dt><dt><span class="sect1"><a href="#cedar-intro-history">History</a></span></dt></dl></div><div class="simplesect"><div class="titlepage"></div><div class="blockquote"><blockquote class="blockquote"><p>
            <span class="quote">&#8220;<span class="quote">Only wimps use tape backup: real men just upload their
            important stuff on ftp, and let the rest of the world mirror
            it.</span>&#8221;</span>&#8212; Linus Torvalds, at the release of Linux 
            2.0.8 in July of 1996.
          </p></blockquote></div></div><div class="sect1"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="cedar-intro-whatis"></a>What is Cedar Backup?</h2></div></div></div><p>
         Cedar Backup is a software package designed to manage system
         backups for a pool of local and remote machines.  Cedar Backup
         understands how to back up filesystem data as well as MySQL and
         PostgreSQL databases and Subversion repositories.  It can also be
         easily extended to support other kinds of data sources.
      </p><p>
         Cedar Backup is built around weekly backups to a single CD or DVD
         disc, with the expectation that the disc will be changed or
         overwritten at the beginning of each week.  If your hardware is new
         enough (and almost all hardware is today), Cedar Backup can write
         multisession discs, allowing you to add incremental data to a disc on
         a daily basis.
      </p><p>
         Alternatively, Cedar Backup can write your backups to the Amazon S3 cloud
         rather than relying on physical media.
      </p><p>
         Besides offering command-line utilities to manage the backup process,
         Cedar Backup provides a well-organized library of backup-related
         functionality, written in the Python 2 programming language.
      </p><p>
         There are many different backup software implementations out there in
         the open source world. Cedar Backup aims to fill a niche: it is meant
         to be a good fit for people who need to back up a limited amount of
         important data on a regular basis. Cedar Backup isn't for you if you
         want to back up your huge MP3 collection every night, or if you want
         to back up a few hundred machines. However, if you administer a small
         set of machines and you want to run daily incremental backups for
         things like system configuration, current email, small web sites,
         Subversion or Mercurial repositories, or small MySQL databases, then
         Cedar Backup is probably worth your time.
      </p><p>
         Cedar Backup has been developed on a Debian GNU/Linux system and is
         primarily supported on Debian and other Linux systems.  However, since
         it is written in portable Python 2, it should run without problems on
         just about any UNIX-like operating system.  In particular, full Cedar
         Backup functionality is known to work on Debian and SuSE Linux
         systems, and client functionality is also known to work on FreeBSD and
         Mac OS X systems.
      </p><p>
         To run a Cedar Backup client, you really just need a working Python 2
         installation.  To run a Cedar Backup master, you will also need a set
         of other executables, most of which are related to building and
         writing CD/DVD images or talking to the Amazon S3 infrastructure.  A
         full list of dependencies is provided in 
         <a class="xref" href="#cedar-install-source-deps" title="Installing Dependencies">the section called &#8220;Installing Dependencies&#8221;</a>.
      </p></div><div class="sect1"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="cedar-intro-migrating"></a>Migrating from Version 2 to Version 3</h2></div></div></div><p>
         The main difference between Cedar Backup version 2 and Cedar Backup
         version 3 is the targeted Python interpreter.  Cedar Backup version 2
         was designed for Python 2, while version 3 is a conversion of the
         original code to Python 3.  Other than that, both versions are
         functionally equivalent.  The configuration format is unchanged, and
         you can mix-and-match masters and clients of different versions in the
         same backup pool.  Both versions will be fully supported until around
         the time of the Python 2 end-of-life in 2020, but you should plan to
         migrate sooner than that if possible.
      </p><p>
         A major design goal for version 3 was to facilitate easy migration
         testing for users, by making it possible to install version 3 on the
         same server where version 2 was already in use.  A side effect of this
         design choice is that all of the executables, configuration files, and
         logs changed names in version 3.  Where version 2 used
         "cback", version 3 uses "cback3":
         <code class="filename">cback3.conf</code> instead of
         <code class="filename">cback.conf</code>, <code class="filename">cback3.log</code> instead
         of <code class="filename">cback.log</code>, etc.
      </p><p>
         So, while migrating from version 2 to version 3 is relatively
         straightforward, you will have to make some changes manually.  You
         will need to create a new configuration file (or soft link to the old
         one), modify your cron jobs to use the new executable name, etc.  You
         can migrate one server at a time in your pool with no ill effects, or
         even incrementally migrate a single server by using version 2 and
         version 3 on different days of the week or for different parts of the
         backup.
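      </p><p>
         As a rough sketch of those manual steps, assuming the version 2
         configuration lives at <code class="filename">/etc/cback.conf</code>
         and that your backup runs from a system crontab (both are
         assumptions, so adjust the paths and schedule to your own system),
         you might link the old configuration to the new name:
      </p><pre class="screen">
# Assumed location of the version 2 configuration; adjust as needed
ln -s /etc/cback.conf /etc/cback3.conf
</pre><p>
         and then change the executable name in each cron entry.  The
         schedule below is purely illustrative:
      </p><pre class="programlisting">
# Hypothetical entry before migration (version 2)
30 00 * * * root  cback all
# The same hypothetical entry after migration (version 3)
30 00 * * * root  cback3 all
</pre><p>
         Because the configuration format is unchanged, the linked file
         should be usable by both versions while you test.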
      </p></div><div class="sect1"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="cedar-intro-support"></a>How to Get Support</h2></div></div></div><p>
         Cedar Backup is open source software that is provided to you at no
         cost.  It is provided with no warranty, not even for MERCHANTABILITY
         or FITNESS FOR A PARTICULAR PURPOSE.  That said, someone can
         usually help you solve whatever problems you might see.
      </p><p>
         If you experience a problem, your best bet is to file an issue in
         the issue tracker at BitBucket.
         <a href="#ftn.idp58733040" class="footnote" name="idp58733040"><sup class="footnote">[1]</sup></a>
         When the source code was hosted at SourceForge, there was a mailing
         list.  However, it was very lightly used in the last years before I
         abandoned SourceForge, and I have decided not to replace it.
      </p><p>
         If you are not comfortable discussing your problem in public or
         listing it in a public database, or if you need to send along
         information that you do not want made public, then you can write
         <code class="email">&lt;<a class="email" href="mailto:support@cedar-solutions.com">support@cedar-solutions.com</a>&gt;</code>.  That mail will go
         directly to me.  If you write the support address about a bug, a
         <span class="quote">&#8220;<span class="quote">scrubbed</span>&#8221;</span> bug report will eventually end up in the
         public bug database anyway, so if at all possible you should use the
         public reporting mechanisms.  One of the strengths of the open-source
         software development model is its transparency.
      </p><p>
         Regardless of how you report your problem, please try to provide as
         much information as possible about the behavior you observed and the
         environment in which the problem behavior occurred.
         <a href="#ftn.idp58736704" class="footnote" name="idp58736704"><sup class="footnote">[2]</sup></a> 
      </p><p>
         In particular, you should provide: the version of Cedar Backup that you
         are using; how you installed Cedar Backup (e.g. Debian package or
         source package); the exact command line that you executed; any
         error messages you received, including Python stack traces (if any);
         and relevant sections of the Cedar Backup log.  It would be even
         better if you could describe exactly how to reproduce the problem, for
         instance by including your entire configuration file and/or specific
         information about your system that might relate to the problem.
         However, please do <span class="emphasis"><em>not</em></span> provide huge sections of
         debugging logs unless you are sure they are relevant or unless someone
         asks for them.
      </p><div class="tip" style="margin-left: 0.5in; margin-right: 0.5in;"><h3 class="title">Tip</h3><p>
            Sometimes, the error that Cedar Backup displays can be rather
            cryptic.  This is because under internal error conditions, the text
            related to an exception might get propagated all of the way up to
            the user interface.  If the message you receive doesn't make much
            sense, or if you suspect that it results from an internal error,
            you might want to re-run Cedar Backup with the
            <code class="option">--stack</code> option.  This forces Cedar Backup to dump
            the entire Python stack trace associated with the error, rather
            than just printing the last message it received.  This is good
            information to include along with a bug report, as well.
         </p></div></div><div class="sect1"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="cedar-intro-history"></a>History</h2></div></div></div><p>
         Cedar Backup began life in late 2000 as a set of Perl scripts called
         <span class="application">kbackup</span>.   These scripts met an immediate
         need (which was to back up skyjammer.com and some personal machines)
         but proved to be unstable, overly verbose and rather difficult to
         maintain.
      </p><p>
         In early 2002, work began on a rewrite of
         <span class="application">kbackup</span>.  The goal was to address many of
         the shortcomings of the original application, as well as to clean up
         the code and make it available to the general public.  While doing
         research related to code I could borrow or base the rewrite on, I
         discovered that there was already an existing backup package with the
         name <span class="application">kbackup</span>, so I decided to change the
         name to <span class="application">Cedar Backup</span> instead.
      </p><p>
         Because I had become fed up with the prospect of maintaining a large
         volume of Perl code, I decided to abandon that language in favor of
         Python. <a href="#ftn.idp58675024" class="footnote" name="idp58675024"><sup class="footnote">[3]</sup></a> At the time, I chose Python mostly because I was
         interested in learning it, but in retrospect it turned out to be a
         very good decision.  From my perspective, Python has almost all of the
         strengths of Perl, but few of its inherent weaknesses.  Above all,
         I find that Python code often ends up being much more readable than
         Perl code.
      </p><p>
         Around this same time, skyjammer.com and cedar-solutions.com were
         converted to run Debian GNU/Linux (potato) 
         <a href="#ftn.idp58677072" class="footnote" name="idp58677072"><sup class="footnote">[4]</sup></a>
         and I entered the Debian new maintainer queue, so I also made it a
         goal to implement Debian packages along with a Python source
         distribution for the new release.
      </p><p>
         Version 1.0 of <span class="application">Cedar Backup</span> was released in
         June of 2002.  We immediately began using it to back up skyjammer.com
         and cedar-solutions.com, where it proved to be much more stable than
         the original code.  
      </p><p>
         In the meantime, I continued to improve as a Python programmer and
         also started doing a significant amount of professional development in
         Java.  It soon became obvious that the internal structure of
         <span class="application">Cedar Backup</span> 1.0, while much better than
         <span class="application">kbackup</span>, still left something to be
         desired.  In November 2003, I began an attempt at cleaning up the
         codebase.  I converted all of the internal documentation to use
         Epydoc, <a href="#ftn.idp58762048" class="footnote" name="idp58762048"><sup class="footnote">[5]</sup></a>
         and updated the code to use the newly-released Python logging package
         <a href="#ftn.idp58763168" class="footnote" name="idp58763168"><sup class="footnote">[6]</sup></a> after having a good experience with Java's log4j.
         However, I was still not satisfied with the code, which did not lend
         itself to the automated regression testing I had used when working
         with junit in my Java code.
      </p><p>
         So, rather than releasing the cleaned-up code, I instead began another
         ground-up rewrite in May 2004.  With this rewrite, I applied
         everything I had learned from other Java and Python projects I had
         undertaken over the last few years.  I structured the code to take
         advantage of Python's unique ability to blend procedural code with
         object-oriented code, and I made automated unit testing a primary
         requirement.  The result was the 2.0 release, which is cleaner, more
         compact, better focused, and better documented than any release before
         it.  Utility code is less application-specific, and is now usable as a
         general-purpose library.  The 2.0 release also includes a complete
         regression test suite of over 3000 tests, which will help to ensure
         that quality is maintained as development continues into the future.
         <a href="#ftn.idp58765776" class="footnote" name="idp58765776"><sup class="footnote">[7]</sup></a> 
      </p><p>
         The 3.0 release of Cedar Backup is a Python 3 conversion of the 2.0
         release, with minimal additional functionality.  The conversion from
         Python 2 to Python 3 started in mid-2015, about 5 years before the
         anticipated deprecation of Python 2 in 2020.  Most users should
         consider transitioning to the 3.0 release.
      </p></div><div class="footnotes"><br><hr style="width:100; text-align:left;margin-left: 0"><div id="ftn.idp58733040" class="footnote"><p><a href="#idp58733040" class="para"><sup class="para">[1] </sup></a>See <a class="ulink" href="https://bitbucket.org/cedarsolutions/cedar-backup2/issues" target="_top">https://bitbucket.org/cedarsolutions/cedar-backup2/issues</a>.</p></div><div id="ftn.idp58736704" class="footnote"><p><a href="#idp58736704" class="para"><sup class="para">[2] </sup></a>See Simon Tatham's excellent bug reporting tutorial:
         <a class="ulink" href="http://www.chiark.greenend.org.uk/~sgtatham/bugs.html" target="_top">http://www.chiark.greenend.org.uk/~sgtatham/bugs.html</a>
         .</p></div><div id="ftn.idp58675024" class="footnote"><p><a href="#idp58675024" class="para"><sup class="para">[3] </sup></a>See <a class="ulink" href="http://www.python.org/" target="_top">http://www.python.org/</a>
         .</p></div><div id="ftn.idp58677072" class="footnote"><p><a href="#idp58677072" class="para"><sup class="para">[4] </sup></a>Debian's stable releases are named after characters
         in the Toy Story movie.</p></div><div id="ftn.idp58762048" class="footnote"><p><a href="#idp58762048" class="para"><sup class="para">[5] </sup></a>Epydoc is a Python code documentation tool.
         See <a class="ulink" href="http://epydoc.sourceforge.net/" target="_top">http://epydoc.sourceforge.net/</a>.</p></div><div id="ftn.idp58763168" class="footnote"><p><a href="#idp58763168" class="para"><sup class="para">[6] </sup></a>See <a class="ulink" href="http://docs.python.org/lib/module-logging.html" target="_top">http://docs.python.org/lib/module-logging.html</a>
         .</p></div><div id="ftn.idp58765776" class="footnote"><p><a href="#idp58765776" class="para"><sup class="para">[7] </sup></a>Tests are implemented using Python's unit test
         framework.  See <a class="ulink" href="http://docs.python.org/lib/module-unittest.html" target="_top">http://docs.python.org/lib/module-unittest.html</a>.</p></div></div></div><div class="chapter"><div class="titlepage"><div><div><h1 class="title"><a name="cedar-basic"></a>Chapter 2. Basic Concepts</h1></div></div></div><div class="toc"><p><b>Table of Contents</b></p><dl class="toc"><dt><span class="sect1"><a href="#cedar-basic-general">General Architecture</a></span></dt><dt><span class="sect1"><a href="#cedar-basic-datarecovery">Data Recovery</a></span></dt><dt><span class="sect1"><a href="#cedar-basic-pools">Cedar Backup Pools</a></span></dt><dt><span class="sect1"><a href="#cedar-basic-process">The Backup Process</a></span></dt><dd><dl><dt><span class="sect2"><a href="#cedar-basic-process-collect">The Collect Action</a></span></dt><dt><span class="sect2"><a href="#cedar-basic-process-stage">The Stage Action</a></span></dt><dt><span class="sect2"><a href="#cedar-basic-process-store">The Store Action</a></span></dt><dt><span class="sect2"><a href="#cedar-basic-process-purge">The Purge Action</a></span></dt><dt><span class="sect2"><a href="#cedar-basic-process-all">The All Action</a></span></dt><dt><span class="sect2"><a href="#cedar-basic-process-validate">The Validate Action</a></span></dt><dt><span class="sect2"><a href="#cedar-basic-process-initialize">The Initialize Action</a></span></dt><dt><span class="sect2"><a href="#cedar-basic-process-rebuild">The Rebuild Action</a></span></dt></dl></dd><dt><span class="sect1"><a href="#cedar-basic-coordinate">Coordination between Master and Clients</a></span></dt><dt><span class="sect1"><a href="#cedar-basic-managedbackups">Managed Backups</a></span></dt><dt><span class="sect1"><a href="#cedar-basic-mediadevice">Media and Device Types</a></span></dt><dt><span class="sect1"><a href="#cedar-basic-incremental">Incremental Backups</a></span></dt><dt><span class="sect1"><a 
href="#cedar-basic-extensions">Extensions</a></span></dt></dl></div><div class="sect1"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="cedar-basic-general"></a>General Architecture</h2></div></div></div><p>
         Cedar Backup is architected as a Python package (library) and a single
         executable (a Python script).  The Python package provides both
         application-specific code and general utilities that can be used by
         programs other than Cedar Backup.  It also includes modules that can
         be used by third parties to extend Cedar Backup or provide related
         functionality.
      </p><p>
         The <span class="command"><strong>cback</strong></span> script is designed to run as root, since
         otherwise it's difficult to back up system directories or write to the
         CD/DVD device.  However, pains are taken to use the backup user's
         effective user id (specified in configuration) when appropriate.
         Note: this does not mean that <span class="command"><strong>cback</strong></span> runs
         <em class="firstterm">setuid</em><a href="#ftn.idp58818672" class="footnote" name="idp58818672"><sup class="footnote">[8]</sup></a> or
         <em class="firstterm">setgid</em>.  However, all files on disk will be
         owned by the backup user, and all rsh-based network connections
         will take place as the backup user.
      </p><p>
         The <span class="command"><strong>cback</strong></span> script is configured via command-line
         options and an XML configuration file on disk.  The configuration file
         is normally stored in <code class="filename">/etc/cback.conf</code>, but this
         path can be overridden at runtime.   See <a class="xref" href="#cedar-config" title="Chapter 5. Configuration">Chapter 5, <i>Configuration</i></a>
         for more information on how Cedar Backup is configured.
      </p><div class="warning" style="margin-left: 0.5in; margin-right: 0.5in;"><h3 class="title">Warning</h3><p>
            You should be aware that backups to CD/DVD media can probably be read
            by any user who has permission to mount the CD/DVD writer.  If you
            intend to leave the backup disc in the drive at all times, you may
            want to consider this when setting up device permissions on your
            machine.  See also <a class="xref" href="#cedar-extensions-encrypt" title="Encrypt Extension">the section called &#8220;Encrypt Extension&#8221;</a>.
         </p></div></div><div class="sect1"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="cedar-basic-datarecovery"></a>Data Recovery</h2></div></div></div><p>
         Cedar Backup does not include any facility to restore backups.
         Instead, it assumes that the administrator (using the procedures
         and references in <a class="xref" href="#cedar-recovering" title="Appendix C. Data Recovery">Appendix C, <i>Data Recovery</i></a>) can handle the task of
         restoring their own system, using the standard system tools at hand.
      </p><p>
         If I were to maintain recovery code in Cedar Backup, I would almost
         certainly end up in one of two situations.  Either Cedar Backup would
         only support simple recovery tasks, and those via an interface a lot
         like that of the underlying system tools; or Cedar Backup would have
         to include a hugely complicated interface to support more specialized
         (and hence useful) recovery tasks like restoring individual files as
         of a certain point in time.  In either case, I would end up trying to
         maintain critical functionality that would be rarely used, and hence
         would also be rarely tested by end-users.  I am uncomfortable asking
         anyone to rely on functionality that falls into this category.
      </p><p>
         My primary goal is to keep the Cedar Backup codebase as simple and
         focused as possible.  I hope you can understand how the choice of
         providing documentation, but not code, seems to strike the best
         balance between managing code complexity and providing the
         functionality that end-users need.  
      </p></div><div class="sect1"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="cedar-basic-pools"></a>Cedar Backup Pools</h2></div></div></div><p>
         There are two kinds of machines in a Cedar Backup pool.  One machine
         (the <em class="firstterm">master</em>) has a CD or DVD writer on it
         and writes the backup to disc.  The others
         (<em class="firstterm">clients</em>) collect data to be written to disc by
         the master.  Collectively, the master and client machines in a pool
         are called <em class="firstterm">peer machines</em>. 
      </p><p>
         Cedar Backup has been designed primarily for situations where there is
         a single master and a set of other clients that the master interacts
         with.  However, it will just as easily work for a single machine (a
         backup pool of one) and in fact more users seem to use it like this
         than any other way.
      </p></div><div class="sect1"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="cedar-basic-process"></a>The Backup Process</h2></div></div></div><p>
         The Cedar Backup backup process is structured in terms of a set of
         decoupled actions which execute independently (based on a schedule in
         <span class="command"><strong>cron</strong></span>) rather than through some highly coordinated
         flow of control.  
      </p><p>
         This design decision has both positive and negative consequences.  On
         the one hand, the code is much simpler and can choose to simply abort
         or log an error if its expectations are not met.  On the other hand,
         the administrator must coordinate the various actions during initial
         set-up.  See <a class="xref" href="#cedar-basic-coordinate" title="Coordination between Master and Clients">the section called &#8220;Coordination between Master and Clients&#8221;</a> (later in this
         chapter) for more information on this subject.
      </p><p>
         A standard backup run consists of four steps (actions), some of which
         execute on the master machine, and some of which execute on one or
         more client machines.  These actions are:
         <em class="firstterm">collect</em>, <em class="firstterm">stage</em>,
         <em class="firstterm">store</em> and <em class="firstterm">purge</em>.
      </p><p>
         In general, more than one action may be specified on the command-line.
         If more than one action is specified, then actions will be taken in a
         sensible order (generally collect, stage, store, purge).   A special
         <em class="firstterm">all</em> action is also allowed, which implies all
         of the standard actions in the same sensible order.
      </p><p>
         The <span class="command"><strong>cback</strong></span> command also supports several actions
         that are not part of the standard backup run and cannot be executed
         along with any other actions.  These actions are
         <em class="firstterm">validate</em>, <em class="firstterm">initialize</em> and
         <em class="firstterm">rebuild</em>.  All of the various actions are
         discussed further below.
      </p><p>
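         As a rough illustration of the ordering and exclusivity rules
         described above, consider the following sketch.  It is not taken
         from the Cedar Backup source: the function name
         <code class="function">plan_actions</code> and its error handling are
         hypothetical, and extended actions are ignored entirely.
      </p>

```python
# Hypothetical sketch of the action rules; not Cedar Backup's own code.
STANDARD_ORDER = ["collect", "stage", "store", "purge"]
EXCLUSIVE = {"all", "validate", "initialize", "rebuild"}


def plan_actions(requested):
    """Return the requested actions in the order they would be executed."""
    requested = list(requested)
    if not requested:
        raise ValueError("at least one action is required")
    unknown = set(requested) - set(STANDARD_ORDER) - EXCLUSIVE
    if unknown:
        raise ValueError("unknown action(s): %s" % ", ".join(sorted(unknown)))
    exclusive = EXCLUSIVE.intersection(requested)
    if exclusive:
        # These actions cannot be combined with any other action.
        if len(set(requested)) != 1:
            raise ValueError("%s cannot be combined with other actions"
                             % ", ".join(sorted(exclusive)))
        if "all" in exclusive:
            return list(STANDARD_ORDER)
        return [requested[0]]
    # Standard actions always run in the same sensible order,
    # regardless of the order given on the command line.
    return [action for action in STANDARD_ORDER if action in requested]
```

      <p>
         With this sketch, <code class="literal">plan_actions(["purge", "collect"])</code>
         yields the collect action followed by the purge action, while
         combining <code class="literal">rebuild</code> with
         <code class="literal">store</code> raises an error.
      </p><p>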
         See <a class="xref" href="#cedar-config" title="Chapter 5. Configuration">Chapter 5, <i>Configuration</i></a> for more information on how a
         backup run is configured.
      </p><div class="sidebar"><div class="titlepage"><div><div><p class="title"><b>Flexibility</b></p></div></div></div><p>
            Cedar Backup was designed to be flexible.  It allows you to decide
            for yourself which backup steps you care about executing (and when
            you execute them), based on your own situation and your own
            priorities.
         </p><p>
            As an example, I always back up every machine I own.  I typically
            keep 7-10 days of staging directories around, but switch CD/DVD media
            roughly once a week.  That way, I can periodically take a disc
            off-site in case the machine gets stolen or damaged.  
         </p><p>
            If you're not worried about these risks, then there's no need to
            write to disc.  In fact, some users prefer to use their master
            machine as a simple <span class="quote">&#8220;<span class="quote">consolidation point</span>&#8221;</span>.  They don't
            back up any data on the master, and don't write to disc at all.
            They just use Cedar Backup to handle the mechanics of moving
            backed-up data to a central location.  This isn't quite what Cedar
            Backup was written to do, but it is flexible enough to meet their
            needs.
         </p></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="cedar-basic-process-collect"></a>The Collect Action</h3></div></div></div><p>
            The collect action is the first action in a standard backup run.
            It executes on both master and client nodes.  Based on configuration,
            this action traverses the peer's filesystem and gathers files to be
            backed up.  Each configured high-level directory is collected up
            into its own <span class="command"><strong>tar</strong></span> file in the <em class="firstterm">collect
            directory</em>.   The tarfiles can either be uncompressed
            (<code class="filename">.tar</code>) or compressed with either
            <span class="command"><strong>gzip</strong></span> (<code class="filename">.tar.gz</code>) or
            <span class="command"><strong>bzip2</strong></span> (<code class="filename">.tar.bz2</code>).
         </p><p>
            There are three supported collect modes:
            <em class="firstterm">daily</em>, <em class="firstterm">weekly</em> and
            <em class="firstterm">incremental</em>.  Directories configured for
            daily backups are backed up every day.  Directories configured for
            weekly backups are backed up on the first day of the week.
            Directories configured for incremental backups are traversed every
            day, but only the files which have changed (based on a saved-off
            <em class="firstterm">SHA hash</em>) are actually backed up.
         </p><p>
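            To make the incremental mode more concrete, the sketch below
            compares each file's current digest against a saved digest and
            reports only the files that changed.  It is an illustration
            only: the function names are hypothetical, SHA-1 is assumed as
            the particular SHA variant, and an in-memory dictionary stands
            in for the saved-off hashes.
         </p>

```python
import hashlib
import os


def file_digest(path):
    """Return the hex SHA-1 digest of a file, read in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def changed_files(root, saved_digests):
    """Return (changed paths, new digest map) for the files under root.

    saved_digests maps a path to the digest recorded by the previous
    run (a hypothetical stand-in for the saved-off hashes).
    """
    changed = []
    current = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            current[path] = file_digest(path)
            # New files have no saved digest, so they count as changed.
            if saved_digests.get(path) != current[path]:
                changed.append(path)
    return changed, current
```

         <p>
            On the first run every file is reported as changed, since no
            digests have been saved yet.  On later runs, only new or
            modified files are returned.
         </p><p>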
            Collect configuration also allows for a variety of ways to filter
            files and directories out of the backup.  For instance,
            administrators can configure an <em class="firstterm">ignore indicator
            file</em> 
            <a href="#ftn.idp58856304" class="footnote" name="idp58856304"><sup class="footnote">[9]</sup></a> 
            or specify absolute paths or filename patterns 
            <a href="#ftn.idp58857536" class="footnote" name="idp58857536"><sup class="footnote">[10]</sup></a>
            to be excluded.  You can even configure a backup <span class="quote">&#8220;<span class="quote">link
            farm</span>&#8221;</span> rather than explicitly listing files and directories
            in configuration.
         </p><p>
            This action is optional on the master.  You only need to configure
            and execute the collect action on the master if you have data to
            back up on that machine.  If you plan to use the master only as a
            <span class="quote">&#8220;<span class="quote">consolidation point</span>&#8221;</span> to collect data from other
            machines, then there is no need to execute the collect action
            there.  If you run the collect action on the master, it behaves the
            same there as anywhere else, and you have to stage the master's
            collected data just like any other client (typically by configuring
            a local peer in the stage action).
         </p></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="cedar-basic-process-stage"></a>The Stage Action</h3></div></div></div><p>
            The stage action is the second action in a standard backup run.  It
            executes on the master peer node.  The master works down the list of
            peers in its backup pool and stages (copies) the collected backup
            files from each of them into a daily staging directory by peer
            name.
         </p><p>
            For the purposes of this action, the master node can be configured
            to treat itself as a client node.  If you intend to back up data on
            the master, configure the master as a local peer.  Otherwise, just
            configure each of the clients as a remote peer.
         </p><p> 
            Local and remote client peers are treated differently.  Local peer
            collect directories are assumed to be accessible via normal copy
            commands (i.e. on a mounted filesystem) while remote peer collect
            directories are accessed via an <em class="firstterm">RSH-compatible</em>
            command such as <span class="command"><strong>ssh</strong></span>.
         </p><p>
            If a given peer is not ready to be staged, the stage process will
            log an error, abort the backup for that peer, and then move on to
            its other peers.  This way, one broken peer cannot break a backup
            for other peers which are up and running.
         </p><p>
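            The per-peer error handling can be pictured as a loop that logs
            a failure and then carries on with the next peer.  The sketch
            below covers local peers only and rests on several assumptions:
            the function name <code class="function">stage_local_peers</code>
            and the directory layout are hypothetical, and a peer is judged
            ready solely by the presence of the
            <code class="filename">cback.collect</code> file in its collect
            directory.
         </p>

```python
import logging
import os
import shutil

logger = logging.getLogger("stage-sketch")


def stage_local_peers(peers, staging_dir):
    """Copy each ready peer's collect directory into staging_dir/name.

    peers maps a peer name to its collect directory.  Returns the names
    of the peers that were staged.  Peers that are not ready are logged
    and skipped, so they cannot break the backup for the other peers.
    """
    staged = []
    for name, collect_dir in sorted(peers.items()):
        indicator = os.path.join(collect_dir, "cback.collect")
        if not os.path.isfile(indicator):
            logger.error("Peer %s is not ready to be staged; skipping", name)
            continue
        target = os.path.join(staging_dir, name)
        try:
            shutil.copytree(collect_dir, target)
        except (OSError, shutil.Error) as error:
            logger.error("Failed to stage peer %s: %s", name, error)
            continue
        staged.append(name)
    return staged
```

         <p>
            A remote peer would follow the same pattern, with the copy
            replaced by an RSH-compatible transfer.
         </p><p>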
            Keep in mind that Cedar Backup is flexible about what actions must
            be executed as part of a backup.  If you would prefer, you can stop
            the backup process at this step, and skip the store step.  In this
            case, the staged directories will represent your backup rather than
            a disc.
         </p><div class="note" style="margin-left: 0.5in; margin-right: 0.5in;"><h3 class="title">Note</h3><p>
               Directories <span class="quote">&#8220;<span class="quote">collected</span>&#8221;</span> by another process can be
               staged by Cedar Backup.  If the file
               <code class="filename">cback.collect</code> exists in a collect directory
               when the stage action is taken, then that directory will be
               staged.
            </p></div></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="cedar-basic-process-store"></a>The Store Action</h3></div></div></div><p>
            The store action is the third action in a standard backup run.  It
            executes on the master peer node.  The master machine determines the
            location of the current staging directory, and then writes the
            contents of that staging directory to disc.  After the contents of
            the directory have been written to disc, an optional validation
            step ensures that the write was successful.
         </p><p>
            If the backup is running on the first day of the week, if the drive
            does not support multisession discs, or if the
            <code class="option">--full</code> option is passed to the
            <span class="command"><strong>cback</strong></span> command, the disc will be rebuilt from
            scratch.   Otherwise, a new ISO session will be added to the disc
            each day the backup runs.  
         </p><p>
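            The choice between rebuilding the disc and appending a new
            session reduces to a small predicate.  In the sketch below, the
            function and parameter names are hypothetical, and the first
            day of the week is assumed to default to Monday.
         </p>

```python
import datetime


def should_rebuild_disc(today, full_option, multisession_supported,
                        first_weekday=0):
    """Return True if the disc should be rebuilt from scratch.

    first_weekday uses datetime.date.weekday() numbering (Monday is 0).
    When this returns False, a new session would be appended instead.
    """
    return bool(today.weekday() == first_weekday
                or not multisession_supported
                or full_option)
```

         <p>
            Any one of the three conditions is enough to trigger a rebuild.
            A new session is appended only when none of them holds.
         </p><p>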
            This action is entirely optional.  If you would prefer to just
            stage backup data from a set of peers to a master machine, and have
            the staged directories represent your backup rather than a disc,
            this is fine. 
         </p><div class="warning" style="margin-left: 0.5in; margin-right: 0.5in;"><h3 class="title">Warning</h3><p>
               The store action is not supported on the Mac OS X (darwin)
               platform.  On that platform, the <span class="quote">&#8220;<span class="quote">automount</span>&#8221;</span>
               function of the Finder interferes significantly with Cedar
               Backup's ability to mount and unmount media and write to the CD
               or DVD hardware.  The Cedar Backup writer and image
               functionality works on this platform, but the effort required to
               fight the operating system about who owns the media and the
               device makes it nearly impossible to execute the store action
               successfully.
            </p></div><div class="sidebar"><div class="titlepage"><div><div><p class="title"><b>Current Staging Directory</b></p></div></div></div><p>
               The store action tries to be smart about finding the current
               staging directory.  It first checks the current day's staging
               directory.  If that directory exists, and it has not yet been
               written to disc (i.e. there is no store indicator), then it will
               be used.  Otherwise, the store action will look for an unused
               staging directory for either the previous day or the next day,
               in that order.  A warning will be written to the log under 
               these circumstances (controlled by the &lt;warn_midnite&gt;
               configuration value).
            </p><p>
               This behavior varies slightly when the <code class="option">--full</code>
               option is in effect.  Under these circumstances, any existing
               store indicator will be ignored.  Also, the store action will
               always attempt to use the current day's staging directory,
               ignoring any staging directories for the previous day or the
               next day.  This way, running a full store action more than once
               in succession will always produce the same results.  (You might
               imagine a use case where a person wants to make several copies
               of the same full backup.)
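            </p><p>
               The search order can be sketched as follows.  This is an
               illustration under stated assumptions rather than the actual
               implementation: the function name is hypothetical, staging
               directories are assumed to be laid out as year/month/day
               subdirectories, the store indicator is assumed to be a file
               named <code class="filename">cback.store</code>, and the
               logged warning is omitted.
            </p>

```python
import datetime
import os


def find_staging_dir(staging_root, today, full=False, indicator="cback.store"):
    """Return the staging directory the store action would use, or None."""

    def candidate(day):
        # Assumed layout: staging_root/YYYY/MM/DD
        return os.path.join(staging_root, day.strftime("%Y"),
                            day.strftime("%m"), day.strftime("%d"))

    def unused(path):
        # A directory is unused if it exists and has no store indicator.
        return (os.path.isdir(path)
                and not os.path.exists(os.path.join(path, indicator)))

    current = candidate(today)
    if full:
        # Full mode: ignore any store indicator and never look at other days.
        return current if os.path.isdir(current) else None
    one_day = datetime.timedelta(days=1)
    # Current day first, then the previous day, then the next day.
    for day in (today, today - one_day, today + one_day):
        path = candidate(day)
        if unused(path):
            return path
    return None
```

            <p>
               Because full mode ignores the indicator and the neighboring
               days, repeated calls with <code class="literal">full=True</code>
               return the same directory each time.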
            </p></div></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="cedar-basic-process-purge"></a>The Purge Action</h3></div></div></div><p>
            The purge action is the fourth and final action in a standard
            backup run.  It executes on both the master and client peer nodes.
            Configuration specifies how long to retain files in certain
            directories, and older files and empty directories are purged.
         </p><p>
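            A retention-based purge of this kind might look like the sketch
            below.  The function name and the use of file modification
            times to judge age are assumptions made for illustration; they
            are not taken from the Cedar Backup source.
         </p>

```python
import os
import time


def purge_directory(root, retain_days, now=None):
    """Remove files older than retain_days under root, then empty dirs.

    Returns the sorted list of removed file paths.  The root directory
    itself is always kept.
    """
    now = time.time() if now is None else now
    cutoff = now - retain_days * 86400
    removed = []
    # Walk bottom-up so directories emptied by the purge can be removed.
    for dirpath, _dirnames, filenames in os.walk(root, topdown=False):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if cutoff > os.path.getmtime(path):
                os.remove(path)
                removed.append(path)
        if dirpath != root and not os.listdir(dirpath):
            os.rmdir(dirpath)
    return sorted(removed)
```

         <p>
            For example, <code class="literal">purge_directory(path, 7)</code>
            would remove files last modified more than seven days ago, along
            with any directories left empty as a result.
         </p><p>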
            Typically, collect directories are purged daily, and stage
            directories are purged weekly or slightly less often (if a disc
            gets corrupted, older backups may still be available on the
            master).  Some users also choose to purge the configured working
            directory (which is used for temporary files) to eliminate any
            leftover files which might have resulted from changes to
            configuration.
         </p></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="cedar-basic-process-all"></a>The All Action</h3></div></div></div><p>
            The all action is a pseudo-action which causes all of the actions
            in a standard backup run to be executed together in order.  It
            cannot be combined with any other actions on the command line.
         </p><p>
            Extensions <span class="emphasis"><em>cannot</em></span> be executed as part of the
            all action.  If you need to execute an extended action, you must
            specify the other actions you want to run individually on the
            command line.  <a href="#ftn.idp58884208" class="footnote" name="idp58884208"><sup class="footnote">[11]</sup></a>
         </p><p>
            The all action does not have its own configuration.  Instead, it
            relies on the individual configuration sections for all of the
            other actions.  
         </p></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="cedar-basic-process-validate"></a>The Validate Action</h3></div></div></div><p>
            The validate action is used to validate configuration
            on a particular peer node, either master or client.   It cannot be
            combined with any other actions on the command line.
         </p><p>
            The validate action checks that the configuration file can be
            found, that the configuration file is valid, and that certain
            portions of the configuration file make sense (for instance, making
            sure that specified users exist, directories are readable and
            writable as necessary, etc.).
         </p></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="cedar-basic-process-initialize"></a>The Initialize Action</h3></div></div></div><p>
            The initialize action is used to initialize media for use with
            Cedar Backup.  This is an optional step.  By default, Cedar Backup
            does not need to use initialized media and will write to whatever
            media exists in the writer device.  
         </p><p>
            However, if the <span class="quote">&#8220;<span class="quote">check media</span>&#8221;</span> store configuration
            option is set to true, Cedar Backup will check the media before
            writing to it and will error out if the media has not been
            initialized. 
         </p><p>
            Initializing the media consists of writing a mostly-empty image
            using a known media label (the media label will begin with
            <span class="quote">&#8220;<span class="quote">CEDAR BACKUP</span>&#8221;</span>).
         </p><p>
            Note that only rewritable media (CD-RW, DVD+RW) can be initialized.
            It doesn't make any sense to initialize media that cannot be
            rewritten (CD-R, DVD+R), since Cedar Backup would then not be able
            to use that media for a backup.  You can still configure Cedar
            Backup to check non-rewritable media; in this case, the check will
            also pass if the media is apparently unused (i.e. has no media
            label).
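         </p><p>
            As a rough illustration of the check described above, the
            following Python sketch decides whether a disc would be accepted.
            It is not Cedar Backup's actual implementation: the function name
            <code class="literal">media_check_passes</code> and its arguments
            are hypothetical, and only the
            <span class="quote">&#8220;<span class="quote">CEDAR BACKUP</span>&#8221;</span> label prefix and the
            exception for unused non-rewritable media come from the
            description in this section.
         </p><pre class="programlisting">
```python
LABEL_PREFIX = "CEDAR BACKUP"  # initialized media labels begin with this text


def media_check_passes(media_label, rewritable):
    """Return True if the "check media" option would accept this disc.

    media_label is the label read from the disc, or None if it has none.
    rewritable is True for CD-RW/DVD+RW and False for CD-R/DVD+R.
    Both parameters are hypothetical names used only for this sketch.
    """
    if media_label is not None and media_label.startswith(LABEL_PREFIX):
        return True  # the media was initialized for Cedar Backup
    if not rewritable and media_label is None:
        return True  # apparently unused write-once media is also accepted
    return False
```
         </pre><p>
            In this sketch, a blank CD-RW fails the check until it has been
            initialized, while a blank CD-R passes because it is apparently
            unused.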
         </p></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="cedar-basic-process-rebuild"></a>The Rebuild Action</h3></div></div></div><p>
            The rebuild action is an exception-handling action that is executed
            independent of a standard backup run.  It cannot be combined with
            any other actions on the command line.
         </p><p>
            The rebuild action attempts to rebuild <span class="quote">&#8220;<span class="quote">this week's</span>&#8221;</span>
            disc from any remaining unpurged staging directories.  Typically,
            it is used to make a copy of a backup, to replace lost or damaged
            media, or to switch to new media mid-week for some other reason.
         </p><p>
            To decide what data to write to disc again, the rebuild action
            looks back and finds the first day of the current week.  Then, it finds
            any remaining staging directories between that date and the current
            date.  If any staging directories are found, they are all written
            to disc in one big ISO session.
         </p><p>
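            The lookup described above can be sketched in Python as follows.
            This is only an illustration under stated assumptions, not the
            code Cedar Backup actually runs: the function name
            <code class="literal">staging_dirs_this_week</code> is
            hypothetical, the starting day of the week is passed in as a
            parameter, and staging directories are assumed to be laid out by
            date as <code class="filename">YYYY/MM/DD</code> beneath a staging
            root.
         </p><pre class="programlisting">
```python
import datetime
import os

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]


def staging_dirs_this_week(staging_root, starting_day, today):
    """List existing staging directories from the start of the week to today.

    Assumes (for illustration) a YYYY/MM/DD layout under staging_root.
    """
    # Days elapsed since the most recent occurrence of the starting day.
    offset = (today.weekday() - WEEKDAYS.index(starting_day)) % 7
    first_day = today - datetime.timedelta(days=offset)
    found = []
    for i in range(offset + 1):
        day = first_day + datetime.timedelta(days=i)
        path = os.path.join(staging_root, day.strftime("%Y/%m/%d"))
        if os.path.isdir(path):  # days that were already purged are skipped
            found.append(path)
    return found
```
         </pre><p>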
            The rebuild action does not have its own configuration.  It relies
            on configuration for other actions, especially the store
            action.
         </p></div></div><div class="sect1"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="cedar-basic-coordinate"></a>Coordination between Master and Clients</h2></div></div></div><p>
         Unless you are using Cedar Backup to manage a <span class="quote">&#8220;<span class="quote">pool of
         one</span>&#8221;</span>, you will need to set up some coordination between your
         clients and master to make everything work properly.  This
         coordination isn't difficult &#8212; it mostly consists of making sure
         that operations happen in the right order &#8212; but some users are
         surprised that it is required and want to know why Cedar Backup can't
         just <span class="quote">&#8220;<span class="quote">take care of it for me</span>&#8221;</span>.
      </p><p>
         Essentially, each client must finish collecting all of its data before
         the master begins staging it, and the master must finish staging data
         from a client before that client purges its collected data.
         Administrators may need to experiment with the time between the
         collect and purge entries so that the master has enough time to stage
         data before it is purged.
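      </p><p>
         The ordering rule above can be written down as a small check.  The
         following Python sketch is only an illustration: the function name
         <code class="literal">schedule_is_safe</code>, its parameters and the
         sample times are all hypothetical, and the durations are estimates
         you would have to measure on your own systems.
      </p><pre class="programlisting">
```python
from datetime import datetime, timedelta


def schedule_is_safe(collect_start, collect_length, stage_start,
                     stage_length, purge_start):
    """Check one client's schedule against the master's stage run.

    Start values are datetime objects; lengths are timedelta estimates
    of how long the collect and stage actions take.
    """
    collect_end = collect_start + collect_length
    stage_end = stage_start + stage_length
    # Staging must not begin until collection has finished, and the
    # client must not purge until staging has finished.
    return stage_start >= collect_end and purge_start >= stage_end


# Hypothetical nightly schedule for one client and its master.
night = datetime(2005, 2, 10)
print(schedule_is_safe(
    collect_start=night.replace(hour=0, minute=1),
    collect_length=timedelta(minutes=90),
    stage_start=night.replace(hour=2, minute=1),
    stage_length=timedelta(minutes=60),
    purge_start=night.replace(hour=4, minute=1),
))
```
      </pre><p>
         If a check like this fails for your estimated durations, widen the
         gap between the collect, stage and purge entries.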
      </p></div><div class="sect1"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="cedar-basic-managedbackups"></a>Managed Backups</h2></div></div></div><p>
         Cedar Backup also supports an optional feature called the
         <span class="quote">&#8220;<span class="quote">managed backup</span>&#8221;</span>.  This feature is intended for use with
         remote clients where cron is not available.
      </p><p>
         When managed backups are enabled, managed clients must still be
         configured as usual.  However, rather than using a cron job on the
         client to execute the collect and purge actions, the master executes
         these actions on the client via a remote shell.  
      </p><p>
         To make this happen, first set up one or more managed clients in Cedar
         Backup configuration.  Then, invoke Cedar Backup with the
         <span class="command"><strong>--managed</strong></span> command-line option.  Whenever Cedar
         Backup invokes an action locally, it will invoke the same action on
         each of the managed clients.
      </p><p>
         Technically, this feature works for any client, not just clients that
         don't have cron available.  Used this way, it can simplify the setup
         process, because cron only has to be configured on the master.  For
         some users, that may be motivation enough to use this feature all of
         the time.
      </p><p>
         However, please keep in mind that this feature depends on a stable
         network.  If your network connection drops, your backup will be
         interrupted and will not be complete.  It is even possible that some
         of the Cedar Backup metadata (like incremental backup state) will be
         corrupted.  The risk is not high, but it is something you need to be
         aware of if you choose to use this optional feature.
      </p></div><div class="sect1"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="cedar-basic-mediadevice"></a>Media and Device Types</h2></div></div></div><p>
         Cedar Backup focuses on writing backups to CD or DVD media
         using a standard SCSI or IDE writer.  In Cedar Backup terms, the
         disc itself is referred to as the <em class="firstterm">media</em>, and
         the CD/DVD drive is referred to as the <em class="firstterm">device</em> 
         or sometimes the <em class="firstterm">backup device</em>.
         <a href="#ftn.idp58778480" class="footnote" name="idp58778480"><sup class="footnote">[12]</sup></a>
      </p><p>
         When using a new enough backup device, a new
         <span class="quote">&#8220;<span class="quote">multisession</span>&#8221;</span> ISO image <a href="#ftn.idp58779936" class="footnote" name="idp58779936"><sup class="footnote">[13]</sup></a> 
         is written to the media on the first day of the week, and then
         additional multisession images are added to the media each day that
         Cedar Backup runs.  This way, the media is complete and usable at the
         end of every backup run, but a single disc can be used all week long.
         If your backup device does not support multisession images &#8212; which is
         really unusual today &#8212; then a new ISO image will be written to the
         media each time Cedar Backup runs (and you should probably confine
         yourself to the <span class="quote">&#8220;<span class="quote">daily</span>&#8221;</span> backup mode to avoid losing
         data).
      </p><p>
         Cedar Backup currently supports four different kinds of CD media:
      </p><div class="variablelist"><dl class="variablelist"><dt><span class="term">cdr-74</span></dt><dd><p>74-minute non-rewritable CD media</p></dd><dt><span class="term">cdrw-74</span></dt><dd><p>74-minute rewritable CD media</p></dd><dt><span class="term">cdr-80</span></dt><dd><p>80-minute non-rewritable CD media</p></dd><dt><span class="term">cdrw-80</span></dt><dd><p>80-minute rewritable CD media</p></dd></dl></div><p>
         I have chosen to support just these four types of CD media because
         they seem to be the most <span class="quote">&#8220;<span class="quote">standard</span>&#8221;</span> of the various types
         commonly sold in the U.S. as of this writing (early 2005).  If you
         regularly use an unsupported media type and would like Cedar Backup to
         support it, send me information about the capacity of the media in
         megabytes (MB) and whether it is rewritable.  
      </p><p>
         Cedar Backup also supports two kinds of DVD media:
      </p><div class="variablelist"><dl class="variablelist"><dt><span class="term">dvd+r</span></dt><dd><p>Single-layer non-rewritable DVD+R media</p></dd><dt><span class="term">dvd+rw</span></dt><dd><p>Single-layer rewritable DVD+RW media</p></dd></dl></div><p>
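         The six supported media types can be summarized as a small lookup
         table.  The Python sketch below is only an illustration and is not
         taken from the Cedar Backup source: the names
         <code class="literal">MEDIA_TYPES</code>,
         <code class="literal">can_initialize</code> and
         <code class="literal">writer_utility</code> are hypothetical.  The
         media type names, the rule that only rewritable media can be
         initialized, and the writer utility for each kind of media come from
         this manual.
      </p><pre class="programlisting">
```python
# Media type name: (family, rewritable)
MEDIA_TYPES = {
    "cdr-74": ("cd", False),
    "cdrw-74": ("cd", True),
    "cdr-80": ("cd", False),
    "cdrw-80": ("cd", True),
    "dvd+r": ("dvd", False),
    "dvd+rw": ("dvd", True),
}


def lookup(media_type):
    """Return (family, rewritable), rejecting unsupported media types."""
    if media_type not in MEDIA_TYPES:
        raise ValueError("unsupported media type: %s" % media_type)
    return MEDIA_TYPES[media_type]


def can_initialize(media_type):
    """Only rewritable media can be initialized."""
    family, rewritable = lookup(media_type)
    return rewritable


def writer_utility(media_type):
    """CD media is written with cdrecord, DVD media with growisofs."""
    family, rewritable = lookup(media_type)
    return "cdrecord" if family == "cd" else "growisofs"
```
      </pre><p>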
         The underlying <span class="command"><strong>growisofs</strong></span> utility does support other
         kinds of media (including DVD-R, DVD-RW and Blu-ray) which work
         somewhat differently than standard DVD+R and DVD+RW media.  I don't
         support these other kinds of media because I haven't had any
         opportunity to work with them.  The same goes for dual-layer media of
         any type.
      </p></div><div class="sect1"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="cedar-basic-incremental"></a>Incremental Backups</h2></div></div></div><p>
         Cedar Backup supports three different kinds of backups for individual
         collect directories.  These are <em class="firstterm">daily</em>,
         <em class="firstterm">weekly</em> and <em class="firstterm">incremental</em>
         backups.  Directories using the daily mode are backed up every day.
         Directories using the weekly mode are only backed up on the first day
         of the week, or when the <code class="option">--full</code> option is used.
         Directories using the incremental mode are always backed up on the
         first day of the week (like a weekly backup), but after that only the
         files which have changed are actually backed up on a daily basis.
      </p><p>
         In Cedar Backup, incremental backups are not based on date, but are
         instead based on saved checksums, one for each backed-up file.
         When a full backup is run, Cedar Backup gathers a checksum value
         <a href="#ftn.idp58942368" class="footnote" name="idp58942368"><sup class="footnote">[14]</sup></a> 
         for each backed-up file.  The next time an incremental backup is run,
         Cedar Backup checks its list of file/checksum pairs for each file that
         might be backed up.  If the file's checksum value does not
         match the saved value, or if the file does not appear in the list
         of file/checksum pairs, then it will be backed up and a new checksum
         value will be placed into the list.  Otherwise, the file will be
         ignored and the checksum value will be left unchanged.
      </p><p>
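         The comparison described above can be sketched in Python as follows.
         This is a minimal illustration, not Cedar Backup's own code: the
         function names <code class="literal">file_digest</code> and
         <code class="literal">select_changed</code> are hypothetical, the
         saved list is modeled as an in-memory dictionary, and SHA-1 is
         assumed as the hash.
      </p><pre class="programlisting">
```python
import hashlib


def file_digest(path):
    """Return the SHA-1 hex digest of a file's contents."""
    sha = hashlib.sha1()
    with open(path, "rb") as handle:
        # Read in chunks so that large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(65536), b""):
            sha.update(chunk)
    return sha.hexdigest()


def select_changed(paths, saved):
    """Return the files that need backing up and update the saved digests.

    saved maps each file path to the digest recorded at its last backup.
    """
    to_backup = []
    for path in paths:
        digest = file_digest(path)
        if saved.get(path) != digest:  # new file, or contents have changed
            to_backup.append(path)
            saved[path] = digest
        # Otherwise the file is ignored and its saved digest is left alone.
    return to_backup
```
      </pre><p>
         In this sketch, starting from an empty dictionary behaves like a
         full backup, because every file is treated as new.
      </p><p>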
         Cedar Backup stores the file/checksum pairs in
         <code class="filename">.sha</code> files in its working directory, one file per
         configured collect directory.  The mappings in these files are reset
         at the start of the week or when the <code class="option">--full</code> option is
         used.  Because these files are used for an entire week, you should
         never purge the working directory more frequently than once per week.
      </p></div><div class="sect1"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="cedar-basic-extensions"></a>Extensions</h2></div></div></div><p>
         Imagine that there is a third party developer who understands how to
         back up a certain kind of database repository.  This third party
         might want to integrate his or her specialized backup into the Cedar
         Backup process, perhaps thinking of the database backup as a sort of
         <span class="quote">&#8220;<span class="quote">collect</span>&#8221;</span> step.
      </p><p>
         Prior to Cedar Backup version 2, any such integration would have been
         completely independent of Cedar Backup itself.  The
         <span class="quote">&#8220;<span class="quote">external</span>&#8221;</span> backup functionality would have had to
         maintain its own configuration and would not have had access to any
         Cedar Backup configuration.
      </p><p>
         Starting with version 2, Cedar Backup allows
         <em class="firstterm">extensions</em> to the backup process.   An
         extension is an action that isn't part of the standard backup process
         (i.e. not collect, stage, store or purge), but can be executed by Cedar
         Backup when properly configured.
      </p><p>
         Extension authors implement an <span class="quote">&#8220;<span class="quote">action process</span>&#8221;</span> function
         with a certain interface, and are allowed to add their own sections to
         the Cedar Backup configuration file, so that all backup configuration
         can be centralized.  Then, the action process function is associated
         with an action name which can be executed from the
         <span class="command"><strong>cback</strong></span> command line like any other action.
      </p><p>
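         To give a feel for what an extension looks like, here is a minimal
         Python sketch.  Treat all of it as an assumption rather than a
         reference: the three-argument signature follows my understanding of
         the interface in Appendix A, so check that appendix before relying
         on it, and the <code class="literal">ACTIONS</code> mapping and
         <code class="literal">run_action</code> helper are hypothetical
         stand-ins for the association that Cedar Backup makes through its
         configuration file.
      </p><pre class="programlisting">
```python
import logging

logger = logging.getLogger("example.extension")  # hypothetical logger name


def executeAction(configPath, options, config):
    """Hypothetical action process function for an extension.

    configPath: path to the Cedar Backup configuration file on disk
    options:    parsed command-line options
    config:     parsed Cedar Backup configuration
    """
    logger.info("Running example extension using %s", configPath)
    # A real extension would read its own configuration section here
    # and then perform its specialized backup.


# Illustration only: associate an action name with the function above.
ACTIONS = {"example": executeAction}


def run_action(name, configPath, options, config):
    """Look up an action by name and execute it."""
    return ACTIONS[name](configPath, options, config)
```
      </pre><p>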
         Hopefully, as the Cedar Backup user community grows, users will
         contribute their own extensions back to the community.  Well-written
         general-purpose extensions will be accepted into the official
         codebase.
      </p><div class="note" style="margin-left: 0.5in; margin-right: 0.5in;"><h3 class="title">Note</h3><p>
            Users should see <a class="xref" href="#cedar-config" title="Chapter 5. Configuration">Chapter 5, <i>Configuration</i></a> for more
            information on how extensions are configured, and <a class="xref" href="#cedar-extensions" title="Chapter 6. Official Extensions">Chapter 6, <i>Official Extensions</i></a> for details on all of the
            officially-supported extensions.  
         </p><p>
            Developers may be interested in <a class="xref" href="#cedar-extenspec" title="Appendix A. Extension Architecture Interface">Appendix A, <i>Extension Architecture Interface</i></a>.
         </p></div></div><div class="footnotes"><br><hr style="width:100; text-align:left;margin-left: 0"><div id="ftn.idp58818672" class="footnote"><p><a href="#idp58818672" class="para"><sup class="para">[8] </sup></a>See <a class="ulink" href="http://en.wikipedia.org/wiki/Setuid" target="_top">http://en.wikipedia.org/wiki/Setuid</a></p></div><div id="ftn.idp58856304" class="footnote"><p><a href="#idp58856304" class="para"><sup class="para">[9] </sup></a>Analogous to <code class="filename">.cvsignore</code> in CVS</p></div><div id="ftn.idp58857536" class="footnote"><p><a href="#idp58857536" class="para"><sup class="para">[10] </sup></a>In terms of Python regular expressions</p></div><div id="ftn.idp58884208" class="footnote"><p><a href="#idp58884208" class="para"><sup class="para">[11] </sup></a>Some users find this surprising,
            because extensions are configured with sequence numbers.  I did it
            this way because I felt that running extensions as part of the all
            action would sometimes result in surprising behavior.  I am not
            planning to change the way this works.</p></div><div id="ftn.idp58778480" class="footnote"><p><a href="#idp58778480" class="para"><sup class="para">[12] </sup></a>My original backup device was an old
         Sony CRX140E 4X CD-RW drive.  It has since died, and I currently
         develop using a Lite-On 1673S DVD±RW drive.</p></div><div id="ftn.idp58779936" class="footnote"><p><a href="#idp58779936" class="para"><sup class="para">[13] </sup></a>An
         <em class="firstterm">ISO image</em> is the standard way of creating a
         filesystem to be copied to a CD or DVD.  It is essentially a
         <span class="quote">&#8220;<span class="quote">filesystem-within-a-file</span>&#8221;</span> and many UNIX operating
         systems can actually mount ISO image files just like hard drives,
         floppy disks or actual CDs.  See Wikipedia for more information:
         <a class="ulink" href="http://en.wikipedia.org/wiki/ISO_image" target="_top">http://en.wikipedia.org/wiki/ISO_image</a>.</p></div><div id="ftn.idp58942368" class="footnote"><p><a href="#idp58942368" class="para"><sup class="para">[14] </sup></a>The checksum is actually an <em class="firstterm">SHA
         cryptographic hash</em>.  See Wikipedia for more information:
         <a class="ulink" href="http://en.wikipedia.org/wiki/SHA-1" target="_top">http://en.wikipedia.org/wiki/SHA-1</a>.</p></div></div></div><div class="chapter"><div class="titlepage"><div><div><h1 class="title"><a name="cedar-install"></a>Chapter 3. Installation</h1></div></div></div><div class="toc"><p><b>Table of Contents</b></p><dl class="toc"><dt><span class="sect1"><a href="#cedar-install-background">Background</a></span></dt><dt><span class="sect1"><a href="#cedar-install-debian">Installing on a Debian System</a></span></dt><dt><span class="sect1"><a href="#cedar-install-source">Installing from Source</a></span></dt><dd><dl><dt><span class="sect2"><a href="#cedar-install-source-deps">Installing Dependencies</a></span></dt><dt><span class="sect2"><a href="#cedar-install-source-package">Installing the Source Package</a></span></dt></dl></dd></dl></div><div class="sect1"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="cedar-install-background"></a>Background</h2></div></div></div><p>
         There are two different ways to install Cedar Backup.  The easiest way
         is to install the pre-built Debian packages.  This method is painless
         and ensures that all of the correct dependencies are available.
      </p><p>
         If you are running a Linux distribution other than Debian or you are
         running some other platform like FreeBSD or Mac OS X, then you must use the
         Python source distribution to install Cedar Backup.  When using this
         method, you need to manage all of the dependencies yourself.
      </p><div class="sidebar"><div class="titlepage"><div><div><p class="title"><b>Non-Linux Platforms</b></p></div></div></div><p>
            Cedar Backup has been developed on a Debian GNU/Linux system and is
            primarily supported on Debian and other Linux systems.  However,
            since it is written in portable Python 2, it should run without
            problems on just about any UNIX-like operating system.  In
            particular, full Cedar Backup functionality is known to work on
            Debian and SuSE Linux systems, and client functionality is also
            known to work on FreeBSD and Mac OS X systems.
         </p><p>
            To run a Cedar Backup client, you really just need a working Python 2
            installation.  To run a Cedar Backup master, you will also need a set
            of other executables, most of which are related to building and
            writing CD/DVD images.  A full list of dependencies is provided further
            on in this chapter.
         </p></div></div><div class="sect1"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="cedar-install-debian"></a>Installing on a Debian System</h2></div></div></div><p>
         The easiest way to install Cedar Backup onto a Debian system is by
         using a tool such as <span class="command"><strong>apt-get</strong></span> or
         <span class="command"><strong>aptitude</strong></span>.
      </p><p>
         If you are running a Debian release which contains Cedar Backup, you
         can use your normal Debian mirror as an APT data source. (The Debian
         <span class="quote">&#8220;<span class="quote">etch</span>&#8221;</span> release is the first release to contain Cedar
         Backup 2.) Otherwise, you need to install from the Cedar Solutions APT
         data source.  
         <a href="#ftn.cedar-install-foot-software" class="footnote" name="cedar-install-foot-software"><sup class="footnote">[15]</sup></a>
         To do this, add the Cedar Solutions APT data source to
         your <code class="filename">/etc/apt/sources.list</code> file.
      </p><p>
         After you have configured the proper APT data source, install Cedar
         Backup using this set of commands:
      </p><pre class="screen">
$ apt-get update
$ apt-get install cedar-backup2 cedar-backup2-doc
      </pre><p>
         Several of the Cedar Backup dependencies are listed as
         <span class="quote">&#8220;<span class="quote">recommended</span>&#8221;</span> rather than required.  If you are
         installing Cedar Backup on a master machine, you must install some or
         all of the recommended dependencies, depending on which actions you
         intend to execute.  The stage action normally requires ssh, and the
         store action requires eject and either cdrecord/mkisofs or
         dvd+rw-tools.  Clients must also install some sort of ssh
         server if a remote master will collect backups from them.
      </p><p>
         If you would prefer, you can also download the
         <code class="filename">.deb</code> files and install them by hand with a tool
         such as <span class="command"><strong>dpkg</strong></span>.  You can find
         these files in the Cedar Solutions APT source.
      </p><p>
         In either case, once the package has been installed, you can proceed
         to configuration as described in <a class="xref" href="#cedar-config" title="Chapter 5. Configuration">Chapter 5, <i>Configuration</i></a>.
      </p><div class="note" style="margin-left: 0.5in; margin-right: 0.5in;"><h3 class="title">Note</h3><p>
            The Debian package-management tools must generally be run as root.
            It is safe to install Cedar Backup to a non-standard location and
            run it as a non-root user.  However, to do this, you must install
            the source distribution instead of the Debian package.
         </p></div></div><div class="sect1"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="cedar-install-source"></a>Installing from Source</h2></div></div></div><p>
         On platforms other than Debian, Cedar Backup is installed from a
         Python source distribution. <a href="#ftn.idp59109168" class="footnote" name="idp59109168"><sup class="footnote">[16]</sup></a> You will have to manage dependencies on your own.
      </p><div class="tip" style="margin-left: 0.5in; margin-right: 0.5in;"><h3 class="title">Tip</h3><p>
            Many UNIX-like distributions provide an automatic or semi-automatic
            way to install packages like the ones Cedar Backup requires (think
            RPMs for Mandrake or RedHat, Gentoo's Portage system, the Fink
            project for Mac OS X, or the BSD ports system).  If you are not
            sure how to install these packages on your system, you might want
            to check out <a class="xref" href="#cedar-depends" title="Appendix B. Dependencies">Appendix B, <i>Dependencies</i></a>.  This appendix
            provides links to <span class="quote">&#8220;<span class="quote">upstream</span>&#8221;</span> source packages, plus as
            much information as I have been able to gather about packages for
            non-Debian platforms.
         </p></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="cedar-install-source-deps"></a>Installing Dependencies</h3></div></div></div><p>
            Cedar Backup requires a number of external packages in order to
            function properly.  Before installing Cedar Backup, you must make
            sure that these dependencies are met.  
         </p><p>
            Cedar Backup is written in Python 2 and requires version 2.7 or
            greater of the language.  Python 2.7 was originally released on
            3 Jul 2010, and is the last supported release of Python 2. As
            of this writing, all current Linux and BSD distributions
            include it.  You must install Python 2 on every peer node in a
            pool (master or client). 
         </p><p>
            Additionally, remote client peer nodes must be running an
            <em class="firstterm">RSH-compatible</em> server, such as the
            <span class="command"><strong>ssh</strong></span> server, and master nodes must have an
            RSH-compatible client installed if they need to connect to remote
            peer machines.
         </p><p>
            Master machines also require several other system utilities, most
            having to do with writing and validating CD/DVD media.  On master
            machines, you must make sure that these utilities are available if
            you want to run the store action:
         </p><div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"><p><span class="command"><strong>mkisofs</strong></span></p></li><li class="listitem"><p><span class="command"><strong>eject</strong></span></p></li><li class="listitem"><p><span class="command"><strong>mount</strong></span></p></li><li class="listitem"><p><span class="command"><strong>umount</strong></span></p></li><li class="listitem"><p><span class="command"><strong>volname</strong></span></p></li></ul></div><p>
            Then, you need this utility if you are writing CD media:
         </p><div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"><p><span class="command"><strong>cdrecord</strong></span></p></li></ul></div><p>
            <span class="emphasis"><em>or</em></span> this utility if you are writing DVD
            media:
         </p><div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"><p><span class="command"><strong>growisofs</strong></span></p></li></ul></div><p>
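            Before running the store action for the first time, you may want
            to confirm that the utilities listed above are on your
            <code class="literal">PATH</code>.  The following Python sketch is
            one way to do that.  It is not part of Cedar Backup: the function
            names <code class="literal">find_on_path</code> and
            <code class="literal">missing_utilities</code> are hypothetical,
            and the sketch assumes the standard UNIX
            <span class="command"><strong>umount</strong></span> command for unmounting media.
         </p><pre class="programlisting">
```python
import os


def find_on_path(command):
    """Return the full path of command if it is on PATH, else None."""
    for directory in os.environ.get("PATH", "").split(os.pathsep):
        candidate = os.path.join(directory, command)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


def missing_utilities(media):
    """List the store-action utilities that cannot be found on PATH.

    media is "cd" or "dvd" (a hypothetical parameter for this sketch).
    """
    required = ["mkisofs", "eject", "mount", "umount", "volname"]
    # CD media is written with cdrecord, DVD media with growisofs.
    required.append("cdrecord" if media == "cd" else "growisofs")
    return [name for name in required if find_on_path(name) is None]
```
         </pre><p>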
            All of these utilities are common and are easy to find for almost
            any UNIX-like operating system.
         </p></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="cedar-install-source-package"></a>Installing the Source Package</h3></div></div></div><p>
            Python source packages are fairly easy to install.  They are
            distributed as <code class="filename">.tar.gz</code> files which contain
            Python source code, a manifest and an installation script called
            <code class="filename">setup.py</code>.  
         </p><p>
            Once you have downloaded the source package from the Cedar
            Solutions website, <a href="#ftn.cedar-install-foot-software" class="footnoteref"><sup class="footnoteref">[15]</sup></a> untar it:
         </p><pre class="screen">
$ zcat CedarBackup2-2.0.0.tar.gz | tar xvf -
         </pre><p>
            This will create a directory called (in this case)
            <code class="filename">CedarBackup2-2.0.0</code>.  The version number in the
            directory will always match the version number in the filename.
         </p><p>
            If you have root access and want to install the package to the
            <span class="quote">&#8220;<span class="quote">standard</span>&#8221;</span> Python location on your system, then you
            can install the package in two simple steps:
         </p><pre class="screen">
$ cd CedarBackup2-2.0.0
$ python setup.py install
         </pre><p>
            Make sure that you are using Python 2.7 or better to execute
            <code class="filename">setup.py</code>.
         </p><p>
            You may also wish to run the unit tests before actually installing
            anything.  Run them like so:
         </p><pre class="screen">
$ python util/test.py
         </pre><p>
            If any unit test reports a failure on your system, please email me the
            output from the unit test, so I can fix the problem.
            <a href="#ftn.idp59145120" class="footnote" name="idp59145120"><sup class="footnote">[17]</sup></a>
            This is particularly important for non-Linux platforms where I do
            not have a test system available to me.
         </p><p>
            Some users might want to choose a different install location or
            change other install parameters.  To get more information about how
            <code class="filename">setup.py</code> works, use the
            <code class="option">--help</code> option:
         </p><pre class="screen">
$ python setup.py --help
$ python setup.py install --help
         </pre><p>
            In any case, once the package has been installed, you can proceed
            to configuration as described in <a class="xref" href="#cedar-config" title="Chapter 5. Configuration">Chapter 5, <i>Configuration</i></a>.
         </p></div></div><div class="footnotes"><br><hr style="width:100; text-align:left;margin-left: 0"><div id="ftn.cedar-install-foot-software" class="footnote"><p><a href="#cedar-install-foot-software" class="para"><sup class="para">[15] </sup></a>See <a class="ulink" href="http://cedar-solutions.com/debian.html" target="_top">http://cedar-solutions.com/debian.html</a></p></div><div id="ftn.idp59109168" class="footnote"><p><a href="#idp59109168" class="para"><sup class="para">[16] </sup></a>See <a class="ulink" href="http://docs.python.org/lib/module-distutils.html" target="_top">http://docs.python.org/lib/module-distutils.html</a>
         .</p></div><div id="ftn.idp59145120" class="footnote"><p><a href="#idp59145120" class="para"><sup class="para">[17] </sup></a><code class="email">&lt;<a class="email" href="mailto:support@cedar-solutions.com">support@cedar-solutions.com</a>&gt;</code></p></div></div></div><div class="chapter"><div class="titlepage"><div><div><h1 class="title"><a name="cedar-commandline"></a>Chapter 4. Command Line Tools</h1></div></div></div><div class="toc"><p><b>Table of Contents</b></p><dl class="toc"><dt><span class="sect1"><a href="#cedar-commandline-overview">Overview</a></span></dt><dt><span class="sect1"><a href="#cedar-commandline-cback">The <span class="command"><strong>cback</strong></span> command</a></span></dt><dd><dl><dt><span class="sect2"><a href="#cedar-commandline-cback-intro">Introduction</a></span></dt><dt><span class="sect2"><a href="#cedar-commandline-cback-syntax">Syntax</a></span></dt><dt><span class="sect2"><a href="#cedar-commandline-cback-options">Switches</a></span></dt><dt><span class="sect2"><a href="#cedar-commandline-cback-actions">Actions</a></span></dt></dl></dd><dt><span class="sect1"><a href="#cedar-commandline-sync">The <span class="command"><strong>cback-amazons3-sync</strong></span> command</a></span></dt><dd><dl><dt><span class="sect2"><a href="#cedar-commandline-sync-intro">Introduction</a></span></dt><dt><span class="sect2"><a href="#cedar-commandline-sync-syntax">Syntax</a></span></dt><dt><span class="sect2"><a href="#cedar-commandline-sync-options">Switches</a></span></dt></dl></dd><dt><span class="sect1"><a href="#cedar-commandline-cbackspan">The <span class="command"><strong>cback-span</strong></span> command</a></span></dt><dd><dl><dt><span class="sect2"><a href="#cedar-commandline-cbackspan-intro">Introduction</a></span></dt><dt><span class="sect2"><a href="#cedar-commandline-cbackspan-syntax">Syntax</a></span></dt><dt><span class="sect2"><a href="#cedar-commandline-cbackspan-options">Switches</a></span></dt><dt><span 
class="sect2"><a href="#cedar-commandline-cbackspan-using">Using <span class="command"><strong>cback-span</strong></span></a></span></dt><dt><span class="sect2"><a href="#cedar-commandline-cbackspan-sample">Sample run</a></span></dt></dl></dd></dl></div><div class="sect1"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="cedar-commandline-overview"></a>Overview</h2></div></div></div><p>
         Cedar Backup comes with three command-line programs:
         <span class="command"><strong>cback</strong></span>, <span class="command"><strong>cback-amazons3-sync</strong></span>, and
         <span class="command"><strong>cback-span</strong></span>.  
      </p><p>
         The <span class="command"><strong>cback</strong></span> command is the primary command line
         interface and the only Cedar Backup program that most users will ever
         need.  
      </p><p>
         The <span class="command"><strong>cback-amazons3-sync</strong></span> tool is used for
         synchronizing entire directories of files up to an Amazon S3 cloud
         storage bucket, outside of the normal Cedar Backup process. 
      </p><p>
         Users who have a <span class="emphasis"><em>lot</em></span> of data to back up &#8212;
         more than will fit on a single CD or DVD &#8212; can use the
         interactive <span class="command"><strong>cback-span</strong></span> tool to split their data
         between multiple discs.
      </p></div><div class="sect1"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="cedar-commandline-cback"></a>The <span class="command"><strong>cback</strong></span> command</h2></div></div></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="cedar-commandline-cback-intro"></a>Introduction</h3></div></div></div><p>
            Cedar Backup's primary command-line interface is the
            <span class="command"><strong>cback</strong></span> command.  It controls the entire
            backup process.  
         </p></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="cedar-commandline-cback-syntax"></a>Syntax</h3></div></div></div><p>
            The <span class="command"><strong>cback</strong></span> command has the following syntax:
         </p><pre class="screen">
 Usage: cback [switches] action(s)

 The following switches are accepted:

   -h, --help         Display this usage/help listing
   -V, --version      Display version information
   -b, --verbose      Print verbose output as well as logging to disk
   -q, --quiet        Run quietly (display no output to the screen)
   -c, --config       Path to config file (default: /etc/cback.conf)
   -f, --full         Perform a full backup, regardless of configuration
   -M, --managed      Include managed clients when executing actions
   -N, --managed-only Include ONLY managed clients when executing actions
   -l, --logfile      Path to logfile (default: /var/log/cback.log)
   -o, --owner        Logfile ownership, user:group (default: root:adm)
   -m, --mode         Octal logfile permissions mode (default: 640)
   -O, --output       Record some sub-command (i.e. cdrecord) output to the log
   -d, --debug        Write debugging information to the log (implies --output)
   -s, --stack        Dump a Python stack trace instead of swallowing exceptions
   -D, --diagnostics  Print runtime diagnostics to the screen and exit

 The following actions may be specified:

   all                Take all normal actions (collect, stage, store, purge)
   collect            Take the collect action
   stage              Take the stage action
   store              Take the store action
   purge              Take the purge action
   rebuild            Rebuild "this week's" disc if possible
   validate           Validate configuration only
   initialize         Initialize media for use with Cedar Backup

 You may also specify extended actions that have been defined in
 configuration.

 You must specify at least one action to take.  More than one of
 the "collect", "stage", "store" or "purge" actions and/or
 extended actions may be specified in any arbitrary order; they
 will be executed in a sensible order.  The "all", "rebuild",
 "validate", and "initialize" actions may not be combined with
 other actions.
         </pre><p>
            Note that the all action <span class="emphasis"><em>only</em></span> executes the
            standard four actions.  It never executes any of the configured
            extensions.  <a href="#ftn.idp59221312" class="footnote" name="idp59221312"><sup class="footnote">[18]</sup></a>
         </p></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="cedar-commandline-cback-options"></a>Switches</h3></div></div></div><div class="variablelist"><dl class="variablelist"><dt><span class="term"><code class="option">-h</code>, <code class="option">--help</code></span></dt><dd><p>Display usage/help listing.</p></dd><dt><span class="term"><code class="option">-V</code>, <code class="option">--version</code></span></dt><dd><p>Display version information.</p></dd><dt><span class="term"><code class="option">-b</code>, <code class="option">--verbose</code></span></dt><dd><p>
                     Print verbose output to the screen as well as writing to the
                     logfile.  When this option is enabled, most information
                     that would normally be written to the logfile will also be
                     written to the screen.
                  </p></dd><dt><span class="term"><code class="option">-q</code>, <code class="option">--quiet</code></span></dt><dd><p>Run quietly (display no output to the screen).</p></dd><dt><span class="term"><code class="option">-c</code>, <code class="option">--config</code></span></dt><dd><p>
                     Specify the path to an alternate configuration file.
                     The default configuration file is <code class="filename">/etc/cback.conf</code>.
                  </p></dd><dt><span class="term"><code class="option">-f</code>, <code class="option">--full</code></span></dt><dd><p>
                    Perform a full backup, regardless of configuration.  For
                    the collect action, this means that any existing
                    information related to incremental backups will be ignored
                    and rewritten; for the store action, this means that a new
                    disc will be started.
                  </p></dd><dt><span class="term"><code class="option">-M</code>, <code class="option">--managed</code></span></dt><dd><p>
                     Include managed clients when executing actions.  If the
                     action being executed is listed as a managed action for a
                     managed client, execute the action on that client after
                     executing the action locally.
                  </p></dd><dt><span class="term"><code class="option">-N</code>, <code class="option">--managed-only</code></span></dt><dd><p>
                     Include <span class="emphasis"><em>only</em></span> managed clients when
                     executing actions.  If the action being executed is listed
                     as a managed action for a managed client, execute the action
                     on that client &#8212; but <span class="emphasis"><em>do not</em></span>
                     execute the action locally.
                  </p></dd><dt><span class="term"><code class="option">-l</code>, <code class="option">--logfile</code></span></dt><dd><p>
                     Specify the path to an alternate logfile.  The default
                     logfile is <code class="filename">/var/log/cback.log</code>.
                  </p></dd><dt><span class="term"><code class="option">-o</code>, <code class="option">--owner</code></span></dt><dd><p>
                    Specify the ownership of the logfile, in the form
                    <code class="literal">user:group</code>.  The default ownership is
                    <code class="literal">root:adm</code>, to match the Debian standard
                    for most logfiles. This value will only be used when
                    creating a new logfile.  If the logfile already exists when
                    the <span class="command"><strong>cback</strong></span> command is executed, it will
                    retain its existing ownership and mode. Only user and group
                    names may be used, not numeric uid and gid values.
                  </p></dd><dt><span class="term"><code class="option">-m</code>, <code class="option">--mode</code></span></dt><dd><p>
                    Specify the permissions for the logfile, using the
                    numeric mode as in chmod(1).  The default mode is
                    <code class="literal">0640</code> (<code class="literal">-rw-r-----</code>).
                    This value will only be used when creating a new logfile.
                    If the logfile already exists when the
                    <span class="command"><strong>cback</strong></span> command is executed, it will retain
                    its existing ownership and mode.
                  </p></dd><dt><span class="term"><code class="option">-O</code>, <code class="option">--output</code></span></dt><dd><p>
                     Record some sub-command output to the logfile.  When this
                     option is enabled, all output from system commands will be
                     logged.  This might be useful for debugging or just for
                     reference.
                  </p></dd><dt><span class="term"><code class="option">-d</code>, <code class="option">--debug</code></span></dt><dd><p>
                     Write debugging information to the logfile.  This option
                     produces a high volume of output, and would generally only
                     be needed when debugging a problem.  This option implies
                     the <code class="option">--output</code> option, as well.
                  </p></dd><dt><span class="term"><code class="option">-s</code>, <code class="option">--stack</code></span></dt><dd><p>
                     Dump a Python stack trace instead of swallowing
                     exceptions.  This forces Cedar Backup to dump the entire
                     Python stack trace associated with an error, rather than
                     just propagating the last message it received back up to the
                     user interface.  Under some circumstances, this is useful
                     information to include along with a bug report.
                  </p></dd><dt><span class="term"><code class="option">-D</code>, <code class="option">--diagnostics</code></span></dt><dd><p>
                     Display runtime diagnostic information and then exit.
                     This diagnostic information is often useful when filing a
                     bug report.
                  </p></dd></dl></div></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="cedar-commandline-cback-actions"></a>Actions</h3></div></div></div><p>
            You can find more information about the various actions in <a class="xref" href="#cedar-basic-process" title="The Backup Process">the section called &#8220;The Backup Process&#8221;</a> (in <a class="xref" href="#cedar-basic" title="Chapter 2. Basic Concepts">Chapter 2, <i>Basic Concepts</i></a>).
            In general, you may specify any combination of the
            <code class="literal">collect</code>, <code class="literal">stage</code>,
            <code class="literal">store</code> or <code class="literal">purge</code> actions, and
            the specified actions will be executed in a sensible order.  Or,
            you can specify one of the <code class="literal">all</code>,
            <code class="literal">rebuild</code>, <code class="literal">validate</code>, or
            <code class="literal">initialize</code> actions (but these actions may not be
            combined with other actions).
         </p><p>
            If you have configured any Cedar Backup extensions, then the
            actions associated with those extensions may also be specified on
            the command line.  If you specify any other actions along with an
            extended action, the actions will be executed in a sensible order
            per configuration.  The <code class="literal">all</code> action never
            executes extended actions, however.
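         </p><p>
            The combination rules above can be sketched in Python.  This is
            an illustrative sketch only, not Cedar Backup's actual
            implementation: the function name <code class="literal">plan_actions</code>
            is hypothetical, the "sensible order" is assumed to be collect,
            stage, store, purge (the order listed for the
            <code class="literal">all</code> action), and extended actions are
            not modelled.
         </p><pre class="programlisting">
```python
# Hypothetical sketch of the documented combination rules (not Cedar Backup code).
STANDARD_ORDER = ["collect", "stage", "store", "purge"]  # assumed "sensible order"
STANDALONE = {"all", "rebuild", "validate", "initialize"}


def plan_actions(requested):
    """Return the standard actions to run, in order, or raise ValueError."""
    names = set(requested)
    if not names:
        raise ValueError("You must specify at least one action")
    standalone = names.intersection(STANDALONE)
    if standalone and len(names) > 1:
        # all, rebuild, validate and initialize may not be combined with anything.
        raise ValueError("%s may not be combined with other actions" % sorted(standalone)[0])
    if names == {"all"}:
        # "all" expands to the four standard actions and never to extensions.
        return list(STANDARD_ORDER)
    if standalone:
        return list(standalone)
    unknown = names.difference(STANDARD_ORDER)
    if unknown:
        raise ValueError("Extended or unknown actions are not modelled: %s" % sorted(unknown))
    # Any mix of the standard actions is reordered into the assumed sequence.
    return [name for name in STANDARD_ORDER if name in names]
```
</pre><p>
            Under these assumptions,
            <code class="literal">plan_actions(["purge", "collect"])</code> returns
            <code class="literal">["collect", "purge"]</code>, while
            <code class="literal">plan_actions(["all", "collect"])</code> raises
            an error.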
         </p></div></div><div class="sect1"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="cedar-commandline-sync"></a>The <span class="command"><strong>cback-amazons3-sync</strong></span> command</h2></div></div></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="cedar-commandline-sync-intro"></a>Introduction</h3></div></div></div><p>
            The <span class="command"><strong>cback-amazons3-sync</strong></span> tool is used for
            synchronizing entire directories of files up to an Amazon S3 cloud
            storage bucket, outside of the normal Cedar Backup process.  
         </p><p>
            This might be a good option for some types of data, as long as you
            understand the limitations around retrieving previous versions of
            objects that get modified or deleted as part of a sync.  S3 does
            support versioning, but it won't be quite as easy to get at those
            previous versions as with an explicit incremental backup like
            <span class="command"><strong>cback</strong></span> provides.  Cedar Backup does not provide
            any tooling that would help you retrieve previous versions.
         </p><p>
            The underlying functionality relies on the 
            <a class="ulink" href="http://aws.amazon.com/documentation/cli/" target="_top">AWS CLI</a> toolset.
            Before you use this extension, you need to set up your Amazon S3
            account and configure AWS CLI as detailed in Amazon's 
            <a class="ulink" href="http://docs.aws.amazon.com/cli/latest/userguide/cli-chap-getting-set-up.html" target="_top">setup guide</a>.
            The <span class="command"><strong>aws</strong></span> command will be executed as the same user that
            is executing the <span class="command"><strong>cback-amazons3-sync</strong></span> command, so
            make sure you configure it as the proper user.  (This is different
            from the amazons3 extension, which is designed to execute as root
            and switches over to the configured backup user to execute AWS CLI
            commands.)
         </p></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="cedar-commandline-sync-syntax"></a>Syntax</h3></div></div></div><p>
            The <span class="command"><strong>cback-amazons3-sync</strong></span> command has the following syntax:
         </p><pre class="screen">
 Usage: cback-amazons3-sync [switches] sourceDir s3bucketUrl

 Cedar Backup Amazon S3 sync tool.

 This Cedar Backup utility synchronizes a local directory to an Amazon S3
 bucket.  After the sync is complete, a validation step is taken.  An
 error is reported if the contents of the bucket do not match the
 source directory, or if the indicated size for any file differs.
 This tool is a wrapper over the AWS CLI command-line tool.

 The following arguments are required:

   sourceDir            The local source directory on disk (must exist)
   s3BucketUrl          The URL to the target Amazon S3 bucket

 The following switches are accepted:

   -h, --help           Display this usage/help listing
   -V, --version        Display version information
   -b, --verbose        Print verbose output as well as logging to disk
   -q, --quiet          Run quietly (display no output to the screen)
   -l, --logfile        Path to logfile (default: /var/log/cback.log)
   -o, --owner          Logfile ownership, user:group (default: root:adm)
   -m, --mode           Octal logfile permissions mode (default: 640)
   -O, --output         Record some sub-command (i.e. aws) output to the log
   -d, --debug          Write debugging information to the log (implies --output)
   -s, --stack          Dump Python stack trace instead of swallowing exceptions
   -D, --diagnostics    Print runtime diagnostics to the screen and exit
   -v, --verifyOnly     Only verify the S3 bucket contents, do not make changes
   -w, --ignoreWarnings Ignore warnings about problematic filename encodings

 Typical usage would be something like:

   cback-amazons3-sync /home/myuser s3://example.com-backup/myuser

 This will sync the contents of /home/myuser into the indicated bucket.
         </pre></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="cedar-commandline-sync-options"></a>Switches</h3></div></div></div><div class="variablelist"><dl class="variablelist"><dt><span class="term"><code class="option">-h</code>, <code class="option">--help</code></span></dt><dd><p>Display usage/help listing.</p></dd><dt><span class="term"><code class="option">-V</code>, <code class="option">--version</code></span></dt><dd><p>Display version information.</p></dd><dt><span class="term"><code class="option">-b</code>, <code class="option">--verbose</code></span></dt><dd><p>
                     Print verbose output to the screen as well as writing to the
                     logfile.  When this option is enabled, most information
                     that would normally be written to the logfile will also be
                     written to the screen.
                  </p></dd><dt><span class="term"><code class="option">-q</code>, <code class="option">--quiet</code></span></dt><dd><p>Run quietly (display no output to the screen).</p></dd><dt><span class="term"><code class="option">-l</code>, <code class="option">--logfile</code></span></dt><dd><p>
                     Specify the path to an alternate logfile.  The default
                     logfile is <code class="filename">/var/log/cback.log</code>.
                  </p></dd><dt><span class="term"><code class="option">-o</code>, <code class="option">--owner</code></span></dt><dd><p>
                    Specify the ownership of the logfile, in the form
                    <code class="literal">user:group</code>.  The default ownership is
                    <code class="literal">root:adm</code>, to match the Debian standard
                    for most logfiles. This value will only be used when
                    creating a new logfile.  If the logfile already exists when
                    the <span class="command"><strong>cback-amazons3-sync</strong></span> command is
                    executed, it will retain its existing ownership and mode.
                    Only user and group names may be used, not numeric uid and
                    gid values.
                  </p></dd><dt><span class="term"><code class="option">-m</code>, <code class="option">--mode</code></span></dt><dd><p>
                    Specify the permissions for the logfile, using the
                    numeric mode as in chmod(1).  The default mode is
                    <code class="literal">0640</code> (<code class="literal">-rw-r-----</code>).
                    This value will only be used when creating a new logfile.
                    If the logfile already exists when the
                    <span class="command"><strong>cback-amazons3-sync</strong></span> command is executed,
                    it will retain its existing ownership and mode.
                  </p></dd><dt><span class="term"><code class="option">-O</code>, <code class="option">--output</code></span></dt><dd><p>
                     Record some sub-command output to the logfile.  When this
                     option is enabled, all output from system commands will be
                     logged.  This might be useful for debugging or just for
                     reference.  
                  </p></dd><dt><span class="term"><code class="option">-d</code>, <code class="option">--debug</code></span></dt><dd><p>
                     Write debugging information to the logfile.  This option
                     produces a high volume of output, and would generally only
                     be needed when debugging a problem.  This option implies
                     the <code class="option">--output</code> option, as well.
                  </p></dd><dt><span class="term"><code class="option">-s</code>, <code class="option">--stack</code></span></dt><dd><p>
                     Dump a Python stack trace instead of swallowing
                     exceptions.  This forces Cedar Backup to dump the entire
                     Python stack trace associated with an error, rather than
                     just propagating the last message it received back up to the
                     user interface.  Under some circumstances, this is useful
                     information to include along with a bug report.
                  </p></dd><dt><span class="term"><code class="option">-D</code>, <code class="option">--diagnostics</code></span></dt><dd><p>
                     Display runtime diagnostic information and then exit.
                     This diagnostic information is often useful when filing a
                     bug report.
                  </p></dd><dt><span class="term"><code class="option">-v</code>, <code class="option">--verifyOnly</code></span></dt><dd><p>
                     Only verify the S3 bucket contents against the directory
                     on disk.  Do not make any changes to the S3 bucket or
                     transfer any files.  This is intended as a quick check
                     to see whether the sync is up-to-date.
                  </p><p>
                     Although no files are transferred, the tool will still
                     execute the source filename encoding check, discussed
                     below along with <code class="option">--ignoreWarnings</code>.
                  </p></dd><dt><span class="term"><code class="option">-w</code>, <code class="option">--ignoreWarnings</code></span></dt><dd><p>
                     The AWS CLI S3 sync process is very picky about filename
                     encoding.  Files that the Linux filesystem handles with no
                     problems can cause problems in S3 if the filename cannot be
                     encoded properly in your configured locale.  As of this
                     writing, filenames like this will cause the sync process
                     to abort without transferring all files as expected.  
                  </p><p>
                     To avoid confusion, the <span class="command"><strong>cback-amazons3-sync</strong></span> tool
                     tries to guess which files in the source directory will
                     cause problems, and refuses to execute the AWS CLI S3 sync if
                     any problematic files exist.  If you'd rather proceed
                     anyway, use <code class="option">--ignoreWarnings</code>.
                  </p><p>
                     If problematic files are found, then you have
                     two options: either correct your locale (for example, if you have
                     set <code class="literal">LANG=C</code>) or rename the file so it
                     can be encoded properly in your locale. The error messages
                     will tell you the expected encoding (from your locale) and
                     the actual detected encoding for the filename.
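                  </p><p>
                     The kind of check described above can be sketched in
                     Python.  This is a minimal illustration, not the tool's
                     actual detection logic: the function name
                     <code class="literal">find_problem_names</code> is
                     hypothetical, filenames are assumed to be Python 3
                     strings, and the locale's preferred encoding is assumed
                     to be the one that matters.
                  </p><pre class="programlisting">
```python
# Hypothetical sketch: flag filenames that cannot be encoded in a given encoding.
import locale


def find_problem_names(names, encoding=None):
    """Return the names that cannot be encoded in the given encoding."""
    if encoding is None:
        # Assumption: fall back to the encoding implied by the current locale.
        encoding = locale.getpreferredencoding(False)
    problems = []
    for name in names:
        try:
            name.encode(encoding)
        except UnicodeEncodeError:
            # This name would need renaming, or a different locale.
            problems.append(name)
    return problems
```
</pre><p>
                     For example, with an ASCII-only encoding (as implied by
                     <code class="literal">LANG=C</code>), a name containing an
                     accented character is reported, while the same name passes
                     under UTF-8.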
                  </p></dd></dl></div></div></div><div class="sect1"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="cedar-commandline-cbackspan"></a>The <span class="command"><strong>cback-span</strong></span> command</h2></div></div></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="cedar-commandline-cbackspan-intro"></a>Introduction</h3></div></div></div><p>
            Cedar Backup was designed &#8212; and is still primarily focused
            &#8212; around weekly backups to a single CD or DVD.  Most users
            who back up more data than fits on a single disc seem to stop their
            backup process at the stage step, using Cedar Backup as an
            easy way to collect data.  
         </p><p>
            However, some users have expressed a need to write these large
            kinds of backups to disc &#8212; if not every day, then at least
            occasionally.  The <span class="command"><strong>cback-span</strong></span> tool was written
            to meet those needs.  If you have staged more data than fits on a
            single CD or DVD, you can use <span class="command"><strong>cback-span</strong></span> to
            split that data between multiple discs.
         </p><p>
            <span class="command"><strong>cback-span</strong></span> is not a general-purpose
            disc-splitting tool.  It is a specialized program that requires
            Cedar Backup configuration to run.  All it can do is read Cedar
            Backup configuration, find any staging directories that have not
            yet been written to disc, and split the files in those directories
            between discs.
         </p><p>
            <span class="command"><strong>cback-span</strong></span> accepts many of the same command-line
            options as <span class="command"><strong>cback</strong></span>, but <span class="emphasis"><em>must</em></span>
            be run interactively.  It cannot be run from cron.  This is
            intentional.  It is intended to be a useful tool, not a new part of
            the backup process (that is the purpose of an extension).
         </p><p>
            In order to use <span class="command"><strong>cback-span</strong></span>, you must configure
            your backup such that the largest individual backup file can fit on
            a single disc.  <span class="emphasis"><em>The command will not split a single file
            onto more than one disc.</em></span>  All it can do is split large
            directories onto multiple discs.  Files in those directories will
            be arbitrarily split up so that space is utilized most efficiently.
         </p></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="cedar-commandline-cbackspan-syntax"></a>Syntax</h3></div></div></div><p>
            The <span class="command"><strong>cback-span</strong></span> command has the following syntax:
         </p><pre class="screen">
 Usage: cback-span [switches]

 Cedar Backup 'span' tool.

 This Cedar Backup utility spans staged data between multiple discs.
 It is a utility, not an extension, and requires user interaction.

 The following switches are accepted, mostly to set up underlying
 Cedar Backup functionality:

   -h, --help     Display this usage/help listing
   -V, --version  Display version information
   -b, --verbose  Print verbose output as well as logging to disk
   -c, --config   Path to config file (default: /etc/cback.conf)
   -l, --logfile  Path to logfile (default: /var/log/cback.log)
   -o, --owner    Logfile ownership, user:group (default: root:adm)
   -m, --mode     Octal logfile permissions mode (default: 640)
   -O, --output   Record some sub-command (i.e. cdrecord) output to the log
   -d, --debug    Write debugging information to the log (implies --output)
   -s, --stack    Dump a Python stack trace instead of swallowing exceptions
         </pre></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="cedar-commandline-cbackspan-options"></a>Switches</h3></div></div></div><div class="variablelist"><dl class="variablelist"><dt><span class="term"><code class="option">-h</code>, <code class="option">--help</code></span></dt><dd><p>Display usage/help listing.</p></dd><dt><span class="term"><code class="option">-V</code>, <code class="option">--version</code></span></dt><dd><p>Display version information.</p></dd><dt><span class="term"><code class="option">-b</code>, <code class="option">--verbose</code></span></dt><dd><p>
                     Print verbose output to the screen as well as writing to the
                     logfile.  When this option is enabled, most information
                     that would normally be written to the logfile will also be
                     written to the screen.
                  </p></dd><dt><span class="term"><code class="option">-c</code>, <code class="option">--config</code></span></dt><dd><p>
                     Specify the path to an alternate configuration file.
                     The default configuration file is <code class="filename">/etc/cback.conf</code>.
                  </p></dd><dt><span class="term"><code class="option">-l</code>, <code class="option">--logfile</code></span></dt><dd><p>
                     Specify the path to an alternate logfile.  The default
                     logfile is <code class="filename">/var/log/cback.log</code>.
                  </p></dd><dt><span class="term"><code class="option">-o</code>, <code class="option">--owner</code></span></dt><dd><p>
                    Specify the ownership of the logfile, in the form
                    <code class="literal">user:group</code>.  The default ownership is
                    <code class="literal">root:adm</code>, to match the Debian standard
                    for most logfiles. This value will only be used when
                    creating a new logfile.  If the logfile already exists when
                    the <span class="command"><strong>cback-span</strong></span> command is executed, it will
                    retain its existing ownership and mode. Only user and group
                    names may be used, not numeric uid and gid values.
                  </p></dd><dt><span class="term"><code class="option">-m</code>, <code class="option">--mode</code></span></dt><dd><p>
                    Specify the permissions for the logfile, using the
                    numeric mode as in chmod(1).  The default mode is
                    <code class="literal">0640</code> (<code class="literal">-rw-r-----</code>).
                    This value will only be used when creating a new logfile.
                    If the logfile already exists when the
                    <span class="command"><strong>cback-span</strong></span> command is executed, it will retain
                    its existing ownership and mode.
                  </p></dd><dt><span class="term"><code class="option">-O</code>, <code class="option">--output</code></span></dt><dd><p>
                     Record some sub-command output to the logfile.  When this
                     option is enabled, all output from system commands will be
                     logged.  This might be useful for debugging or just for
                     reference.  Cedar Backup uses system commands mostly for
                     dealing with the CD/DVD recorder and its media.
                  </p></dd><dt><span class="term"><code class="option">-d</code>, <code class="option">--debug</code></span></dt><dd><p>
                     Write debugging information to the logfile.  This option
                     produces a high volume of output, and would generally only
                     be needed when debugging a problem.  This option implies
                     the <code class="option">--output</code> option, as well.
                  </p></dd><dt><span class="term"><code class="option">-s</code>, <code class="option">--stack</code></span></dt><dd><p>
                     Dump a Python stack trace instead of swallowing
                     exceptions.  This forces Cedar Backup to dump the entire
                     Python stack trace associated with an error, rather than
                     just propagating the last message it received back up to the
                     user interface.  Under some circumstances, this is useful
                     information to include along with a bug report.
                  </p></dd></dl></div></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="cedar-commandline-cbackspan-using"></a>Using <span class="command"><strong>cback-span</strong></span></h3></div></div></div><p>
            As discussed above, <span class="command"><strong>cback-span</strong></span> is an
            interactive command.  It cannot be run from cron.  
         </p><p>
            You can typically use the default answer for most questions.
            The only two questions that you may not want the default answer
            for are the fit algorithm and the cushion percentage.  
         </p><p>
            The cushion percentage is used by <span class="command"><strong>cback-span</strong></span> to
            determine what capacity to shoot for when splitting up your staging
            directories.  A 650 MB disc does not hold a full 650 MB of data.
            It's usually more like 627 MB of data.  The cushion percentage
            tells <span class="command"><strong>cback-span</strong></span> how much overhead to reserve
            for the filesystem.  The default of 4% is usually OK, but if you
            have problems you may need to increase it slightly.
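         </p><p>
            As a rough sketch of this arithmetic (not the tool's actual
            code), the cushion can be read as a simple percentage taken off
            the nominal media capacity, which is how the interactive prompt
            describes it.  The function names below are hypothetical, and the
            figures that <span class="command"><strong>cback-span</strong></span>
            itself reports may differ slightly from this simplified formula.
         </p><pre class="programlisting">
```python
import math


def usable_capacity(media_mb, cushion_percent):
    """Return the capacity left for data once the cushion is set aside."""
    return media_mb * (1.0 - cushion_percent / 100.0)


def minimum_discs(total_mb, media_mb, cushion_percent):
    """Return the smallest number of discs that could hold total_mb."""
    return math.ceil(total_mb / usable_capacity(media_mb, cushion_percent))
```
         </pre><p>
            With these definitions, a larger cushion lowers the usable
            capacity and can therefore raise the minimum number of discs.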
         </p><p>
            The fit algorithm tells <span class="command"><strong>cback-span</strong></span> how it
            should determine which items should be placed on each disc.  
            If you don't like the result from one algorithm, you can reject
            that solution and choose a different algorithm.
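         </p><p>
            The sketch below illustrates the general idea for a single disc,
            using the four orderings described in this section.  It is a
            simplified illustration, not Cedar Backup's own implementation:
            the <code class="literal">fit</code> function, its arguments and
            the choice to start the alternate ordering from the small end are
            all assumptions made for the example.
         </p><pre class="programlisting">
```python
from collections import deque


def fit(sizes, capacity, algorithm="worst"):
    """Choose item names whose total size stays within capacity.

    sizes maps an item name to its size.  Returns (chosen_names, used).
    """
    if algorithm == "first":
        order = list(sizes)                         # unsorted: insertion order
    elif algorithm == "worst":
        order = sorted(sizes, key=sizes.get)        # smallest to largest
    elif algorithm == "best":
        order = sorted(sizes, key=sizes.get, reverse=True)  # largest first
    elif algorithm == "alternate":
        # Alternate between the small end and the large end of a sorted list.
        pending = deque(sorted(sizes, key=sizes.get))
        order = []
        while pending:
            order.append(pending.popleft())
            if pending:
                order.append(pending.pop())
    else:
        raise ValueError("unknown algorithm: " + algorithm)

    chosen, used = [], 0
    for name in order:
        if used == capacity:
            break                                   # capacity met exactly
        if used + sizes[name] > capacity:
            continue                                # item does not fit: skip it
        chosen.append(name)
        used += sizes[name]
    return chosen, used
```
         </pre><p>
            Calling such a function again on the items left over from the
            previous disc is one plausible way to spread a large data set
            across several discs.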
         </p><p>
            The four available fit algorithms are:
         </p><div class="variablelist"><dl class="variablelist"><dt><span class="term">worst</span></dt><dd><p>
                     The <em class="firstterm">worst-fit</em> algorithm.
                  </p><p>
                     The worst-fit algorithm proceeds through a sorted list
                     of items (sorted from smallest to largest) until running
                     out of items or meeting capacity exactly.  If capacity is
                     exceeded, the item that caused capacity to be exceeded is
                     thrown away and the next one is tried.  The algorithm
                     effectively includes the maximum number of items possible
                     in its search for optimal capacity utilization.  It tends
                     to be somewhat slower than either the best-fit or
                     alternate-fit algorithm, probably because on average it
                     has to look at more items before completing.
                  </p></dd><dt><span class="term">best</span></dt><dd><p>
                     The <em class="firstterm">best-fit</em> algorithm.
                  </p><p>
                     The best-fit algorithm proceeds through a sorted list of
                     items (sorted from largest to smallest) until running out
                     of items or meeting capacity exactly.  If capacity is
                     exceeded, the item that caused capacity to be exceeded is
                     thrown away and the next one is tried.  The algorithm
                     effectively includes the minimum number of items possible
                     in its search for optimal capacity utilization.  For large
                     lists of mixed-size items, it's not unusual to see the
                     algorithm achieve 100% capacity utilization by including
                     fewer than 1% of the items.  Probably because it often has
                     to look at fewer of the items before completing, it tends
                     to be a little faster than the worst-fit or alternate-fit
                     algorithms.
                  </p></dd><dt><span class="term">first</span></dt><dd><p>
                     The <em class="firstterm">first-fit</em> algorithm. 
                  </p><p>
                     The first-fit algorithm proceeds through an unsorted list
                     of items until running out of items or meeting capacity
                     exactly.  If capacity is exceeded, the item that caused
                     capacity to be exceeded is thrown away and the next one is
                     tried.  This algorithm generally performs more poorly than
                     the other algorithms both in terms of capacity utilization
                     and item utilization, but can be as much as an order of
                     magnitude faster on large lists of items because it
                     doesn't require any sorting.
                  </p></dd><dt><span class="term">alternate</span></dt><dd><p>
                     A hybrid algorithm that I call
                     <em class="firstterm">alternate-fit</em>.  
                  </p><p>
                     This algorithm tries to balance small and large items to
                     achieve better end-of-disc performance.  Instead of just
                     working in one direction through a list, it alternately works
                     from the start and end of a sorted list (sorted from
                     smallest to largest), throwing away any item which causes
                     capacity to be exceeded.  The algorithm tends to be slower
                     than the best-fit and first-fit algorithms, and slightly
                     faster than the worst-fit algorithm, probably because of
                     the number of items it considers on average before
                     completing.  It often achieves slightly better capacity
                     utilization than the worst-fit algorithm, while including
                     slightly fewer items.
                  </p></dd></dl></div></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="cedar-commandline-cbackspan-sample"></a>Sample run</h3></div></div></div><p>
            Below is a log showing a sample <span class="command"><strong>cback-span</strong></span> run.
         </p><pre class="screen">
================================================
           Cedar Backup 'span' tool
================================================

This the Cedar Backup span tool.  It is used to split up staging
data when that staging data does not fit onto a single disc.

This utility operates using Cedar Backup configuration.  Configuration
specifies which staging directory to look at and which writer device
and media type to use.

Continue? [Y/n]: 
===

Cedar Backup store configuration looks like this:

   Source Directory...: /tmp/staging
   Media Type.........: cdrw-74
   Device Type........: cdwriter
   Device Path........: /dev/cdrom
   Device SCSI ID.....: None
   Drive Speed........: None
   Check Data Flag....: True
   No Eject Flag......: False

Is this OK? [Y/n]: 
===

Please wait, indexing the source directory (this may take a while)...
===

The following daily staging directories have not yet been written to disc:

   /tmp/staging/2007/02/07
   /tmp/staging/2007/02/08
   /tmp/staging/2007/02/09
   /tmp/staging/2007/02/10
   /tmp/staging/2007/02/11
   /tmp/staging/2007/02/12
   /tmp/staging/2007/02/13
   /tmp/staging/2007/02/14

The total size of the data in these directories is 1.00 GB.

Continue? [Y/n]: 
===

Based on configuration, the capacity of your media is 650.00 MB.

Since estimates are not perfect and there is some uncertainly in
media capacity calculations, it is good to have a "cushion",
a percentage of capacity to set aside.  The cushion reduces the
capacity of your media, so a 1.5% cushion leaves 98.5% remaining.

What cushion percentage? [4.00]: 
===

The real capacity, taking into account the 4.00% cushion, is 627.25 MB.
It will take at least 2 disc(s) to store your 1.00 GB of data.

Continue? [Y/n]: 
===

Which algorithm do you want to use to span your data across
multiple discs?

The following algorithms are available:

   first....: The "first-fit" algorithm
   best.....: The "best-fit" algorithm
   worst....: The "worst-fit" algorithm
   alternate: The "alternate-fit" algorithm

If you don't like the results you will have a chance to try a
different one later.

Which algorithm? [worst]: 
===

Please wait, generating file lists (this may take a while)...
===

Using the "worst-fit" algorithm, Cedar Backup can split your data
into 2 discs.

Disc 1: 246 files, 615.97 MB, 98.20% utilization
Disc 2: 8 files, 412.96 MB, 65.84% utilization

Accept this solution? [Y/n]: n
===

Which algorithm do you want to use to span your data across
multiple discs?

The following algorithms are available:

   first....: The "first-fit" algorithm
   best.....: The "best-fit" algorithm
   worst....: The "worst-fit" algorithm
   alternate: The "alternate-fit" algorithm

If you don't like the results you will have a chance to try a
different one later.

Which algorithm? [worst]: alternate
===

Please wait, generating file lists (this may take a while)...
===

Using the "alternate-fit" algorithm, Cedar Backup can split your data
into 2 discs.

Disc 1: 73 files, 627.25 MB, 100.00% utilization
Disc 2: 181 files, 401.68 MB, 64.04% utilization

Accept this solution? [Y/n]: y
===

Please place the first disc in your backup device.
Press return when ready.
===

Initializing image...
Writing image to disc...
         </pre></div></div><div class="footnotes"><br><hr style="width:100; text-align:left;margin-left: 0"><div id="ftn.idp59221312" class="footnote"><p><a href="#idp59221312" class="para"><sup class="para">[18] </sup></a>Some users find this surprising,
            because extensions are configured with sequence numbers.  I did it
            this way because I felt that running extensions as part of the all
            action would sometimes result in <span class="quote">&#8220;<span class="quote">surprising</span>&#8221;</span>
            behavior.  Better to be definitive than confusing.</p></div></div></div><div class="chapter"><div class="titlepage"><div><div><h1 class="title"><a name="cedar-config"></a>Chapter 5. Configuration</h1></div></div></div><div class="toc"><p><b>Table of Contents</b></p><dl class="toc"><dt><span class="sect1"><a href="#cedar-config-overview">Overview</a></span></dt><dt><span class="sect1"><a href="#cedar-config-configfile">Configuration File Format</a></span></dt><dd><dl><dt><span class="sect2"><a href="#cedar-config-configfile-sample">Sample Configuration File</a></span></dt><dt><span class="sect2"><a href="#cedar-config-configfile-reference">Reference Configuration</a></span></dt><dt><span class="sect2"><a href="#cedar-config-configfile-options">Options Configuration</a></span></dt><dt><span class="sect2"><a href="#cedar-config-configfile-peers">Peers Configuration</a></span></dt><dt><span class="sect2"><a href="#cedar-config-configfile-collect">Collect Configuration</a></span></dt><dt><span class="sect2"><a href="#cedar-config-configfile-stage">Stage Configuration</a></span></dt><dt><span class="sect2"><a href="#cedar-config-configfile-store">Store Configuration</a></span></dt><dt><span class="sect2"><a href="#cedar-config-configfile-purge">Purge Configuration</a></span></dt><dt><span class="sect2"><a href="#cedar-config-configfile-extensions">Extensions Configuration</a></span></dt></dl></dd><dt><span class="sect1"><a href="#cedar-config-poolofone">Setting up a Pool of One</a></span></dt><dd><dl><dt><span class="sect2"><a href="#idp60200128">Step 1: Decide when you will run your backup.</a></span></dt><dt><span class="sect2"><a href="#idp60205360">Step 2: Make sure email works.</a></span></dt><dt><span class="sect2"><a href="#idp60208656">Step 3: Configure your writer device.</a></span></dt><dt><span class="sect2"><a href="#idp60216656">Step 4: Configure your backup user.</a></span></dt><dt><span class="sect2"><a href="#idp60221824">Step 5: Create your 
backup tree.</a></span></dt><dt><span class="sect2"><a href="#idp60231984">Step 6: Create the Cedar Backup configuration file.</a></span></dt><dt><span class="sect2"><a href="#idp60238336">Step 7: Validate the Cedar Backup configuration file.</a></span></dt><dt><span class="sect2"><a href="#idp60242016">Step 8: Test your backup.</a></span></dt><dt><span class="sect2"><a href="#idp60248384">Step 9: Modify the backup cron jobs.</a></span></dt></dl></dd><dt><span class="sect1"><a href="#cedar-config-client">Setting up a Client Peer Node</a></span></dt><dd><dl><dt><span class="sect2"><a href="#idp60264304">Step 1: Decide when you will run your backup.</a></span></dt><dt><span class="sect2"><a href="#idp60269536">Step 2: Make sure email works.</a></span></dt><dt><span class="sect2"><a href="#idp59575344">Step 3: Configure the master in your backup pool.</a></span></dt><dt><span class="sect2"><a href="#idp59579776">Step 4: Configure your backup user.</a></span></dt><dt><span class="sect2"><a href="#idp60301904">Step 5: Create your backup tree.</a></span></dt><dt><span class="sect2"><a href="#idp60310656">Step 6: Create the Cedar Backup configuration file.</a></span></dt><dt><span class="sect2"><a href="#idp60317008">Step 7: Validate the Cedar Backup configuration file.</a></span></dt><dt><span class="sect2"><a href="#idp60321184">Step 8: Test your backup.</a></span></dt><dt><span class="sect2"><a href="#idp60324128">Step 9: Modify the backup cron jobs.</a></span></dt></dl></dd><dt><span class="sect1"><a href="#cedar-config-master">Setting up a Master Peer Node</a></span></dt><dd><dl><dt><span class="sect2"><a href="#idp60341344">Step 1: Decide when you will run your backup.</a></span></dt><dt><span class="sect2"><a href="#idp60347152">Step 2: Make sure email works.</a></span></dt><dt><span class="sect2"><a href="#idp60350016">Step 3: Configure your writer device.</a></span></dt><dt><span class="sect2"><a href="#idp60358016">Step 4: Configure your backup 
user.</a></span></dt><dt><span class="sect2"><a href="#idp60372064">Step 5: Create your backup tree.</a></span></dt><dt><span class="sect2"><a href="#idp60381504">Step 6: Create the Cedar Backup configuration file.</a></span></dt><dt><span class="sect2"><a href="#idp60390640">Step 7: Validate the Cedar Backup configuration file.</a></span></dt><dt><span class="sect2"><a href="#idp60394880">Step 8: Test connectivity to client machines.</a></span></dt><dt><span class="sect2"><a href="#idp60399696">Step 9: Test your backup.</a></span></dt><dt><span class="sect2"><a href="#idp60407776">Step 10: Modify the backup cron jobs.</a></span></dt></dl></dd><dt><span class="sect1"><a href="#cedar-config-writer">Configuring your Writer Device</a></span></dt><dd><dl><dt><span class="sect2"><a href="#idp60419328">Device Types</a></span></dt><dt><span class="sect2"><a href="#idp60421696">Devices identified by device name</a></span></dt><dt><span class="sect2"><a href="#idp60422768">Devices identified by SCSI id</a></span></dt><dt><span class="sect2"><a href="#idp60431648">Linux Notes</a></span></dt><dt><span class="sect2"><a href="#idp60437936">Finding your Linux CD Writer</a></span></dt><dt><span class="sect2"><a href="#idp60448992">Mac OS X Notes</a></span></dt></dl></dd><dt><span class="sect1"><a href="#cedar-config-blanking">Optimized Blanking Strategy</a></span></dt></dl></div><div class="sect1"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="cedar-config-overview"></a>Overview</h2></div></div></div><p>
         Configuring Cedar Backup is unfortunately
         somewhat complicated.  The good news is that once you get through the
         initial configuration process, you'll hardly ever have to change
         anything. Even better, the most typical changes (e.g. adding
         and removing directories from a backup) are easy.
      </p><p>
         First, familiarize yourself with the concepts in 
         <a class="xref" href="#cedar-basic" title="Chapter 2. Basic Concepts">Chapter 2, <i>Basic Concepts</i></a>.  In particular, be sure that you understand
         the differences between a master and a client.  (If you only have one
         machine, then your machine will act as both a master and a client,
         and we'll refer to your setup as a <em class="firstterm">pool of one</em>.)
         Then, install Cedar Backup per the instructions in 
         <a class="xref" href="#cedar-install" title="Chapter 3. Installation">Chapter 3, <i>Installation</i></a>.
      </p><p>
         Once everything has been installed, you are ready to begin configuring
         Cedar Backup.  Look over <a class="xref" href="#cedar-commandline-cback" title="The cback command">the section called &#8220;The <span class="command"><strong>cback</strong></span> command&#8221;</a> (in
         <a class="xref" href="#cedar-commandline" title="Chapter 4. Command Line Tools">Chapter 4, <i>Command Line Tools</i></a>) to become familiar with the
         command line interface.  Then, look over <a class="xref" href="#cedar-config-configfile" title="Configuration File Format">the section called &#8220;Configuration File Format&#8221;</a> (below) and create a configuration
         file for each peer in your backup pool.  To start with, create a very
         simple configuration file, then expand it later.  Decide now whether
         you will store the configuration file in the standard place
         (<code class="filename">/etc/cback.conf</code>) or in some other location.
      </p><p>
         After you have all of the configuration files in place, configure each
         of your machines, following the instructions in the appropriate
         section below (for master, client or pool of one).  Since the master
         and client(s) must communicate over the network, you won't be able to
         fully configure the master without configuring each client and
         vice-versa.  The instructions are clear on what needs to be done.
      </p><div class="sidebar"><div class="titlepage"><div><div><p class="title"><b>Which Platform?</b></p></div></div></div><p>
            Cedar Backup has been designed for use on all UNIX-like systems.
            However, since it was developed on a Debian GNU/Linux system, and
            because I am a Debian developer, the packaging is prettier and the
            setup is somewhat simpler on a Debian system than on a system where
            you install from source.  
         </p><p>
            The configuration instructions below have been generalized so they
            should work well regardless of what platform you are running (e.g.
            RedHat, Gentoo or FreeBSD).  If instructions vary for a
            particular platform, you will find a note related to that
            platform.
         </p><p>
            I am always open to adding more platform-specific hints and notes,
            so write me if you find problems with these instructions.
         </p></div></div><div class="sect1"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="cedar-config-configfile"></a>Configuration File Format</h2></div></div></div><p>
         Cedar Backup is configured through an XML <a href="#ftn.idp59626560" class="footnote" name="idp59626560"><sup class="footnote">[19]</sup></a> configuration file,
         usually called <code class="filename">/etc/cback.conf</code>.  The configuration
         file contains the following sections: <em class="firstterm">reference</em>,
         <em class="firstterm">options</em>, <em class="firstterm">peers</em>, <em class="firstterm">collect</em>,
         <em class="firstterm">stage</em>, <em class="firstterm">store</em>,
         <em class="firstterm">purge</em> and <em class="firstterm">extensions</em>.
      </p><p>
         All configuration files must contain the two general configuration
         sections, the reference section and the options section.  Besides
         that, administrators need only configure actions they intend to use.
         For instance, on a client machine, administrators will generally only
         configure the collect and purge sections, while on a master machine
         they will have to configure all four action-related sections.
         <a href="#ftn.idp59631744" class="footnote" name="idp59631744"><sup class="footnote">[20]</sup></a> The extensions section is
         always optional and can be omitted unless extensions are in use.
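      </p><p>
         A quick way to check that a file contains both required sections is
         sketched below, using Python's standard
         <code class="literal">xml.etree.ElementTree</code> module.  The helper
         name <code class="literal">missing_sections</code> is hypothetical, and
         the check assumes the <code class="literal">cb_config</code> root
         element shown in the sample configuration file.  It is only an
         illustration and is no substitute for Cedar Backup's own validation.
      </p><pre class="programlisting">
```python
import xml.etree.ElementTree as ET

# The two general sections that every configuration file must contain.
REQUIRED_SECTIONS = ("reference", "options")


def missing_sections(xml_text):
    """Return the required top-level sections absent from the document."""
    root = ET.fromstring(xml_text)
    if root.tag != "cb_config":
        raise ValueError("root element is not cb_config")
    present = {child.tag for child in root}
    return [name for name in REQUIRED_SECTIONS if name not in present]
```
      </pre><p>
         An empty list means both general sections are present; the
         action-related sections still need to be reviewed by hand.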
      </p><div class="note" style="margin-left: 0.5in; margin-right: 0.5in;"><h3 class="title">Note</h3><p>
            Even though the Mac OS X (darwin) filesystem is
            <span class="emphasis"><em>not</em></span> case-sensitive, Cedar Backup configuration
            <span class="emphasis"><em>is</em></span> generally case-sensitive on that platform,
            just like on all other platforms.  For instance, even though the
            files <span class="quote">&#8220;<span class="quote">Ken</span>&#8221;</span> and <span class="quote">&#8220;<span class="quote">ken</span>&#8221;</span> might be the same on the
            Mac OS X filesystem, an exclusion in Cedar Backup configuration for
            <span class="quote">&#8220;<span class="quote">ken</span>&#8221;</span> will only match the file if it is actually on
            the filesystem with a lower-case <span class="quote">&#8220;<span class="quote">k</span>&#8221;</span> as its first
            letter.  This won't surprise the typical UNIX user, but might
            surprise someone who's gotten into the <span class="quote">&#8220;<span class="quote">Mac Mindset</span>&#8221;</span>.
         </p></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="cedar-config-configfile-sample"></a>Sample Configuration File</h3></div></div></div><p>
            Both the Python source distribution and the Debian package come with a
            sample configuration file.  The Debian package includes its sample in
            <code class="filename">/usr/share/doc/cedar-backup2/examples/cback.conf.sample</code>.
         </p><p>
            This is a sample configuration file similar to the one provided in the
            source package.  Documentation below provides more information about
            each of the individual configuration sections.
         </p><pre class="programlisting">
&lt;?xml version="1.0"?&gt;
&lt;cb_config&gt;
   &lt;reference&gt;
      &lt;author&gt;Kenneth J. Pronovici&lt;/author&gt;
      &lt;revision&gt;1.3&lt;/revision&gt;
      &lt;description&gt;Sample&lt;/description&gt;
   &lt;/reference&gt;
   &lt;options&gt;
      &lt;starting_day&gt;tuesday&lt;/starting_day&gt;
      &lt;working_dir&gt;/opt/backup/tmp&lt;/working_dir&gt;
      &lt;backup_user&gt;backup&lt;/backup_user&gt;
      &lt;backup_group&gt;group&lt;/backup_group&gt;
      &lt;rcp_command&gt;/usr/bin/scp -B&lt;/rcp_command&gt;
   &lt;/options&gt;
   &lt;peers&gt;
      &lt;peer&gt;
         &lt;name&gt;debian&lt;/name&gt;
         &lt;type&gt;local&lt;/type&gt;
         &lt;collect_dir&gt;/opt/backup/collect&lt;/collect_dir&gt;
      &lt;/peer&gt;
   &lt;/peers&gt;
   &lt;collect&gt;
      &lt;collect_dir&gt;/opt/backup/collect&lt;/collect_dir&gt;
      &lt;collect_mode&gt;daily&lt;/collect_mode&gt;
      &lt;archive_mode&gt;targz&lt;/archive_mode&gt;
      &lt;ignore_file&gt;.cbignore&lt;/ignore_file&gt;
      &lt;dir&gt;
         &lt;abs_path&gt;/etc&lt;/abs_path&gt;
         &lt;collect_mode&gt;incr&lt;/collect_mode&gt;
      &lt;/dir&gt;
      &lt;file&gt;
         &lt;abs_path&gt;/home/root/.profile&lt;/abs_path&gt;
         &lt;collect_mode&gt;weekly&lt;/collect_mode&gt;
      &lt;/file&gt;
   &lt;/collect&gt;
   &lt;stage&gt;
      &lt;staging_dir&gt;/opt/backup/staging&lt;/staging_dir&gt;
   &lt;/stage&gt;
   &lt;store&gt;
      &lt;source_dir&gt;/opt/backup/staging&lt;/source_dir&gt;
      &lt;media_type&gt;cdrw-74&lt;/media_type&gt;
      &lt;device_type&gt;cdwriter&lt;/device_type&gt;
      &lt;target_device&gt;/dev/cdrw&lt;/target_device&gt;
      &lt;target_scsi_id&gt;0,0,0&lt;/target_scsi_id&gt;
      &lt;drive_speed&gt;4&lt;/drive_speed&gt;
      &lt;check_data&gt;Y&lt;/check_data&gt;
      &lt;check_media&gt;Y&lt;/check_media&gt;
      &lt;warn_midnite&gt;Y&lt;/warn_midnite&gt;
   &lt;/store&gt;
   &lt;purge&gt;
      &lt;dir&gt;
         &lt;abs_path&gt;/opt/backup/staging&lt;/abs_path&gt;
         &lt;retain_days&gt;7&lt;/retain_days&gt;
      &lt;/dir&gt;
      &lt;dir&gt;
         &lt;abs_path&gt;/opt/backup/collect&lt;/abs_path&gt;
         &lt;retain_days&gt;0&lt;/retain_days&gt;
      &lt;/dir&gt;
   &lt;/purge&gt;
&lt;/cb_config&gt;
         </pre></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="cedar-config-configfile-reference"></a>Reference Configuration</h3></div></div></div><p>
            The reference configuration section contains free-text elements
            that exist only for reference.  The section itself is required,
            but the individual elements may be left blank if desired.
         </p><p>
            This is an example reference configuration section:
         </p><pre class="programlisting">
&lt;reference&gt;
   &lt;author&gt;Kenneth J. Pronovici&lt;/author&gt;
   &lt;revision&gt;Revision 1.3&lt;/revision&gt;
   &lt;description&gt;Sample&lt;/description&gt;
   &lt;generator&gt;Yet to be Written Config Tool (tm)&lt;/generator&gt;
&lt;/reference&gt;
         </pre><p>
            The following elements are part of the reference configuration section:
         </p><div class="variablelist"><dl class="variablelist"><dt><span class="term"><code class="literal">author</code></span></dt><dd><p>Author of the configuration file.</p><p>
                     <span class="emphasis"><em>Restrictions:</em></span> None
                  </p></dd><dt><span class="term"><code class="literal">revision</code></span></dt><dd><p>Revision of the configuration file.</p><p>
                     <span class="emphasis"><em>Restrictions:</em></span> None
                  </p></dd><dt><span class="term"><code class="literal">description</code></span></dt><dd><p>Description of the configuration file.</p><p>
                     <span class="emphasis"><em>Restrictions:</em></span> None
                  </p></dd><dt><span class="term"><code class="literal">generator</code></span></dt><dd><p>Tool that generated the configuration file, if any.</p><p>
                     <span class="emphasis"><em>Restrictions:</em></span> None
                  </p></dd></dl></div></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="cedar-config-configfile-options"></a>Options Configuration</h3></div></div></div><p>
            The options configuration section contains configuration options
            that are not specific to any one action.  
         </p><p>
            This is an example options configuration section:
         </p><pre class="programlisting">
&lt;options&gt;
   &lt;starting_day&gt;tuesday&lt;/starting_day&gt;
   &lt;working_dir&gt;/opt/backup/tmp&lt;/working_dir&gt;
   &lt;backup_user&gt;backup&lt;/backup_user&gt;
   &lt;backup_group&gt;backup&lt;/backup_group&gt;
   &lt;rcp_command&gt;/usr/bin/scp -B&lt;/rcp_command&gt;
   &lt;rsh_command&gt;/usr/bin/ssh&lt;/rsh_command&gt;
   &lt;cback_command&gt;/usr/bin/cback&lt;/cback_command&gt;
   &lt;managed_actions&gt;collect, purge&lt;/managed_actions&gt;
   &lt;override&gt;
      &lt;command&gt;cdrecord&lt;/command&gt;
      &lt;abs_path&gt;/opt/local/bin/cdrecord&lt;/abs_path&gt;
   &lt;/override&gt;
   &lt;override&gt;
      &lt;command&gt;mkisofs&lt;/command&gt;
      &lt;abs_path&gt;/opt/local/bin/mkisofs&lt;/abs_path&gt;
   &lt;/override&gt;
   &lt;pre_action_hook&gt;
      &lt;action&gt;collect&lt;/action&gt;
      &lt;command&gt;echo "I AM A PRE-ACTION HOOK RELATED TO COLLECT"&lt;/command&gt;
   &lt;/pre_action_hook&gt;
   &lt;post_action_hook&gt;
      &lt;action&gt;collect&lt;/action&gt;
      &lt;command&gt;echo "I AM A POST-ACTION HOOK RELATED TO COLLECT"&lt;/command&gt;
   &lt;/post_action_hook&gt;
&lt;/options&gt;
         </pre><p>
            The following elements are part of the options configuration section:
         </p><div class="variablelist"><dl class="variablelist"><dt><span class="term"><code class="literal">starting_day</code></span></dt><dd><p>Day that starts the week.</p><p>
                     Cedar Backup is built around the idea of weekly backups.
                     The starting day of week is the day that media will be
                     rebuilt from scratch and that incremental backup
                     information will be cleared.
                  </p><p>
                     <span class="emphasis"><em>Restrictions:</em></span> Must be a day of the
                     week in English, e.g. <code class="literal">monday</code> or
                     <code class="literal">tuesday</code>.  The validation is
                     case-sensitive.
                  </p></dd><dt><span class="term"><code class="literal">working_dir</code></span></dt><dd><p>Working (temporary) directory to use for backups.</p><p>
                     This directory is used for writing temporary files, such
                     as tar files or ISO filesystem images as they are being built.  It
                     is also used to store day-to-day information about
                     incremental backups.
                  </p><p>
                     The working directory should contain enough free space to
                     hold temporary tar files (on a client) or to build an ISO
                     filesystem image (on a master).
                  </p><p>
                     <span class="emphasis"><em>Restrictions:</em></span> Must be an absolute path
                  </p></dd><dt><span class="term"><code class="literal">backup_user</code></span></dt><dd><p>Effective user that backups should run as.</p><p>
                     This user must exist on the machine which is being
                     configured and should not be root (although that
                     restriction is not enforced).
                  </p><p>
                     This value is also used as the default remote backup user
                     for remote peers.
                  </p><p>
                     <span class="emphasis"><em>Restrictions:</em></span> Must be non-empty
                  </p></dd><dt><span class="term"><code class="literal">backup_group</code></span></dt><dd><p>Effective group that backups should run as.</p><p>
                     This group must exist on the machine which is being
                     configured, and should not be root or some other
                     <span class="quote">&#8220;<span class="quote">powerful</span>&#8221;</span> group (although that restriction
                     is not enforced).
                  </p><p>
                     <span class="emphasis"><em>Restrictions:</em></span> Must be non-empty
                  </p></dd><dt><span class="term"><code class="literal">rcp_command</code></span></dt><dd><p>Default rcp-compatible copy command for staging.</p><p>
                     The rcp command should be the exact command used for
                     remote copies, including any required options.  If you are
                     using <span class="command"><strong>scp</strong></span>, you should pass it the
                     <code class="option">-B</code> option, so <span class="command"><strong>scp</strong></span> will
                     not ask for any user input (which could hang the backup).
                     A common example is something like <span class="command"><strong>/usr/bin/scp
                     -B</strong></span>.
                  </p><p>
                     This value is used as the default value for all remote
                     peers.  Technically, this value is not needed by clients,
                     but we require it for all config files anyway.
                  </p><p>
                     <span class="emphasis"><em>Restrictions:</em></span> Must be non-empty
                  </p></dd><dt><span class="term"><code class="literal">rsh_command</code></span></dt><dd><p>Default rsh-compatible command to use for remote shells.</p><p>
                     The rsh command should be the exact command used for
                     remote shells, including any required options.  
                  </p><p>
                     This value is used as the default value for all managed
                     clients.  It is optional, because it is only used when
                     executing actions on managed clients.  However, each
                     managed client must either be able to read the value from
                     options configuration or must set the value explicitly.
                  </p><p>
                     <span class="emphasis"><em>Restrictions:</em></span> Must be non-empty
                  </p></dd><dt><span class="term"><code class="literal">cback_command</code></span></dt><dd><p>Default cback-compatible command to use on managed remote clients.</p><p>
                     The cback command should be the exact command used for
                     executing <span class="command"><strong>cback</strong></span> on a remote managed
                     client, including any required command-line options.  Do
                     <span class="emphasis"><em>not</em></span> list any actions in the command
                     line, and do <span class="emphasis"><em>not</em></span> include the
                     <span class="command"><strong>--full</strong></span> command-line option.
                  </p><p>
                     This value is used as the default value for all managed
                     clients.  It is optional, because it is only used when
                     executing actions on managed clients.  However, each
                     managed client must either be able to read the value from
                     options configuration or must set the value explicitly.
                  </p><p>
                     Note: if this command-line is complicated, it is often
                     better to create a simple shell script on the remote host
                     to encapsulate all of the options.  Then, just reference
                     the shell script in configuration.
                  </p><p>
                     <span class="emphasis"><em>Restrictions:</em></span> Must be non-empty.
                  </p></dd><dt><span class="term"><code class="literal">managed_actions</code></span></dt><dd><p>Default set of actions that are managed on remote clients.</p><p>
                     This is a comma-separated list of actions that the master
                     will manage on behalf of remote clients.  Typically, it
                     would include only collect-like actions and purge.  
                  </p><p>
                     This value is used as the default value for all managed
                     clients.  It is optional, because it is only used when
                     executing actions on managed clients.  However, each
                     managed client must either be able to read the value from
                     options configuration or must set the value explicitly.
                  </p><p>
                     <span class="emphasis"><em>Restrictions:</em></span> Must be non-empty.  
                  </p></dd><dt><span class="term"><code class="literal">override</code></span></dt><dd><p>Command to override with a customized path.</p><p>
                     This is a subsection which contains a command to override
                     with a customized path.  This functionality would be used
                     if root's <code class="literal">$PATH</code> does not include a
                     particular required command, or if there is a need to use
                     a version of a command that is different than the one
                     listed on the <code class="literal">$PATH</code>.  Most users will
                     only use this section when directed to, in order to fix a
                     problem.
                  </p><p>
                     This section is optional, and can be repeated as many times
                     as necessary.
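                  </p><p>
                     As a minimal sketch, an override that points Cedar Backup
                     at a specific copy of <span class="command"><strong>cdrecord</strong></span>
                     might look like the following.  The path shown is a
                     hypothetical example, not a recommended location.
                  </p><pre class="programlisting">
&lt;override&gt;
   &lt;command&gt;cdrecord&lt;/command&gt;
   &lt;abs_path&gt;/opt/local/bin/cdrecord&lt;/abs_path&gt;
&lt;/override&gt;
                  </pre><p>
                     To override more than one command, repeat the whole
                     subsection once per command.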
                  </p><p>
                     This subsection must contain the following two fields:
                  </p><div class="variablelist"><dl class="variablelist"><dt><span class="term"><code class="literal">command</code></span></dt><dd><p>
                              Name of the command to be overridden, e.g.
                              <span class="quote">&#8220;<span class="quote">cdrecord</span>&#8221;</span>. 
                           </p><p>
                              <span class="emphasis"><em>Restrictions:</em></span> Must be a
                              non-empty string.
                           </p></dd><dt><span class="term"><code class="literal">abs_path</code></span></dt><dd><p>
                              The absolute path where the overridden command 
                              can be found.
                           </p><p>
                              <span class="emphasis"><em>Restrictions:</em></span> Must be an
                              absolute path.
                           </p></dd></dl></div></dd><dt><span class="term"><code class="literal">pre_action_hook</code></span></dt><dd><p>Hook configuring a command to be executed before an action.</p><p>
                     This is a subsection which configures a command to be
                     executed immediately before a named action.  It provides a
                     way for administrators to associate their own custom
                     functionality with standard Cedar Backup actions or with
                     arbitrary extensions.
                  </p><p>
                     This section is optional, and can be repeated as many times
                     as necessary.
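                  </p><p>
                     As an illustration of the recommended script-based
                     approach, a hook that runs before the collect action might
                     be configured as follows.  The script name
                     <code class="filename">/usr/local/bin/pre-collect.sh</code> is a
                     hypothetical placeholder for a script you would write
                     yourself.
                  </p><pre class="programlisting">
&lt;pre_action_hook&gt;
   &lt;action&gt;collect&lt;/action&gt;
   &lt;command&gt;/usr/local/bin/pre-collect.sh&lt;/command&gt;
&lt;/pre_action_hook&gt;
                  </pre><p>
                     Because this hook only names a script, the restrictions on
                     inline shell commands described for the
                     <code class="literal">command</code> field should not come into
                     play.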
                  </p><p>
                     This subsection must contain the following two fields:
                  </p><div class="variablelist"><dl class="variablelist"><dt><span class="term"><code class="literal">action</code></span></dt><dd><p>
                              Name of the Cedar Backup action that the hook is
                              associated with.   The action can be a standard
                              backup action (collect, stage, etc.) or can be an
                              extension action.  No validation is done to
                              ensure that the configured action actually
                              exists.
                           </p><p>
                              <span class="emphasis"><em>Restrictions:</em></span> Must be a
                              non-empty string.
                           </p></dd><dt><span class="term"><code class="literal">command</code></span></dt><dd><p>
                              Name of the command to be executed.  This item
                              can either specify the path to a shell script of
                              some sort (the recommended approach) or can include
                              a complete shell command.  
                           </p><p>
                              Note: if you choose to provide a complete shell
                              command rather than the path to a script, you
                              need to be aware of some limitations of Cedar
                              Backup's command-line parser.  You cannot use a
                              subshell (via the <code class="literal">`command`</code> or
                              <code class="literal">$(command)</code> syntaxes) or any
                              shell variable in your command line.
                              Additionally, the command-line parser only
                              recognizes the double-quote character
                              (<code class="literal">"</code>) to delimit groupings or
                              strings on the command-line.  The bottom line is,
                              you are probably best off writing a shell script
                              of some sort for anything more sophisticated than
                              very simple shell commands.
                           </p><p>
                              <span class="emphasis"><em>Restrictions:</em></span> Must be a
                              non-empty string.
                           </p></dd></dl></div></dd><dt><span class="term"><code class="literal">post_action_hook</code></span></dt><dd><p>Hook configuring a command to be executed after an action.</p><p>
                     This is a subsection which configures a command to be
                     executed immediately after a named action.  It provides a
                     way for administrators to associate their own custom
                     functionality with standard Cedar Backup actions or with
                     arbitrary extensions.
                  </p><p>
                     This section is optional, and can be repeated as many times
                     as necessary.
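                  </p><p>
                     For a very simple case, a complete shell command can be
                     given inline.  The following sketch assumes a plain
                     <span class="command"><strong>echo</strong></span> is all that is needed;
                     the message text is only an example.
                  </p><pre class="programlisting">
&lt;post_action_hook&gt;
   &lt;action&gt;collect&lt;/action&gt;
   &lt;command&gt;echo "collect action finished"&lt;/command&gt;
&lt;/post_action_hook&gt;
                  </pre><p>
                     Note that the string is grouped with double quotes and that
                     the command uses no subshell or shell variable, in keeping
                     with the parser limitations described for the
                     <code class="literal">command</code> field.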
                  </p><p>
                     This subsection must contain the following two fields:
                  </p><div class="variablelist"><dl class="variablelist"><dt><span class="term"><code class="literal">action</code></span></dt><dd><p>
                              Name of the Cedar Backup action that the hook is
                              associated with.   The action can be a standard
                              backup action (collect, stage, etc.) or can be an
                              extension action.  No validation is done to
                              ensure that the configured action actually
                              exists.
                           </p><p>
                              <span class="emphasis"><em>Restrictions:</em></span> Must be a
                              non-empty string.
                           </p></dd><dt><span class="term"><code class="literal">command</code></span></dt><dd><p>
                              Name of the command to be executed.  This item
                              can either specify the path to a shell script of
                              some sort (the recommended approach) or can include
                              a complete shell command.  
                           </p><p>
                              Note: if you choose to provide a complete shell
                              command rather than the path to a script, you
                              need to be aware of some limitations of Cedar
                              Backup's command-line parser.  You cannot use a
                              subshell (via the <code class="literal">`command`</code> or
                              <code class="literal">$(command)</code> syntaxes) or any
                              shell variable in your command line.
                              Additionally, the command-line parser only
                              recognizes the double-quote character
                              (<code class="literal">"</code>) to delimit groupings or
                              strings on the command-line.  The bottom line is,
                              you are probably best off writing a shell script
                              of some sort for anything more sophisticated than
                              very simple shell commands.
                           </p><p>
                              <span class="emphasis"><em>Restrictions:</em></span> Must be a
                              non-empty string.
                           </p></dd></dl></div></dd></dl></div></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="cedar-config-configfile-peers"></a>Peers Configuration</h3></div></div></div><p>
            The peers configuration section contains a list of the peers
            managed by a master.  This section is only required on a master.
         </p><p>
            This is an example peers configuration section:
         </p><pre class="programlisting">
&lt;peers&gt;
   &lt;peer&gt;
      &lt;name&gt;machine1&lt;/name&gt;
      &lt;type&gt;local&lt;/type&gt;
      &lt;collect_dir&gt;/opt/backup/collect&lt;/collect_dir&gt;
   &lt;/peer&gt;
   &lt;peer&gt;
      &lt;name&gt;machine2&lt;/name&gt;
      &lt;type&gt;remote&lt;/type&gt;
      &lt;backup_user&gt;backup&lt;/backup_user&gt;
      &lt;collect_dir&gt;/opt/backup/collect&lt;/collect_dir&gt;
      &lt;ignore_failures&gt;all&lt;/ignore_failures&gt;
   &lt;/peer&gt;
   &lt;peer&gt;
      &lt;name&gt;machine3&lt;/name&gt;
      &lt;type&gt;remote&lt;/type&gt;
      &lt;managed&gt;Y&lt;/managed&gt;
      &lt;backup_user&gt;backup&lt;/backup_user&gt;
      &lt;collect_dir&gt;/opt/backup/collect&lt;/collect_dir&gt;
      &lt;rcp_command&gt;/usr/bin/scp&lt;/rcp_command&gt;
      &lt;rsh_command&gt;/usr/bin/ssh&lt;/rsh_command&gt;
      &lt;cback_command&gt;/usr/bin/cback&lt;/cback_command&gt;
      &lt;managed_actions&gt;collect, purge&lt;/managed_actions&gt;
   &lt;/peer&gt;
&lt;/peers&gt;
         </pre><p>
            The following elements are part of the peers configuration section:
         </p><div class="variablelist"><dl class="variablelist"><dt><span class="term"><code class="literal">peer</code> (local version)</span></dt><dd><p>Local client peer in a backup pool.</p><p>
                     This is a subsection which contains information about a
                     specific local client peer managed by a master.
                  </p><p>
                     This section can be repeated as many times as is
                     necessary.  At least one remote or local peer must be
                     configured.
                  </p><p>
                     The local peer subsection must contain the following fields:
                  </p><div class="variablelist"><dl class="variablelist"><dt><span class="term"><code class="literal">name</code></span></dt><dd><p>Name of the peer, typically a valid hostname.</p><p>
                              For local peers, this value is only used for
                              reference.  However, it is good practice to list
                              the peer's hostname here, for consistency with
                              remote peers.
                           </p><p>
                              <span class="emphasis"><em>Restrictions:</em></span> Must be non-empty,
                              and unique among all peers.
                           </p></dd><dt><span class="term"><code class="literal">type</code></span></dt><dd><p>Type of this peer.</p><p>
                              This value identifies the type of the peer.  For
                              a local peer, it must always be <code class="literal">local</code>.
                           </p><p>
                              <span class="emphasis"><em>Restrictions:</em></span> Must be <code class="literal">local</code>.
                           </p></dd><dt><span class="term"><code class="literal">collect_dir</code></span></dt><dd><p>Collect directory to stage from for this peer.</p><p>
                              The master will copy all files in this directory
                              into the appropriate staging directory.  Since
                              this is a local peer, the directory is assumed to
                              be reachable via normal filesystem operations
                              (i.e. <span class="command"><strong>cp</strong></span>). 
                           </p><p>
                              <span class="emphasis"><em>Restrictions:</em></span> Must be an absolute path.
                           </p></dd><dt><span class="term"><code class="literal">ignore_failures</code></span></dt><dd><p>Ignore failure mode for this peer</p><p>
                              The ignore failure mode indicates whether
                              <span class="quote">&#8220;<span class="quote">not ready to be staged</span>&#8221;</span> errors
                              should be ignored for this peer.  This option is
                              intended to be used for peers that are up only
                              intermittently, to cut down on the number of
                              error emails received by the Cedar Backup
                              administrator.
                           </p><p>
                              The "none" mode means that all errors will be
                              reported.  This is the default behavior.  The
                              "all" mode means to ignore all failures.  The
                              "weekly" mode means to ignore failures for a
                              start-of-week or full backup.  The "daily" mode
                              means to ignore failures for any backup that is
                              not either a full backup or a start-of-week
                              backup.
                           </p><p>
                              <span class="emphasis"><em>Restrictions:</em></span> If set, must
                              be one of "none", "all", "daily", or "weekly".
                           </p></dd></dl></div></dd><dt><span class="term"><code class="literal">peer</code> (remote version)</span></dt><dd><p>Remote client peer in a backup pool.</p><p>
                     This is a subsection which contains information about a
                     specific remote client peer managed by a master.  A remote
                     peer is one which can be reached via an rsh-based network
                     call.
                  </p><p>
                     This section can be repeated as many times as is
                     necessary.  At least one remote or local peer must be
                     configured.
                  </p><p>
                     The remote peer subsection must contain the following fields:
                  </p><div class="variablelist"><dl class="variablelist"><dt><span class="term"><code class="literal">name</code></span></dt><dd><p>Hostname of the peer.</p><p>
                              For remote peers, this must be a valid DNS
                              hostname or IP address which can be resolved
                              during an rsh-based network call.
                           </p><p>
                              <span class="emphasis"><em>Restrictions:</em></span> Must be non-empty,
                              and unique among all peers.
                           </p></dd><dt><span class="term"><code class="literal">type</code></span></dt><dd><p>Type of this peer.</p><p>
                              This value identifies the type of the peer.  For
                              a remote peer, it must always be <code class="literal">remote</code>.
                           </p><p>
                              <span class="emphasis"><em>Restrictions:</em></span> Must be <code class="literal">remote</code>.
                           </p></dd><dt><span class="term"><code class="literal">managed</code></span></dt><dd><p>Indicates whether this peer is managed.</p><p>
                              A managed peer (or managed client) is a peer for
                              which the master manages all of the backup
                              activities via a remote shell.
                           </p><p>
                              This field is optional.  If it doesn't exist, then
                              <code class="literal">N</code> will be assumed.
                           </p><p>
                              <span class="emphasis"><em>Restrictions:</em></span> Must be a boolean (<code class="literal">Y</code> or <code class="literal">N</code>).
                           </p></dd><dt><span class="term"><code class="literal">collect_dir</code></span></dt><dd><p>Collect directory to stage from for this peer.</p><p>
                              The master will copy all files in this directory
                              into the appropriate staging directory.  Since
                              this is a remote peer, the directory is assumed to
                              be reachable via rsh-based network operations
                              (i.e. <span class="command"><strong>scp</strong></span> or the configured
                              rcp command). 
                           </p><p>
                              <span class="emphasis"><em>Restrictions:</em></span> Must be an absolute path.
                           </p></dd><dt><span class="term"><code class="literal">ignore_failures</code></span></dt><dd><p>Ignore failure mode for this peer</p><p>
                              The ignore failure mode indicates whether
                              <span class="quote">&#8220;<span class="quote">not ready to be staged</span>&#8221;</span> errors
                              should be ignored for this peer.  This option is
                              intended to be used for peers that are up only
                              intermittently, to cut down on the number of
                              error emails received by the Cedar Backup
                              administrator.
                           </p><p>
                              The "none" mode means that all errors will be
                              reported.  This is the default behavior.  The
                              "all" mode means to ignore all failures.  The
                              "weekly" mode means to ignore failures for a
                              start-of-week or full backup.  The "daily" mode
                              means to ignore failures for any backup that is
                              not either a full backup or a start-of-week
                              backup.
                           </p><p>
                              <span class="emphasis"><em>Restrictions:</em></span> If set, must
                              be one of "none", "all", "daily", or "weekly".
                           </p></dd><dt><span class="term"><code class="literal">backup_user</code></span></dt><dd><p>Name of backup user on the remote peer.</p><p>
                              This username will be used when copying files from
                              the remote peer via an rsh-based network connection.
                           </p><p>
                              This field is optional.  If it doesn't exist, the
                              backup will use the default backup user from the 
                              options section.
                           </p><p>
                              <span class="emphasis"><em>Restrictions:</em></span> Must be non-empty.
                           </p></dd><dt><span class="term"><code class="literal">rcp_command</code></span></dt><dd><p>The rcp-compatible copy command for this peer.</p><p>
                              The rcp command should be the exact command used for
                              remote copies, including any required options.  If you are
                              using <span class="command"><strong>scp</strong></span>, you should pass it the
                              <code class="option">-B</code> option, so <span class="command"><strong>scp</strong></span> will
                              not ask for any user input (which could hang the backup).
                              A common example is something like <span class="command"><strong>/usr/bin/scp
                              -B</strong></span>.
                           </p><p>
                              This field is optional.  If it doesn't exist, the
                              backup will use the default rcp command from the
                              options section.
                           </p><p>
                              <span class="emphasis"><em>Restrictions:</em></span> Must be non-empty.
                           </p></dd><dt><span class="term"><code class="literal">rsh_command</code></span></dt><dd><p>The rsh-compatible command for this peer.</p><p>
                              The rsh command should be the exact command used for
                              remote shells, including any required options.  
                           </p><p>
                              This value only applies if the peer is managed.
                           </p><p>
                              This field is optional.  If it doesn't exist, the
                              backup will use the default rsh command from the
                              options section.
                           </p><p>
                              <span class="emphasis"><em>Restrictions:</em></span> Must be non-empty.
                           </p></dd><dt><span class="term"><code class="literal">cback_command</code></span></dt><dd><p>The cback-compatible command for this peer.</p><p>
                              The cback command should be the exact command
                              used for executing cback on the peer as part
                              of a managed backup.  This value must include any
                              required command-line options.  Do
                              <span class="emphasis"><em>not</em></span> list any actions in the
                              command line, and do <span class="emphasis"><em>not</em></span>
                              include the <span class="command"><strong>--full</strong></span>
                              command-line option.
                           </p><p>
                              This value only applies if the peer is managed.
                           </p><p>
                              This field is optional.  If it doesn't exist, the
                              backup will use the default cback command from the
                              options section.
                           </p><p>
                              Note: if this command-line is complicated, it is often
                              better to create a simple shell script on the remote host
                              to encapsulate all of the options.  Then, just reference
                              the shell script in configuration.
                           </p><p>
                              <span class="emphasis"><em>Restrictions:</em></span> Must be non-empty.
                           </p></dd><dt><span class="term"><code class="literal">managed_actions</code></span></dt><dd><p>Set of actions that are managed for this peer.</p><p>
                              This is a comma-separated list of actions that
                              the master will manage on behalf of this peer.
                              Typically, it would include only collect-like
                              actions and purge.  
                           </p><p>
                              This value only applies if the peer is managed.
                           </p><p>
                              This field is optional.  If it doesn't exist, the
                              backup will use the default list of managed
                              actions from the options section.
                           </p><p>
                              <span class="emphasis"><em>Restrictions:</em></span> Must be non-empty.  
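                           </p><p>
                              Because the backup user, the rcp, rsh and cback
                              commands, and the list of managed actions all fall
                              back to the options section when omitted, a
                              managed remote peer can be quite short.  The
                              following sketch assumes those defaults are set in
                              the options section; the peer name
                              <code class="literal">machine4</code> is hypothetical.
                           </p><pre class="programlisting">
&lt;peer&gt;
   &lt;name&gt;machine4&lt;/name&gt;
   &lt;type&gt;remote&lt;/type&gt;
   &lt;managed&gt;Y&lt;/managed&gt;
   &lt;collect_dir&gt;/opt/backup/collect&lt;/collect_dir&gt;
&lt;/peer&gt;
                           </pre><p>
                              Setting any of the optional fields on the peer
                              overrides the corresponding default for that peer
                              only.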
                           </p></dd></dl></div></dd></dl></div></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="cedar-config-configfile-collect"></a>Collect Configuration</h3></div></div></div><p>
            The collect configuration section contains configuration options
            related to the collect action.  This section contains a variable
            number of elements, including an optional exclusion section and a
            repeating subsection used to specify which directories and/or files
            to collect.  You can also configure an ignore indicator file, which
            lets users mark their own directories as not backed up.
         </p><div class="sidebar"><div class="titlepage"><div><div><p class="title"><b>Using a Link Farm</b></p></div></div></div><p>
               Sometimes, it's not very convenient to list directories one by
               one in the Cedar Backup configuration file.  For instance, when
               backing up your home directory, you often exclude as many
               directories as you include.  The ignore file mechanism can be of
               some help, but it still isn't very convenient if there are a lot
               of directories to ignore (or if new directories pop up all of the
               time).
            </p><p>
               In this situation, one option is to use a <em class="firstterm">link
               farm</em> rather than listing all of the directories in
               configuration.  A link farm is a directory that contains nothing
               but a set of soft links to other files and directories.
               Normally, Cedar Backup does not follow soft links, but you can
               override this behavior for individual directories using the
               <code class="literal">link_depth</code> and <code class="literal">dereference</code> 
               options (see below).  
            </p><p>
               When using a link farm, you still have to deal with each
               backed-up directory individually, but you don't have to modify
               configuration.  Some users find that this works better for them.
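            </p><p>
               As a rough sketch, a collect directory entry for a link farm
               might look like the following.  The directory
               <code class="filename">/home/user/linkfarm</code> is hypothetical, and
               the example assumes that <code class="literal">link_depth</code>
               takes an integer depth and that
               <code class="literal">dereference</code> takes a boolean
               (<code class="literal">Y</code> or <code class="literal">N</code>).
            </p><pre class="programlisting">
&lt;dir&gt;
   &lt;abs_path&gt;/home/user/linkfarm&lt;/abs_path&gt;
   &lt;link_depth&gt;1&lt;/link_depth&gt;
   &lt;dereference&gt;Y&lt;/dereference&gt;
&lt;/dir&gt;
            </pre><p>
               Check the definitions of these two fields before relying on
               particular values, since the right depth depends on how the
               links in the farm are laid out.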
            </p></div><p>
            In order to actually execute the collect action, you must have
            configured at least one collect directory or one collect file.
            However, if you are only including collect configuration for use by
            an extension, then it's OK to leave out these sections.  The
            validation will take place only when the collect action is
            executed.
         </p><p>
            This is an example collect configuration section:
         </p><pre class="programlisting">
&lt;collect&gt;
   &lt;collect_dir&gt;/opt/backup/collect&lt;/collect_dir&gt;
   &lt;collect_mode&gt;daily&lt;/collect_mode&gt;
   &lt;archive_mode&gt;targz&lt;/archive_mode&gt;
   &lt;ignore_file&gt;.cbignore&lt;/ignore_file&gt;
   &lt;exclude&gt;
      &lt;abs_path&gt;/etc&lt;/abs_path&gt;
      &lt;pattern&gt;.*\.conf&lt;/pattern&gt;
   &lt;/exclude&gt;
   &lt;file&gt;
      &lt;abs_path&gt;/home/root/.profile&lt;/abs_path&gt;
   &lt;/file&gt;
   &lt;dir&gt;
      &lt;abs_path&gt;/etc&lt;/abs_path&gt;
   &lt;/dir&gt;
   &lt;dir&gt;
      &lt;abs_path&gt;/var/log&lt;/abs_path&gt;
      &lt;collect_mode&gt;incr&lt;/collect_mode&gt;
   &lt;/dir&gt;
   &lt;dir&gt;
      &lt;abs_path&gt;/opt&lt;/abs_path&gt;
      &lt;collect_mode&gt;weekly&lt;/collect_mode&gt;
      &lt;exclude&gt;
         &lt;abs_path&gt;/opt/large&lt;/abs_path&gt;
         &lt;rel_path&gt;backup&lt;/rel_path&gt;
         &lt;pattern&gt;.*tmp&lt;/pattern&gt;
      &lt;/exclude&gt;
   &lt;/dir&gt;
&lt;/collect&gt;
         </pre><p>
            The following elements are part of the collect configuration
            section:
         </p><div class="variablelist"><dl class="variablelist"><dt><span class="term"><code class="literal">collect_dir</code></span></dt><dd><p>Directory to collect files into.</p><p>
                     On a client, this is the directory which tarfiles for
                     individual collect directories are written into.  The
                     master then stages files from this directory into its own
                     staging directory.
                  </p><p>
                     This field is always required.  It must contain enough
                     free space to collect all of the backed-up files on the
                     machine in a compressed form.
                  </p><p>
                     <span class="emphasis"><em>Restrictions:</em></span> Must be an absolute path.
                  </p></dd><dt><span class="term"><code class="literal">collect_mode</code></span></dt><dd><p>Default collect mode.</p><p>
                     The collect mode describes how frequently a directory is
                     backed up.   See <a class="xref" href="#cedar-basic-process-collect" title="The Collect Action">the section called &#8220;The Collect Action&#8221;</a> (in <a class="xref" href="#cedar-basic" title="Chapter 2. Basic Concepts">Chapter 2, <i>Basic Concepts</i></a>) for more information.
                  </p><p>
                     This value is the collect mode that will be used by
                     default during the collect process.  Individual collect
                     directories (below) may override this value.  If
                     <span class="emphasis"><em>all</em></span> individual directories provide
                     their own value, then this default value may be omitted
                     from configuration.
                  </p><p>
                     Note: if your backup device does not support multisession
                     discs, then you should probably use the
                     <code class="literal">daily</code> collect mode to avoid losing
                     data.
                  </p><p>
                     <span class="emphasis"><em>Restrictions:</em></span> Must be one of
                     <code class="literal">daily</code>, <code class="literal">weekly</code> or
                     <code class="literal">incr</code>.
                  </p></dd><dt><span class="term"><code class="literal">archive_mode</code></span></dt><dd><p>Default archive mode for collect files.</p><p>
                     The archive mode maps to the way that a backup file is
                     stored.  A value <code class="literal">tar</code> means just a
                     tarfile (<code class="filename">file.tar</code>); a value
                     <code class="literal">targz</code> means a gzipped tarfile
                     (<code class="filename">file.tar.gz</code>); and a value
                     <code class="literal">tarbz2</code> means a bzipped tarfile
                     (<code class="filename">file.tar.bz2</code>).
                  </p><p>
                     This value is the archive mode that will be used by
                     default during the collect process.  Individual collect
                     directories (below) may override this value.  If
                     <span class="emphasis"><em>all</em></span> individual directories provide
                     their own value, then this default value may be omitted
                     from configuration.
                  </p><p>
                     <span class="emphasis"><em>Restrictions:</em></span> Must be one of
                     <code class="literal">tar</code>, <code class="literal">targz</code> or
                     <code class="literal">tarbz2</code>.
                  </p></dd><dt><span class="term"><code class="literal">ignore_file</code></span></dt><dd><p>Default ignore file name.</p><p>
                     The ignore file is an indicator file.  If it exists in a
                     given directory, then that directory will be recursively
                     excluded from the backup as if it were explicitly excluded
                     in configuration.  
                  </p><p>
                     The ignore file provides a way for individual users (who
                     might not have access to Cedar Backup configuration) to
                     control which of their own directories get backed up.  For
                     instance, users with a <code class="filename">~/tmp</code>
                     directory might not want it backed up.  If they create an
                     ignore file in that directory (e.g.
                     <code class="filename">~/tmp/.cbignore</code>), then Cedar Backup
                     will exclude the directory from the backup.
                  </p><p>
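                     As a minimal sketch, the default ignore file name might
                     be configured as shown below.  The name
                     <code class="filename">.cbignore</code> comes from the
                     example above; the other values are purely illustrative.
                  </p><pre class="programlisting">
&lt;collect&gt;
   &lt;collect_dir&gt;/opt/backup/collect&lt;/collect_dir&gt;
   &lt;collect_mode&gt;daily&lt;/collect_mode&gt;
   &lt;archive_mode&gt;targz&lt;/archive_mode&gt;
   &lt;ignore_file&gt;.cbignore&lt;/ignore_file&gt;
   &lt;dir&gt;
      &lt;abs_path&gt;/home&lt;/abs_path&gt;
   &lt;/dir&gt;
&lt;/collect&gt;
         </pre><p>
                     With a configuration like this, a user could create an
                     empty <code class="filename">~/tmp/.cbignore</code> file
                     to keep <code class="filename">~/tmp</code> out of the
                     backup.
                  </p><p>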
                     This value is the ignore file name that will be used by
                     default during the collect process.  Individual collect
                     directories (below) may override this value.  If
                     <span class="emphasis"><em>all</em></span> individual directories provide
                     their own value, then this default value may be omitted
                     from configuration.
                  </p><p>
                     <span class="emphasis"><em>Restrictions:</em></span> Must be non-empty
                  </p></dd><dt><span class="term"><code class="literal">recursion_level</code></span></dt><dd><p>Recursion level to use when collecting directories.</p><p>
                     This is an integer value that Cedar Backup will consider
                     when generating archive files for a configured collect
                     directory.
                  </p><p>
                     Normally, Cedar Backup generates one archive file per
                     collect directory.  So, if you collect
                     <code class="literal">/etc</code> you get
                     <code class="literal">etc.tar.gz</code>.  Most of the time, this is
                     what you want.  However, you may sometimes wish to
                     generate multiple archive files for a single collect
                     directory.  
                  </p><p>
                     The most obvious example is for <code class="literal">/home</code>.
                     By default, Cedar Backup will generate
                     <code class="literal">home.tar.gz</code>.  If you instead want one
                     archive file per home directory, you can set a recursion
                     level of <code class="literal">1</code>.  Cedar Backup will generate
                     <code class="literal">home-user1.tar.gz</code>,
                     <code class="literal">home-user2.tar.gz</code>, etc.
                  </p><p>
                     Higher recursion levels (<code class="literal">2</code>,
                     <code class="literal">3</code>, etc.) are legal, and it doesn't
                     matter if the configured recursion level is deeper than
                     the directory tree that is being collected.  You can use a
                     negative recursion level (like <code class="literal">-1</code>) to
                     specify an infinite level of recursion.  This will exhaust
                     the tree in the same way as if the recursion level is set
                     too high.
                  </p><p>
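                     A hedged sketch of the <code class="literal">/home</code>
                     example follows.  It assumes that
                     <code class="literal">recursion_level</code> is placed
                     inside the <code class="literal">dir</code> subsection
                     for the directory being collected; verify this placement
                     against your installed version before relying on it.
                  </p><pre class="programlisting">
&lt;dir&gt;
   &lt;abs_path&gt;/home&lt;/abs_path&gt;
   &lt;recursion_level&gt;1&lt;/recursion_level&gt;
&lt;/dir&gt;
         </pre><p>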
                     This field is optional.  If it doesn't exist, the backup
                     will use the default recursion level of zero.
                  </p><p>
                     <span class="emphasis"><em>Restrictions:</em></span> Must be an integer.
                  </p></dd><dt><span class="term"><code class="literal">exclude</code></span></dt><dd><p>List of paths or patterns to exclude from the backup.</p><p>
                     This is a subsection which contains a set of absolute
                     paths and patterns to be excluded across all configured
                     directories.  For a given directory, the set of absolute
                     paths and patterns to exclude is built from this list and
                     any list that exists on the directory itself.  Directories
                     <span class="emphasis"><em>cannot</em></span> override or remove entries that
                     are in this list, however.
                  </p><p>
                     This section is optional, and if it exists can also be
                     empty.  
                  </p><p>
                     The exclude subsection can contain one or more of each of
                     the following fields:
                  </p><div class="variablelist"><dl class="variablelist"><dt><span class="term"><code class="literal">abs_path</code></span></dt><dd><p>
                              An absolute path to be recursively excluded from
                              the backup.
                           </p><p>
                              If a directory is excluded, then all of its children
                              are also recursively excluded.  For instance, a value
                              <code class="literal">/var/log/apache</code> would exclude any
                              files within <code class="filename">/var/log/apache</code> as
                              well as files within other directories under
                              <code class="filename">/var/log/apache</code>.
                           </p><p>
                              This field can be repeated as many times as is
                              necessary.
                           </p><p>
                              <span class="emphasis"><em>Restrictions:</em></span> Must be an absolute path.
                           </p></dd><dt><span class="term"><code class="literal">pattern</code></span></dt><dd><p>
                              A pattern to be recursively excluded from the
                              backup.
                           </p><p>
                              The pattern must be a Python regular expression. 
                              <a href="#ftn.cedar-config-foot-regex" class="footnote" name="cedar-config-foot-regex"><sup class="footnote">[21]</sup></a>
                              It is assumed to be bounded at front and back by the beginning
                              and end of the string (i.e. it is treated as if it begins with
                              <code class="literal">^</code> and ends with <code class="literal">$</code>).
                           </p><p>
                              If the pattern causes a directory to be excluded,
                              then all of the children of that directory are
                              also recursively excluded.  For instance, a value
                              <code class="literal">.*apache.*</code> might match the
                              <code class="filename">/var/log/apache</code> directory.
                              This would exclude any files within
                              <code class="filename">/var/log/apache</code> as well as
                              files within other directories under
                              <code class="filename">/var/log/apache</code>.
                           </p><p>
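                               Because of the implicit anchoring, a pattern
                               has to account for the whole string it is
                               compared against.  The sketch below is
                               illustrative only: the second pattern is a
                               made-up example, and it assumes that patterns
                               are compared against full paths, as the
                               <code class="filename">/var/log/apache</code>
                               example suggests.
                            </p><pre class="programlisting">
&lt;exclude&gt;
   &lt;pattern&gt;.*apache.*&lt;/pattern&gt;
   &lt;pattern&gt;.*\.tmp&lt;/pattern&gt;
&lt;/exclude&gt;
         </pre><p>
                               Under that assumption, a bare
                               <code class="literal">apache</code> pattern would be
                               treated as <code class="literal">^apache$</code> and
                               so would not match
                               <code class="filename">/var/log/apache</code>.
                            </p><p>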
                              This field can be repeated as many times as is
                              necessary.
                           </p><p>
                              <span class="emphasis"><em>Restrictions:</em></span> Must be non-empty
                           </p></dd></dl></div></dd><dt><span class="term"><code class="literal">file</code></span></dt><dd><p>A file to be collected.</p><p>
                     This is a subsection which contains information about
                     a specific file to be collected (backed up).
                  </p><p>
                     This section can be repeated as many times as is
                     necessary.  At least one collect directory or collect file
                     must be configured when the collect action is executed.
                  </p><p>
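                     For illustration only, a collect file that overrides
                     both the default collect mode and the default archive
                     mode might look like the sketch below.  The path is a
                     placeholder, and the fields used are described in the
                     list that follows.
                  </p><pre class="programlisting">
&lt;file&gt;
   &lt;abs_path&gt;/home/root/.profile&lt;/abs_path&gt;
   &lt;collect_mode&gt;weekly&lt;/collect_mode&gt;
   &lt;archive_mode&gt;tarbz2&lt;/archive_mode&gt;
&lt;/file&gt;
         </pre><p>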
                     The collect file subsection contains the following
                     fields:
                  </p><div class="variablelist"><dl class="variablelist"><dt><span class="term"><code class="literal">abs_path</code></span></dt><dd><p>
                              Absolute path of the file to collect.
                           </p><p>
                              <span class="emphasis"><em>Restrictions:</em></span> Must be an absolute path.
                           </p></dd><dt><span class="term"><code class="literal">collect_mode</code></span></dt><dd><p>Collect mode for this file</p><p>
                              The collect mode describes how frequently a
                              file is backed up.   See <a class="xref" href="#cedar-basic-process-collect" title="The Collect Action">the section called &#8220;The Collect Action&#8221;</a> (in <a class="xref" href="#cedar-basic" title="Chapter 2. Basic Concepts">Chapter 2, <i>Basic Concepts</i></a>) for more information.
                           </p><p>
                              This field is optional.  If it doesn't exist, the
                              backup will use the default collect mode.  
                           </p><p>
                              Note: if your backup device does not support
                              multisession discs, then you should probably
                              confine yourself to the <code class="literal">daily</code>
                              collect mode, to avoid losing data.
                           </p><p>
                              <span class="emphasis"><em>Restrictions:</em></span> Must be one of
                              <code class="literal">daily</code>, <code class="literal">weekly</code> or
                              <code class="literal">incr</code>.
                           </p></dd><dt><span class="term"><code class="literal">archive_mode</code></span></dt><dd><p>Archive mode for this file.</p><p>
                              The archive mode maps to the way that a backup
                              file is stored.  A value <code class="literal">tar</code>
                              means just a tarfile
                              (<code class="filename">file.tar</code>); a value
                              <code class="literal">targz</code> means a gzipped tarfile
                              (<code class="filename">file.tar.gz</code>); and a value
                              <code class="literal">tarbz2</code> means a bzipped tarfile
                              (<code class="filename">file.tar.bz2</code>).
                           </p><p>
                              This field is optional.  If it doesn't exist, the
                              backup will use the default archive mode.
                           </p><p>
                              <span class="emphasis"><em>Restrictions:</em></span> Must be one of
                              <code class="literal">tar</code>, <code class="literal">targz</code> or
                              <code class="literal">tarbz2</code>.
                           </p></dd></dl></div></dd><dt><span class="term"><code class="literal">dir</code></span></dt><dd><p>A directory to be collected.</p><p>
                     This is a subsection which contains information about
                     a specific directory to be collected (backed up).
                  </p><p>
                     This section can be repeated as many times as is
                     necessary.  At least one collect directory or collect file
                     must be configured when the collect action is executed.
                  </p><p>
                     The collect directory subsection contains the following
                     fields:
                  </p><div class="variablelist"><dl class="variablelist"><dt><span class="term"><code class="literal">abs_path</code></span></dt><dd><p>
                              Absolute path of the directory to collect.
                           </p><p>
                              The path may be either a directory, a soft link
                              to a directory, or a hard link to a directory.
                              All three are treated the same at this level.
                           </p><p>
                              The contents of the directory will be recursively
                              collected.  The backup will contain all of the
                              files in the directory, as well as the contents
                              of all of the subdirectories within the
                              directory, etc.  
                           </p><p>
                              Soft links <span class="emphasis"><em>within</em></span> the
                              directory are treated as files, i.e. they are
                              copied verbatim (as a link) and their contents
                              are not backed up.
                           </p><p>
                              <span class="emphasis"><em>Restrictions:</em></span> Must be an absolute path.
                           </p></dd><dt><span class="term"><code class="literal">collect_mode</code></span></dt><dd><p>Collect mode for this directory</p><p>
                              The collect mode describes how frequently a
                              directory is backed up.   See <a class="xref" href="#cedar-basic-process-collect" title="The Collect Action">the section called &#8220;The Collect Action&#8221;</a> (in <a class="xref" href="#cedar-basic" title="Chapter 2. Basic Concepts">Chapter 2, <i>Basic Concepts</i></a>) for more information.
                           </p><p>
                              This field is optional.  If it doesn't exist, the
                              backup will use the default collect mode.  
                           </p><p>
                              Note: if your backup device does not support
                              multisession discs, then you should probably
                              confine yourself to the <code class="literal">daily</code>
                              collect mode, to avoid losing data.
                           </p><p>
                              <span class="emphasis"><em>Restrictions:</em></span> Must be one of
                              <code class="literal">daily</code>, <code class="literal">weekly</code> or
                              <code class="literal">incr</code>.
                           </p></dd><dt><span class="term"><code class="literal">archive_mode</code></span></dt><dd><p>Archive mode for this directory.</p><p>
                              The archive mode maps to the way that a backup
                              file is stored.  A value <code class="literal">tar</code>
                              means just a tarfile
                              (<code class="filename">file.tar</code>); a value
                              <code class="literal">targz</code> means a gzipped tarfile
                              (<code class="filename">file.tar.gz</code>); and a value
                              <code class="literal">tarbz2</code> means a bzipped tarfile
                              (<code class="filename">file.tar.bz2</code>).
                           </p><p>
                              This field is optional.  If it doesn't exist, the
                              backup will use the default archive mode.
                           </p><p>
                              <span class="emphasis"><em>Restrictions:</em></span> Must be one of
                              <code class="literal">tar</code>, <code class="literal">targz</code> or
                              <code class="literal">tarbz2</code>.
                           </p></dd><dt><span class="term"><code class="literal">ignore_file</code></span></dt><dd><p>Ignore file name for this directory.</p><p>
                              The ignore file is an indicator file.  If it
                              exists in a given directory, then that directory
                              will be recursively excluded from the backup as
                              if it were explicitly excluded in configuration.  
                           </p><p>
                              The ignore file provides a way for individual
                              users (who might not have access to Cedar Backup
                              configuration) to control which of their own
                              directories get backed up.  For instance, users
                              with a <code class="filename">~/tmp</code> directory might
                              not want it backed up.  If they create an ignore
                              file in that directory (e.g.
                              <code class="filename">~/tmp/.cbignore</code>), then Cedar
                              Backup will exclude the directory from the backup.
                           </p><p>
                              This field is optional.  If it doesn't exist, the
                              backup will use the default ignore file name.
                           </p><p>
                              <span class="emphasis"><em>Restrictions:</em></span> Must be non-empty
                           </p></dd><dt><span class="term"><code class="literal">link_depth</code></span></dt><dd><p>Link depth value to use for this directory.</p><p>
                              The link depth is the maximum depth of the tree at
                              which soft links should be followed.  So, a depth
                              of 0 does not follow any soft links within the
                              collect directory, a depth of 1 follows only
                              links immediately within the collect directory, a
                              depth of 2 follows the links at the next level
                              down, etc.
                           </p><p>
                              This field is optional.  If it doesn't exist,
                              the backup will assume a value of zero, meaning
                              that soft links within the collect directory will
                              never be followed.
                           </p><p>
                              <span class="emphasis"><em>Restrictions:</em></span> If set, must
                              be an integer &#8805; 0.
                           </p></dd><dt><span class="term"><code class="literal">dereference</code></span></dt><dd><p>Whether to dereference soft links.</p><p>
                              If this flag is set, links that are being
                              followed will be dereferenced before being added
                              to the backup.  The link will be added (as a
                              link), and then the directory or file that the
                              link points at will be added as well.  
                           </p><p>
                              This value only applies to a directory where soft
                              links are being followed (per the
                              <code class="literal">link_depth</code> configuration
                              option).  It never applies to a configured
                              collect directory itself, only to other
                              directories within the collect directory.
                           </p><p>
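                               As a sketch of how the two options combine
                               (the path is illustrative), the following
                               collect directory follows soft links found
                               immediately within
                               <code class="filename">/opt/web</code> and,
                               because <code class="literal">dereference</code>
                               is set, also adds whatever those links point at:
                            </p><pre class="programlisting">
&lt;dir&gt;
   &lt;abs_path&gt;/opt/web&lt;/abs_path&gt;
   &lt;link_depth&gt;1&lt;/link_depth&gt;
   &lt;dereference&gt;Y&lt;/dereference&gt;
&lt;/dir&gt;
         </pre><p>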
                              This field is optional.  If it doesn't exist,
                              the backup will assume that links should never be
                              dereferenced.
                           </p><p>
                              <span class="emphasis"><em>Restrictions:</em></span> Must be a
                              boolean (<code class="literal">Y</code> or
                              <code class="literal">N</code>).
                           </p></dd><dt><span class="term"><code class="literal">exclude</code></span></dt><dd><p>List of paths or patterns to exclude from the backup.</p><p>
                              This is a subsection which contains a set of
                              paths and patterns to be excluded within this
                              collect directory.  This list is combined with
                              the program-wide list to build a complete list
                              for the directory.
                           </p><p>
                              This section is entirely optional, and if it exists can
                              also be empty.  
                           </p><p>
                              The exclude subsection can contain one or more of each of
                              the following fields:
                           </p><div class="variablelist"><dl class="variablelist"><dt><span class="term"><code class="literal">abs_path</code></span></dt><dd><p>
                                       An absolute path to be recursively
                                       excluded from the backup.
                                    </p><p>
                                       If a directory is excluded, then all of
                                       its children are also recursively
                                       excluded.  For instance, a value
                                       <code class="literal">/var/log/apache</code> would
                                       exclude any files within
                                       <code class="filename">/var/log/apache</code> as
                                       well as files within other directories
                                       under
                                       <code class="filename">/var/log/apache</code>.
                                    </p><p>
                                       This field can be repeated as many times as is
                                       necessary.
                                    </p><p>
                                       <span class="emphasis"><em>Restrictions:</em></span> Must be an absolute path.
                                    </p></dd><dt><span class="term"><code class="literal">rel_path</code></span></dt><dd><p>
                                       A relative path to be recursively
                                       excluded from the backup.
                                    </p><p>
                                       The path is assumed to be relative to
                                       the collect directory itself.  For
                                       instance, if the configured directory is
                                       <code class="filename">/opt/web</code> a
                                       configured relative path of
                                       <code class="filename">something/else</code>
                                       would exclude the path
                                       <code class="filename">/opt/web/something/else</code>.
                                    </p><p>
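                                        The example just described could be
                                        written as the following sketch, which
                                        relies on the default collect and
                                        archive modes:
                                     </p><pre class="programlisting">
&lt;dir&gt;
   &lt;abs_path&gt;/opt/web&lt;/abs_path&gt;
   &lt;exclude&gt;
      &lt;rel_path&gt;something/else&lt;/rel_path&gt;
   &lt;/exclude&gt;
&lt;/dir&gt;
         </pre><p>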
                                       If a directory is excluded, then all of
                                       its children are also recursively
                                       excluded.  For instance, a value
                                       <code class="literal">something/else</code> would
                                       exclude any files within
                                       <code class="filename">something/else</code> as
                                       well as files within other directories
                                       under <code class="filename">something/else</code>.
                                    </p><p>
                                       This field can be repeated as many times as is
                                       necessary.
                                    </p><p>
                                       <span class="emphasis"><em>Restrictions:</em></span> Must be non-empty.
                                    </p></dd><dt><span class="term"><code class="literal">pattern</code></span></dt><dd><p>
                                       A pattern to be excluded from the backup.
                                    </p><p>
                                       The pattern must be a Python regular
                                       expression.  <a href="#ftn.cedar-config-foot-regex" class="footnoteref"><sup class="footnoteref">[21]</sup></a> 
                                       It is assumed to be bounded at front and
                                       back by the beginning and end of the
                                       string (i.e. it is treated as if it
                                       begins with <code class="literal">^</code> and
                                       ends with <code class="literal">$</code>).
                                    </p><p>
                                       If the pattern causes a directory to be
                                       excluded, then all of the children of
                                       that directory are also recursively
                                       excluded.  For instance, a value
                                       <code class="literal">.*apache.*</code> might
                                       match the <code class="filename">/var/log/apache</code>
                                       directory.  This would exclude any files
                                       within <code class="filename">/var/log/apache</code> as
                                       well as files within other directories
                                       under <code class="filename">/var/log/apache</code>.
                                    </p><p>
                                       This field can be repeated as many times as is
                                       necessary.
                                    </p><p>
                                       <span class="emphasis"><em>Restrictions:</em></span> Must be non-empty
                                    </p></dd></dl></div></dd></dl></div></dd></dl></div></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="cedar-config-configfile-stage"></a>Stage Configuration</h3></div></div></div><p>
            The stage configuration section contains configuration options
            related to the stage action.  The section indicates where data
            from peers is staged.
         </p><p>
            This section can also (optionally) override the list of peers so
            that not all peers are staged.  If you provide
            <span class="emphasis"><em>any</em></span> peers in this section, then the list of
            peers here completely replaces the list of peers in the peers
            configuration section for the purposes of staging.
         </p><p>
            This is an example stage configuration section for the simple case
            where the list of peers is taken from peers configuration:
         </p><pre class="programlisting">
&lt;stage&gt;
   &lt;staging_dir&gt;/opt/backup/stage&lt;/staging_dir&gt;
&lt;/stage&gt;
         </pre><p>
            This is an example stage configuration section that overrides the
            default list of peers:
         </p><pre class="programlisting">
&lt;stage&gt;
   &lt;staging_dir&gt;/opt/backup/stage&lt;/staging_dir&gt;
   &lt;peer&gt;
      &lt;name&gt;machine1&lt;/name&gt;
      &lt;type&gt;local&lt;/type&gt;
      &lt;collect_dir&gt;/opt/backup/collect&lt;/collect_dir&gt;
   &lt;/peer&gt;
   &lt;peer&gt;
      &lt;name&gt;machine2&lt;/name&gt;
      &lt;type&gt;remote&lt;/type&gt;
      &lt;backup_user&gt;backup&lt;/backup_user&gt;
      &lt;collect_dir&gt;/opt/backup/collect&lt;/collect_dir&gt;
   &lt;/peer&gt;
&lt;/stage&gt;
         </pre><p>
            The following elements are part of the stage configuration section:
         </p><div class="variablelist"><dl class="variablelist"><dt><span class="term"><code class="literal">staging_dir</code></span></dt><dd><p>Directory to stage files into.</p><p>
                     This is the directory into which the master stages collected
                     data from each of the clients.  Within the staging directory,
                     data is staged into date-based directories by peer name.  For
                     instance, peer <span class="quote">&#8220;<span class="quote">daystrom</span>&#8221;</span> backed up on 19 Feb 2005
                     would be staged into something like <code class="filename">2005/02/19/daystrom</code>
                     relative to the staging directory itself.
                  </p><p>
                     This field is always required.  The directory must contain
                     enough free space to stage all of the files collected from
                     all of the various machines in a backup pool.  Many
                     administrators set up purging to keep staging directories
                     around for a week or more, which requires even more space.
                  </p><p>
                     <span class="emphasis"><em>Restrictions:</em></span> Must be an absolute path.
                  </p></dd><dt><span class="term"><code class="literal">peer</code> (local version)</span></dt><dd><p>Local client peer in a backup pool.</p><p>
                     This is a subsection which contains information about a
                     specific local client peer to be staged (backed up).  A
                     local peer is one whose collect directory can be reached
                     without requiring any rsh-based network calls.  It is
                     possible that a remote peer might be staged as a local
                     peer if its collect directory is mounted to the master via
                     NFS, AFS or some other method.
                  </p><p>
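                     As an illustrative sketch, a client whose collect
                     directory is mounted on the master could be staged as a
                     local peer like this.  The peer name
                     <code class="literal">machine3</code> and the mount point
                     are hypothetical and are not taken from any real setup:
                  </p><pre class="programlisting">
&lt;!-- hypothetical peer: collect directory mounted on the master via NFS --&gt;
&lt;peer&gt;
   &lt;name&gt;machine3&lt;/name&gt;
   &lt;type&gt;local&lt;/type&gt;
   &lt;collect_dir&gt;/mnt/machine3/backup/collect&lt;/collect_dir&gt;
&lt;/peer&gt;
                  </pre><p>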
                     This section can be repeated as many times as is
                     necessary.  At least one remote or local peer must be
                     configured.
                  </p><p>
                     <span class="emphasis"><em>Remember</em></span>, if you provide
                     <span class="emphasis"><em>any</em></span> local or remote peer in staging
                     configuration, the global peer configuration is completely
                     replaced by the staging peer configuration.
                  </p><p>
                     The local peer subsection must contain the following fields:
                  </p><div class="variablelist"><dl class="variablelist"><dt><span class="term"><code class="literal">name</code></span></dt><dd><p>Name of the peer, typically a valid hostname.</p><p>
                              For local peers, this value is only used for
                              reference.  However, it is good practice to list
                              the peer's hostname here, for consistency with
                              remote peers.
                           </p><p>
                              <span class="emphasis"><em>Restrictions:</em></span> Must be non-empty,
                              and unique among all peers.
                           </p></dd><dt><span class="term"><code class="literal">type</code></span></dt><dd><p>Type of this peer.</p><p>
                              This value identifies the type of the peer.  For
                              a local peer, it must always be <code class="literal">local</code>.
                           </p><p>
                              <span class="emphasis"><em>Restrictions:</em></span> Must be <code class="literal">local</code>.
                           </p></dd><dt><span class="term"><code class="literal">collect_dir</code></span></dt><dd><p>Collect directory to stage from for this peer.</p><p>
                              The master will copy all files in this directory
                              into the appropriate staging directory.  Since
                              this is a local peer, the directory is assumed to
                              be reachable via normal filesystem operations
                              (i.e. <span class="command"><strong>cp</strong></span>). 
                           </p><p>
                              <span class="emphasis"><em>Restrictions:</em></span> Must be an absolute path.
                           </p></dd></dl></div></dd><dt><span class="term"><code class="literal">peer</code> (remote version)</span></dt><dd><p>Remote client peer in a backup pool.</p><p>
                     This is a subsection which contains information about a
                     specific remote client peer to be staged (backed up).  A
                     remote peer is one whose collect directory can only be
                     reached via an rsh-based network call.
                  </p><p>
                     This section can be repeated as many times as is
                     necessary.  At least one remote or local peer must be
                     configured.
                  </p><p>
                     <span class="emphasis"><em>Remember</em></span>, if you provide
                     <span class="emphasis"><em>any</em></span> local or remote peer in staging
                     configuration, the global peer configuration is completely
                     replaced by the staging peer configuration.
                  </p><p>
                     The remote peer subsection must contain the following fields:
                  </p><div class="variablelist"><dl class="variablelist"><dt><span class="term"><code class="literal">name</code></span></dt><dd><p>Hostname of the peer.</p><p>
                              For remote peers, this must be a valid DNS
                              hostname or IP address which can be resolved
                              during an rsh-based network call.
                           </p><p>
                              <span class="emphasis"><em>Restrictions:</em></span> Must be non-empty,
                              and unique among all peers.
                           </p></dd><dt><span class="term"><code class="literal">type</code></span></dt><dd><p>Type of this peer.</p><p>
                              This value identifies the type of the peer.  For
                              a remote peer, it must always be <code class="literal">remote</code>.
                           </p><p>
                              <span class="emphasis"><em>Restrictions:</em></span> Must be <code class="literal">remote</code>.
                           </p></dd><dt><span class="term"><code class="literal">collect_dir</code></span></dt><dd><p>Collect directory to stage from for this peer.</p><p>
                              The master will copy all files in this directory
                              into the appropriate staging directory.  Since
                              this is a remote peer, the directory is assumed to
                              be reachable via rsh-based network operations
                              (i.e. <span class="command"><strong>scp</strong></span> or the configured
                              rcp command). 
                           </p><p>
                              <span class="emphasis"><em>Restrictions:</em></span> Must be an absolute path.
                           </p></dd><dt><span class="term"><code class="literal">backup_user</code></span></dt><dd><p>Name of backup user on the remote peer.</p><p>
                              This username will be used when copying files from
                              the remote peer via an rsh-based network connection.
                           </p><p>
                              This field is optional.  If it doesn't exist, the
                              backup will use the default backup user from the 
                              options section.
                           </p><p>
                              <span class="emphasis"><em>Restrictions:</em></span> Must be non-empty.
                           </p></dd><dt><span class="term"><code class="literal">rcp_command</code></span></dt><dd><p>The rcp-compatible copy command for this peer.</p><p>
                              The rcp command should be the exact command used for
                              remote copies, including any required options.  If you are
                              using <span class="command"><strong>scp</strong></span>, you should pass it the
                              <code class="option">-B</code> option, so <span class="command"><strong>scp</strong></span> will
                              not ask for any user input (which could hang the backup).
                              A common example is something like <span class="command"><strong>/usr/bin/scp
                              -B</strong></span>.
                           </p><p>
                              This field is optional.  If it doesn't exist, the
                              backup will use the default rcp command from the
                              options section.
                           </p><p>
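                              As a sketch, a remote peer that overrides both
                              optional fields might look like the following.  The
                              host and user names are reused from the stage example
                              earlier in this section, and the order of the
                              elements shown here is an assumption rather than a
                              documented requirement:
                           </p><pre class="programlisting">
&lt;peer&gt;
   &lt;name&gt;machine2&lt;/name&gt;
   &lt;type&gt;remote&lt;/type&gt;
   &lt;backup_user&gt;backup&lt;/backup_user&gt;
   &lt;rcp_command&gt;/usr/bin/scp -B&lt;/rcp_command&gt;
   &lt;collect_dir&gt;/opt/backup/collect&lt;/collect_dir&gt;
&lt;/peer&gt;
                           </pre><p>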
                              <span class="emphasis"><em>Restrictions:</em></span> Must be non-empty.
                           </p></dd></dl></div></dd></dl></div></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="cedar-config-configfile-store"></a>Store Configuration</h3></div></div></div><p>
            The store configuration section contains configuration options
            related to the store action.  This section contains several
            optional fields.  Most fields control the way media is written
            using the writer device.
         </p><p>
            This is an example store configuration section:
         </p><pre class="programlisting">
&lt;store&gt;
   &lt;source_dir&gt;/opt/backup/stage&lt;/source_dir&gt;
   &lt;media_type&gt;cdrw-74&lt;/media_type&gt;
   &lt;device_type&gt;cdwriter&lt;/device_type&gt;
   &lt;target_device&gt;/dev/cdrw&lt;/target_device&gt;
   &lt;target_scsi_id&gt;0,0,0&lt;/target_scsi_id&gt;
   &lt;drive_speed&gt;4&lt;/drive_speed&gt;
   &lt;check_data&gt;Y&lt;/check_data&gt;
   &lt;check_media&gt;Y&lt;/check_media&gt;
   &lt;warn_midnite&gt;Y&lt;/warn_midnite&gt;
   &lt;no_eject&gt;N&lt;/no_eject&gt;
   &lt;refresh_media_delay&gt;15&lt;/refresh_media_delay&gt;
   &lt;eject_delay&gt;2&lt;/eject_delay&gt;
   &lt;blank_behavior&gt;
      &lt;mode&gt;weekly&lt;/mode&gt;
      &lt;factor&gt;1.3&lt;/factor&gt;
   &lt;/blank_behavior&gt;
&lt;/store&gt;
         </pre><p>
            The following elements are part of the store configuration section:
         </p><div class="variablelist"><dl class="variablelist"><dt><span class="term"><code class="literal">source_dir</code></span></dt><dd><p>Directory whose contents should be written to media.</p><p>
                     This directory <span class="emphasis"><em>must</em></span> be a Cedar Backup
                     staging directory, as configured in the staging configuration
                     section.  Only certain data from that directory (typically,
                     data from the current day) will be written to disc.
                  </p><p>
                     <span class="emphasis"><em>Restrictions:</em></span> Must be an absolute path.
                  </p></dd><dt><span class="term"><code class="literal">device_type</code></span></dt><dd><p>Type of the device used to write the media.</p><p>
                     This field controls which type of writer device will be
                     used by Cedar Backup.  Currently, Cedar Backup supports CD
                     writers (<code class="literal">cdwriter</code>) and DVD writers
                     (<code class="literal">dvdwriter</code>).
                  </p><p>
                     This field is optional.  If it doesn't exist, the
                     <code class="literal">cdwriter</code> device type is assumed.  
                  </p><p>
                     <span class="emphasis"><em>Restrictions:</em></span> If set, must be either <code class="literal">cdwriter</code>
                     or <code class="literal">dvdwriter</code>.
                  </p></dd><dt><span class="term"><code class="literal">media_type</code></span></dt><dd><p>Type of the media in the device.</p><p>
                     Unless you want to throw away a backup disc every week,
                     you are probably best off using rewritable media.
                  </p><p>
                     You must choose a media type that is appropriate for the
                     device type you chose above.  For more information on
                     media types, see <a class="xref" href="#cedar-basic-mediadevice" title="Media and Device Types">the section called &#8220;Media and Device Types&#8221;</a>
                     (in <a class="xref" href="#cedar-basic" title="Chapter 2. Basic Concepts">Chapter 2, <i>Basic Concepts</i></a>).
                  </p><p>
                     <span class="emphasis"><em>Restrictions:</em></span> Must be one of
                     <code class="literal">cdr-74</code>, <code class="literal">cdrw-74</code>, 
                     <code class="literal">cdr-80</code> or <code class="literal">cdrw-80</code>
                     if device type is <code class="literal">cdwriter</code>; or one
                     of <code class="literal">dvd+r</code> or <code class="literal">dvd+rw</code>
                     if device type is <code class="literal">dvdwriter</code>.
                  </p></dd><dt><span class="term"><code class="literal">target_device</code></span></dt><dd><p>Filesystem device name for writer device.</p><p>
                     This value is required for both CD writers and DVD
                     writers.  
                  </p><p>
                     This is the UNIX device name for the writer drive, for
                     instance <code class="filename">/dev/scd0</code> or a symlink
                     like <code class="filename">/dev/cdrw</code>.   
                  </p><p>
                     In some cases, this device name is used to directly write
                     to media.  This is true all of the time for DVD writers,
                     and is true for CD writers when a SCSI id (see below) has
                     not been specified.  
                  </p><p>
                     Besides this, the device name is also needed in order to
                     do several pre-write checks (such as whether the device
                     might already be mounted) as well as the post-write
                     consistency check, if enabled.
                  </p><p>
                     Note: some users have reported intermittent problems when
                     using a symlink as the target device on Linux, especially
                     with DVD media.  If you experience problems, try using the
                     real device name rather than the symlink.
                  </p><p>
                     <span class="emphasis"><em>Restrictions:</em></span> Must be an absolute path.
                  </p></dd><dt><span class="term"><code class="literal">target_scsi_id</code></span></dt><dd><p>SCSI id for the writer device.</p><p>
                     This value is optional for CD writers and is ignored for
                     DVD writers.
                  </p><p>
                     If you have configured your CD writer hardware to work
                     through the normal filesystem device path, then you can
                     leave this parameter unset.  Cedar Backup will just use
                     the target device (above) when talking to
                     <span class="command"><strong>cdrecord</strong></span>.
                  </p><p>
                     Otherwise, if you have SCSI CD writer hardware or you have
                     configured your non-SCSI hardware to operate like a SCSI
                     device, then you need to provide Cedar Backup with a SCSI
                     id it can use when talking with
                     <span class="command"><strong>cdrecord</strong></span>.
                  </p><p>
                     For the purposes of Cedar Backup, a valid SCSI identifier
                     must either be in the standard SCSI identifier form
                     <code class="literal">scsibus,target,lun</code> or in the
                     specialized-method form
                     <code class="literal">&lt;method&gt;:scsibus,target,lun</code>.
                  </p><p>
                     An example of a standard SCSI identifier is
                     <code class="literal">1,6,2</code>. Today, the two most common examples
                     of the specialized-method form are
                     <code class="literal">ATA:scsibus,target,lun</code> and
                     <code class="literal">ATAPI:scsibus,target,lun</code>, but you may
                     occasionally see other values (like
                     <code class="literal">OLDATAPI</code> in some forks of
                     <span class="command"><strong>cdrecord</strong></span>).
                  </p><p>
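                     For illustration only, the two forms would appear in
                     configuration roughly as follows.  Only one
                     <code class="literal">target_scsi_id</code> element belongs in
                     a real store section, and the bus, target and lun numbers
                     shown are placeholders, not recommended values:
                  </p><pre class="programlisting">
&lt;!-- standard form: scsibus,target,lun --&gt;
&lt;target_scsi_id&gt;1,6,2&lt;/target_scsi_id&gt;

&lt;!-- specialized-method form: method:scsibus,target,lun --&gt;
&lt;target_scsi_id&gt;ATA:1,0,0&lt;/target_scsi_id&gt;
                  </pre><p>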
                     See <a class="xref" href="#cedar-config-writer" title="Configuring your Writer Device">the section called &#8220;Configuring your Writer Device&#8221;</a> for more
                     information on writer devices and how they are configured.
                  </p><p>
                     <span class="emphasis"><em>Restrictions:</em></span> If set, must be a valid SCSI identifier.
                  </p></dd><dt><span class="term"><code class="literal">drive_speed</code></span></dt><dd><p>Speed of the drive, e.g. <code class="literal">2</code> for a 2x device.</p><p>
                     This field is optional.  If it doesn't exist, the
                     underlying device-related functionality will use the
                     default drive speed.  
                  </p><p>
                     For DVD writers, it is best to leave this value unset, so
                     <span class="command"><strong>growisofs</strong></span> can pick an appropriate
                     speed.  For CD writers, since media can be
                     speed-sensitive, it is probably best to set a sensible
                     value based on your specific writer and media.
                  </p><p>
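                     Putting the DVD-related notes above together, a minimal
                     store section for a DVD writer might look like the
                     following sketch.  It assumes the writer is available as
                     <code class="filename">/dev/scd0</code> (substitute your own
                     device), and it leaves out
                     <code class="literal">target_scsi_id</code> and
                     <code class="literal">drive_speed</code> because the former is
                     ignored for DVD writers and the latter is best left unset:
                  </p><pre class="programlisting">
&lt;store&gt;
   &lt;source_dir&gt;/opt/backup/stage&lt;/source_dir&gt;
   &lt;media_type&gt;dvd+rw&lt;/media_type&gt;
   &lt;device_type&gt;dvdwriter&lt;/device_type&gt;
   &lt;target_device&gt;/dev/scd0&lt;/target_device&gt;
&lt;/store&gt;
                  </pre><p>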
                     <span class="emphasis"><em>Restrictions:</em></span> If set, must be an integer &#8805; 1.
                  </p></dd><dt><span class="term"><code class="literal">check_data</code></span></dt><dd><p>Whether the media should be validated.</p><p>
                     This field indicates whether a resulting image on the
                     media should be validated after the write completes, by
                     running a consistency check against it.  If this check is
                     enabled, the contents of the staging directory are
                     directly compared to the media, and an error is reported
                     if there is a mismatch.
                  </p><p>
                     Practice shows that some drives can encounter an error
                     when writing a multisession disc, but not report any problems.
                     This consistency check allows us to catch the problem.
                     By default, the consistency check is disabled, but most
                     users should choose to enable it unless they have a good
                     reason not to.
                  </p><p>
                     This field is optional.  If it doesn't exist, then
                     <code class="literal">N</code> will be assumed.
                  </p><p>
                     <span class="emphasis"><em>Restrictions:</em></span> Must be a boolean (<code class="literal">Y</code> or <code class="literal">N</code>).
                  </p></dd><dt><span class="term"><code class="literal">check_media</code></span></dt><dd><p>Whether the media should be checked before writing to it.</p><p>
                     By default, Cedar Backup does not check its media before
                     writing to it.  It will write to any media in the backup
                     device.  If you set this flag to Y, Cedar Backup will make
                     sure that the media has been initialized before writing to
                     it.  (Rewritable media is initialized using the initialize
                     action.)
                  </p><p>
                     If the configured media is not rewritable (like CD-R),
                     then this behavior is modified slightly.  For this kind of
                     media, the check passes either if the media has been
                     initialized <span class="emphasis"><em>or</em></span> if the media appears
                     unused.
                  </p><p>
                     This field is optional.  If it doesn't exist, then
                     <code class="literal">N</code> will be assumed.
                  </p><p>
                     <span class="emphasis"><em>Restrictions:</em></span> Must be a boolean (<code class="literal">Y</code> or <code class="literal">N</code>).
                  </p></dd><dt><span class="term"><code class="literal">warn_midnite</code></span></dt><dd><p>Whether to generate warnings for crossing midnight.</p><p>
                     This field indicates whether warnings should be generated
                     if the store operation has to cross a midnight boundary in
                     order to find data to write to disc.  For instance, a
                     warning would be generated if valid store data was only
                     found in the day before or day after the current day.
                  </p><p>
                     Configuration for some users is such that the store
                     operation will always cross a midnight boundary, so they
                     will not care about this warning.  Other users will expect
                     to never cross a boundary, and want to be notified that
                     something <span class="quote">&#8220;<span class="quote">strange</span>&#8221;</span> might have happened.
                  </p><p>
                     This field is optional.  If it doesn't exist, then
                     <code class="literal">N</code> will be assumed.
                  </p><p>
                     <span class="emphasis"><em>Restrictions:</em></span> Must be a boolean (<code class="literal">Y</code> or <code class="literal">N</code>).
                  </p></dd><dt><span class="term"><code class="literal">no_eject</code></span></dt><dd><p>Indicates that the writer device should not be ejected.</p><p>
                     Under some circumstances, Cedar Backup ejects (opens and
                     closes) the writer device.  This is done because some
                     writer devices need to re-load the media before noticing a
                     media state change (like a new session).
                  </p><p>
                     For most writer devices this is safe, because they have a
                     tray that can be opened and closed.  If your writer device
                     does not have a tray <span class="emphasis"><em>and</em></span> Cedar Backup
                     does not properly detect this, then set this flag.  Cedar
                     Backup will not ever issue an eject command to your
                     writer.
                  </p><p>
                     Note: this could cause problems with your backup.  For
                     instance, with many writers, the check data step may fail
                     if the media is not reloaded first.  If this happens to
                     you, you may need to get a different writer device.
                  </p><p>
                     This field is optional.  If it doesn't exist, then
                     <code class="literal">N</code> will be assumed.
                  </p><p>
                     <span class="emphasis"><em>Restrictions:</em></span> Must be a boolean (<code class="literal">Y</code> or <code class="literal">N</code>).
                  </p></dd><dt><span class="term"><code class="literal">refresh_media_delay</code></span></dt><dd><p>Number of seconds to delay after refreshing media.</p><p>
                     This field is optional.  If it doesn't exist, no delay
                     will occur.
                  </p><p>
                     Some devices seem to take a little while to stabilize after
                     refreshing the media (i.e. closing and opening the tray).
                     During this period, operations on the media may fail.  If
                     your device behaves like this, you can try setting a delay
                     of 10-15 seconds.
                  </p><p>
                     <span class="emphasis"><em>Restrictions:</em></span> If set, must be an integer &#8805; 1.
                  </p></dd><dt><span class="term"><code class="literal">eject_delay</code></span></dt><dd><p>Number of seconds to delay after ejecting the tray.</p><p>
                     This field is optional.  If it doesn't exist, no delay
                     will occur.
                  </p><p>
                     If your system seems to have problems opening and closing the tray,
                     one possibility is that the open/close sequence is happening too
                     quickly &#8212; either the tray isn't fully open when Cedar Backup
                     tries to close it, or it doesn't report being open.  To work around
                     that problem, set an eject delay of a few seconds.
                  </p><p>
                     <span class="emphasis"><em>Restrictions:</em></span> If set, must be an integer &#8805; 1.
                  </p></dd><dt><span class="term"><code class="literal">blank_behavior</code></span></dt><dd><p>Optimized blanking strategy.</p><p>
                     For more information about Cedar Backup's optimized
                     blanking strategy, see <a class="xref" href="#cedar-config-blanking" title="Optimized Blanking Strategy">the section called &#8220;Optimized Blanking Strategy&#8221;</a>.
                  </p><p>
                     This entire configuration section is optional.  However,
                     if you choose to provide it, you must configure both a
                     blanking mode and a blanking factor.
                  </p><div class="variablelist"><dl class="variablelist"><dt><span class="term"><code class="literal">mode</code></span></dt><dd><p>Blanking mode.</p><p>
                              <span class="emphasis"><em>Restrictions:</em></span> Must be one of <code class="literal">daily</code> or <code class="literal">weekly</code>.
                           </p></dd></dl></div><div class="variablelist"><dl class="variablelist"><dt><span class="term"><code class="literal">factor</code></span></dt><dd><p>Blanking factor.</p><p>
                              <span class="emphasis"><em>Restrictions:</em></span> Must be a floating point number &#8805; 0.
                           </p></dd></dl></div></dd></dl></div></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="cedar-config-configfile-purge"></a>Purge Configuration</h3></div></div></div><p>
            The purge configuration section contains configuration options
            related to the purge action.  This section contains a set of
            directories to be purged, along with information about the schedule
            at which they should be purged.
         </p><p>
            Typically, Cedar Backup should be configured to purge collect
            directories daily (retain days of <code class="literal">0</code>).
         </p><p>
            If you are tight on space, staging directories can also be purged
            daily.  However, if you have space to spare, you should consider
            purging about once per week.  That way, if your backup media is
            damaged, you will be able to recreate the week's backup using the
            rebuild action.
         </p><p>
            You should also purge the working directory periodically, once
            every few weeks or once per month.  This way, if any unneeded files
            are left around, perhaps because a backup was interrupted or
            because configuration changed, they will eventually be removed.
            <span class="emphasis"><em>The working directory should not be purged any more
            frequently than once per week; otherwise, you risk destroying
            data used for incremental backups.</em></span>
         </p><p>
            This is an example purge configuration section:
         </p><pre class="programlisting">
&lt;purge&gt;
   &lt;dir&gt;
      &lt;abs_path&gt;/opt/backup/stage&lt;/abs_path&gt;
      &lt;retain_days&gt;7&lt;/retain_days&gt;
   &lt;/dir&gt;
   &lt;dir&gt;
      &lt;abs_path&gt;/opt/backup/collect&lt;/abs_path&gt;
      &lt;retain_days&gt;0&lt;/retain_days&gt;
   &lt;/dir&gt;
&lt;/purge&gt;
         </pre><p>
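            As a variation on the example above, the following sketch also
            purges the working directory about once per month, in line with
            the recommendation earlier in this section.  The working
            directory path <code class="filename">/opt/backup/tmp</code> and
            the 30-day retention are assumptions chosen for illustration:
         </p><pre class="programlisting">
&lt;purge&gt;
   &lt;dir&gt;
      &lt;abs_path&gt;/opt/backup/stage&lt;/abs_path&gt;
      &lt;retain_days&gt;7&lt;/retain_days&gt;
   &lt;/dir&gt;
   &lt;dir&gt;
      &lt;abs_path&gt;/opt/backup/collect&lt;/abs_path&gt;
      &lt;retain_days&gt;0&lt;/retain_days&gt;
   &lt;/dir&gt;
   &lt;!-- hypothetical working directory; never purge it more often than weekly --&gt;
   &lt;dir&gt;
      &lt;abs_path&gt;/opt/backup/tmp&lt;/abs_path&gt;
      &lt;retain_days&gt;30&lt;/retain_days&gt;
   &lt;/dir&gt;
&lt;/purge&gt;
         </pre><p>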
            The following elements are part of the purge configuration section:
         </p><div class="variablelist"><dl class="variablelist"><dt><span class="term"><code class="literal">dir</code></span></dt><dd><p>A directory to purge within.</p><p>
                     This is a subsection which contains information about
                     a specific directory to purge within.
                  </p><p>
                     This section can be repeated as many times as is
                     necessary.  At least one purge directory must be
                     configured.
                  </p><p>
                     The purge directory subsection contains the following fields:
                  </p><div class="variablelist"><dl class="variablelist"><dt><span class="term"><code class="literal">abs_path</code></span></dt><dd><p>
                              Absolute path of the directory to purge within.
                           </p><p>
                              The contents of the directory will be purged
                              based on age.  The purge will remove any files
                              that were last modified more than <span class="quote">&#8220;<span class="quote">retain
                              days</span>&#8221;</span> days ago.  Empty directories will
                              also eventually be removed.  The purge directory
                              itself will never be removed.
                           </p><p>
                              The path may be either a directory, a soft link
                              to a directory, or a hard link to a directory.
                              Soft links <span class="emphasis"><em>within</em></span> the
                              directory (if any) are treated as files.
                           </p><p>
                              <span class="emphasis"><em>Restrictions:</em></span> Must be an absolute path.
                           </p></dd><dt><span class="term"><code class="literal">retain_days</code></span></dt><dd><p>
                              Number of days to retain old files.
                           </p><p>
                              Once it has been more than this many days since a file
                              was last modified, it is a candidate for removal.
                           </p><p>
                              <span class="emphasis"><em>Restrictions:</em></span> Must be an integer &#8805; 0.
                           </p></dd></dl></div></dd></dl></div></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="cedar-config-configfile-extensions"></a>Extensions Configuration</h3></div></div></div><p>
            The extensions configuration section is used to configure
            third-party extensions to Cedar Backup.  If you don't intend to use
            any extensions, or don't know what extensions are, then you can
            safely leave this section out of your configuration file.  It is
            optional.
         </p><p>
            Extensions configuration is used to specify <span class="quote">&#8220;<span class="quote">extended
            actions</span>&#8221;</span> implemented by code external to Cedar Backup.  An
            administrator can use this section to map command-line Cedar
            Backup actions to third-party extension functions.
         </p><p>
            Each extended action has a name, which is mapped to a Python
            function within a particular module.  Each action also has an index
            associated with it.  This index is used to properly order execution
            when more than one action is specified on the command line.
            The standard actions have predefined indexes, and extended actions
            are interleaved into the normal order of execution using those
            indexes.  The collect action has index 100, the stage action
            has index 200, the store action has index 300 and the purge
            action has index 400.
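         </p><p>
            The interleaving can be pictured with a short Python sketch.  It
            only illustrates the ordering rule and is not how Cedar Backup is
            implemented; the function name
            <code class="literal">execution_order</code> and the extended action
            <code class="literal">database</code> with index 99 are assumptions
            made for the example.
         </p><pre class="programlisting">
```python
STANDARD_ACTIONS = {"collect": 100, "stage": 200, "store": 300, "purge": 400}


def execution_order(requested, extended):
    """Order the requested action names by index (standard plus extended)."""
    indexes = dict(STANDARD_ACTIONS)
    indexes.update(extended)
    return sorted(requested, key=lambda name: indexes[name])


print(execution_order(["purge", "collect", "database"], {"database": 99}))
```
         </pre><p>
            This prints <code class="literal">['database', 'collect', 'purge']</code>:
            the extended action runs first because 99 sorts before 100,
            whatever order the actions were given in.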
         </p><div class="warning" style="margin-left: 0.5in; margin-right: 0.5in;"><h3 class="title">Warning</h3><p>
               Extended actions should always be configured to run
               <span class="emphasis"><em>before</em></span> the standard action they are
               associated with.  This is because of the way indicator files are
               used in Cedar Backup.  For instance, the staging process
               considers the collect action to be complete for a peer if the
               file <code class="filename">cback.collect</code> can be found in that
               peer's collect directory.  
            </p><p>
               If you were to run the standard collect action before your other
               collect-like actions, the indicator file would be written after
               the collect action completes but <span class="emphasis"><em>before</em></span> all
               of the other actions even run.  Because of this, there's a
               chance the stage process might back up the collect directory
               before the entire set of collect-like actions has completed
               &#8212; and you would get no warning about this in your email!
            </p></div><p>
            So, imagine that a third-party developer provided a Cedar
            Backup extension to back up a certain kind of database repository,
            and you wanted to map that extension to the <span class="quote">&#8220;<span class="quote">database</span>&#8221;</span>
            command-line action.  You have been told that this function is
            called <span class="quote">&#8220;<span class="quote">foo.bar()</span>&#8221;</span>.  You think of this backup as a
            <span class="quote">&#8220;<span class="quote">collect</span>&#8221;</span> kind of action, so you want it to be
            performed immediately before the collect action.
         </p><p>
            To configure this extension, you would list an action with a name
            <span class="quote">&#8220;<span class="quote">database</span>&#8221;</span>, a module <span class="quote">&#8220;<span class="quote">foo</span>&#8221;</span>,
            a function name <span class="quote">&#8220;<span class="quote">bar</span>&#8221;</span> and an index of
            <span class="quote">&#8220;<span class="quote">99</span>&#8221;</span>.
         </p><p>
            This is how the hypothetical action would be configured:
         </p><pre class="programlisting">
&lt;extensions&gt;
   &lt;action&gt;
      &lt;name&gt;database&lt;/name&gt;
      &lt;module&gt;foo&lt;/module&gt;
      &lt;function&gt;bar&lt;/function&gt;
      &lt;index&gt;99&lt;/index&gt;
   &lt;/action&gt;
&lt;/extensions&gt;
         </pre><p>
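            A module and function pair like this one could be resolved to a
            callable roughly as follows.  This is a sketch of the general
            technique, not Cedar Backup's actual loader; the helper name
            <code class="literal">resolve_action</code> is hypothetical, and the
            standard library's <code class="literal">math.sqrt</code> stands in
            for <code class="literal">foo.bar</code>, which does not exist.
         </p><pre class="programlisting">
```python
import importlib


def resolve_action(module_name, function_name):
    """Import module_name and return the callable named function_name."""
    module = importlib.import_module(module_name)
    function = getattr(module, function_name)
    if not callable(function):
        raise TypeError("%s.%s is not callable" % (module_name, function_name))
    return function


# The standard library stands in for the hypothetical foo.bar() here.
sqrt = resolve_action("math", "sqrt")
print(sqrt(16.0))
```
         </pre><p>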
            The following elements are part of the extensions configuration
            section:
         </p><div class="variablelist"><dl class="variablelist"><dt><span class="term"><code class="literal">action</code></span></dt><dd><p>
                     This is a subsection that contains configuration
                     related to a single extended action.
                  </p><p>
                     This section can be repeated as many times as is
                     necessary.  
                  </p><p>
                     The action subsection contains the following fields:
                  </p><div class="variablelist"><dl class="variablelist"><dt><span class="term"><code class="literal">name</code></span></dt><dd><p>Name of the extended action.</p><p>
                              <span class="emphasis"><em>Restrictions:</em></span> Must be a non-empty
                              string consisting of only lower-case letters and digits.
                           </p></dd><dt><span class="term"><code class="literal">module</code></span></dt><dd><p>Name of the Python module associated with the extension function.</p><p>
                              <span class="emphasis"><em>Restrictions:</em></span> Must be a non-empty string
                              and a valid Python identifier.
                           </p></dd><dt><span class="term"><code class="literal">function</code></span></dt><dd><p>Name of the Python extension function within the module.</p><p>
                              <span class="emphasis"><em>Restrictions:</em></span> Must be a non-empty string
                              and a valid Python identifier.
                           </p></dd><dt><span class="term"><code class="literal">index</code></span></dt><dd><p>Index of action, for execution ordering.</p><p>
                              <span class="emphasis"><em>Restrictions:</em></span> Must be an integer &#8805; 0.
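                           </p><p>
                              The restrictions on these four fields could be checked
                              with a sketch like the one below.  It is an illustration,
                              not Cedar Backup's validation code; the function names are
                              hypothetical, and rejecting Python keywords is an extra
                              assumption beyond what the restrictions state.
                           </p><pre class="programlisting">
```python
import keyword
import re

NAME_PATTERN = re.compile(r"[a-z0-9]+")


def is_identifier(value):
    """True for a non-empty string that is a valid Python identifier."""
    # str.isidentifier() accepts keywords such as "class", so exclude them.
    return value.isidentifier() and not keyword.iskeyword(value)


def validate_action(name, module, function, index):
    """Return a list of problems; an empty list means the action looks valid."""
    problems = []
    if not NAME_PATTERN.fullmatch(name):
        problems.append("name must be lower-case letters and digits")
    if not is_identifier(module):
        problems.append("module must be a valid Python identifier")
    if not is_identifier(function):
        problems.append("function must be a valid Python identifier")
    if not (isinstance(index, int) and index >= 0):
        problems.append("index must be an integer >= 0")
    return problems
```
                           </pre><p>
                              For the hypothetical action above,
                              <code class="literal">validate_action("database", "foo", "bar", 99)</code>
                              returns an empty list.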
                           </p></dd></dl></div></dd></dl></div></div></div><div class="sect1"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="cedar-config-poolofone"></a>Setting up a Pool of One</h2></div></div></div><p>
         Cedar Backup has been designed primarily for situations where there is
         a single master and a set of other clients that the master interacts
         with.  However, it will just as easily work for a single machine (a
         backup pool of one).
      </p><p>
         Once you complete all of these configuration steps, your backups will
         run as scheduled out of cron. Any errors that occur will be reported
         in daily emails to your root user (or the user that receives root's
         email). If you don't receive any emails, then you know your backup
         worked.
      </p><p>
         Note: all of these configuration steps should be run as the root user,
         unless otherwise indicated.
      </p><div class="tip" style="margin-left: 0.5in; margin-right: 0.5in;"><h3 class="title">Tip</h3><p>
            This setup procedure discusses how to set up Cedar Backup in the
            <span class="quote">&#8220;<span class="quote">normal case</span>&#8221;</span> for a pool of one.  If you would like to
            modify the way Cedar Backup works (for instance, by ignoring the
            store stage and just letting your backup sit in a staging
            directory), you can do that.  You'll just have to modify the
            procedure below based on information in the remainder of the
            manual.
         </p></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="idp60200128"></a>Step 1: Decide when you will run your backup.</h3></div></div></div><p>
            There are four parts to a Cedar Backup run: collect, stage, store
            and purge. The usual way of setting off these steps is through a
            set of cron jobs.  Although you won't create your cron jobs just
            yet, you should decide now when you will run your backup so you are
            prepared for later.
         </p><p>
            Backing up large directories and creating ISO filesystem images can
            be intensive operations, and could slow your computer down
            significantly. Choose a backup time that will not interfere with
            normal use of your computer.  Usually, you will want the backup to
            occur every day, but it is possible to configure cron to execute
            the backup only one day per week, three days per week, etc.
         </p><div class="warning" style="margin-left: 0.5in; margin-right: 0.5in;"><h3 class="title">Warning</h3><p>
               Because of the way Cedar Backup works, you must ensure that your
               backup <span class="emphasis"><em>always</em></span> runs on the first day of your
               configured week.  This is because Cedar Backup will only clear
               incremental backup information and re-initialize your media when
               running on the first day of the week.  If you skip running Cedar
               Backup on the first day of the week, your backups will likely be
               <span class="quote">&#8220;<span class="quote">confused</span>&#8221;</span> until the next week begins, or until you
               re-run the backup using the <code class="option">--full</code> flag.
            </p></div></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="idp60205360"></a>Step 2: Make sure email works.</h3></div></div></div><p>
            Cedar Backup relies on email for problem notification.  This
            notification works through the magic of cron.  Cron will email any
            output from each job it executes to the user associated with the
            job.  Since by default Cedar Backup only writes output to the
            terminal if errors occur, this ensures that notification emails
            will only be sent out if errors occur.
         </p><p>
            In order to receive problem notifications, you must make sure that
            email works for the user which is running the Cedar Backup cron
            jobs (typically root).  Refer to your distribution's documentation
            for information on how to configure email on your system.  Note
            that you may prefer to configure root's email to forward to some
            other user, so you do not need to check the root user's mail in
            order to see Cedar Backup errors.
         </p></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="idp60208656"></a>Step 3: Configure your writer device.</h3></div></div></div><p>
            Before using Cedar Backup, your writer device must be properly
            configured.  If you have configured your CD/DVD writer hardware to
            work through the normal filesystem device path, then you just need
            to know the path to the device on disk (something like
            <code class="filename">/dev/cdrw</code>).  Cedar Backup will use this
            device path both when talking to a command like
            <span class="command"><strong>cdrecord</strong></span> and when doing filesystem operations
            like running media validation.
         </p><p>
            Your other option is to configure your CD writer hardware like a SCSI
            device (either because it <span class="emphasis"><em>is</em></span> a SCSI device or
            because you are using some sort of interface that makes it look
            like one).  In this case, Cedar Backup will use the SCSI id when
            talking to <span class="command"><strong>cdrecord</strong></span> and the device path when
            running filesystem operations.
         </p><p>
            See <a class="xref" href="#cedar-config-writer" title="Configuring your Writer Device">the section called &#8220;Configuring your Writer Device&#8221;</a> for more information on
            writer devices and how they are configured.
         </p><div class="note" style="margin-left: 0.5in; margin-right: 0.5in;"><h3 class="title">Note</h3><p>
               There is no need to set up your CD/DVD device if you have
               decided not to execute the store action.
            </p><p>
                Due to the underlying utilities that Cedar Backup uses, the
                SCSI id may only be used for CD writers,
                <span class="emphasis"><em>not</em></span> DVD writers.
            </p></div></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="idp60216656"></a>Step 4: Configure your backup user.</h3></div></div></div><p>
             Choose a user to be used for backups. Some platforms may
             come with a <span class="quote">&#8220;<span class="quote">ready made</span>&#8221;</span> backup user. For other
             platforms, you may have to create a user yourself. You may
             choose any id you like, but a descriptive name such as
             <code class="literal">backup</code> or <code class="literal">cback</code> is a good
             choice.  See your distribution's documentation for information on
             how to add a user.
         </p><div class="note" style="margin-left: 0.5in; margin-right: 0.5in;"><h3 class="title">Note</h3><p>
               Standard Debian systems come with a user named
               <code class="literal">backup</code>.  You may choose to stay with this
               user or create another one.
            </p></div></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="idp60221824"></a>Step 5: Create your backup tree.</h3></div></div></div><p>
            Cedar Backup requires a backup directory tree on disk. This
            directory tree must be roughly three times as big as the amount of
            data that will be backed up on a nightly basis, to allow for the
            data to be collected, staged, and then placed into an ISO filesystem
            image on disk. (This is one disadvantage to using Cedar Backup in
            single-machine pools, but in this day of really large hard drives,
            it might not be an issue.) Note that if you elect not to purge the
            staging directory every night, you will need even more space.
         </p><p>
            You should create a collect directory, a staging directory and a
            working (temporary) directory. One recommended layout is this:
         </p><pre class="programlisting">
/opt/
     backup/
            collect/
            stage/
            tmp/
         </pre><p>
            If you will be backing up sensitive information (e.g. password
            files), it is recommended that these directories be owned by the
            backup user (whatever you named it), with permissions
            <code class="literal">700</code>. 
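         </p><p>
            One way to create such a tree is sketched below in Python.  The
            function name <code class="literal">create_backup_tree</code> is
            hypothetical, and the sketch assumes it is run by the backup user
            so that the new directories are owned by that user; changing
            ownership afterwards is left to <span class="command"><strong>chown</strong></span>.
         </p><pre class="programlisting">
```python
import os


def create_backup_tree(root, subdirs=("collect", "stage", "tmp")):
    """Create root and its subdirectories, each with mode 700."""
    paths = [root] + [os.path.join(root, name) for name in subdirs]
    for path in paths:
        os.makedirs(path, exist_ok=True)
        # Explicit chmod, because the mode passed to makedirs is subject to umask.
        os.chmod(path, 0o700)
    return paths
```
         </pre><p>
            For the layout shown above, the call would be
            <code class="literal">create_backup_tree("/opt/backup")</code>.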
         </p><div class="note" style="margin-left: 0.5in; margin-right: 0.5in;"><h3 class="title">Note</h3><p>
               You don't have to use <code class="filename">/opt</code> as the root of your
               directory structure.  Use anything you would like.  I use
               <code class="filename">/opt</code> because it is my <span class="quote">&#8220;<span class="quote">dumping
               ground</span>&#8221;</span> for filesystems that Debian does not manage.
            </p><p>
               Some users have requested that the Debian packages set up a more
               <span class="quote">&#8220;<span class="quote">standard</span>&#8221;</span> location for backups right
               out-of-the-box.  I have resisted doing this because it's
               difficult to choose an appropriate backup location from within
               the package.  If you would prefer, you can create the backup
               directory structure within some existing Debian directory such
               as <code class="filename">/var/backups</code> or
               <code class="filename">/var/tmp</code>.
            </p></div></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="idp60231984"></a>Step 6: Create the Cedar Backup configuration file.</h3></div></div></div><p>
            Following the instructions in <a class="xref" href="#cedar-config-configfile" title="Configuration File Format">the section called &#8220;Configuration File Format&#8221;</a> (above) create a configuration
            file for your machine.  Since you are working with a pool of one,
            you must configure all four action-specific sections: collect,
            stage, store and purge.
         </p><p>
            The usual location for the Cedar Backup config file is
            <code class="filename">/etc/cback.conf</code>.  If you change the location,
            make sure you edit your cronjobs (below) to point the
            <span class="command"><strong>cback</strong></span> script at the correct config file (using
            the <code class="option">--config</code> option).
         </p><div class="warning" style="margin-left: 0.5in; margin-right: 0.5in;"><h3 class="title">Warning</h3><p>
               Configuration files should always be writable only by root
               (or by the file owner, if the owner is not root).
            </p><p>
               If you intend to place confidential information into the Cedar
               Backup configuration file, make sure that you set the filesystem
               permissions on the file appropriately.  For instance, if you
               configure any extensions that require passwords or other similar
               information, you should make the file readable only to root or
               to the file owner (if the owner is not root).
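            </p><p>
               These permissions can be checked with a short Python sketch.
               It is a convenience for illustration, not part of Cedar Backup,
               and the function name
               <code class="literal">config_permission_problems</code> is
               hypothetical.
            </p><pre class="programlisting">
```python
import os
import stat


def config_permission_problems(path, confidential=False):
    """Return a list of permission problems for a configuration file."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    problems = []
    # The file should be writable only by its owner.
    if mode & (stat.S_IWGRP | stat.S_IWOTH):
        problems.append("writable by group or others")
    # With confidential contents, it should also be readable only by its owner.
    if confidential and mode & (stat.S_IRGRP | stat.S_IROTH):
        problems.append("readable by group or others")
    return problems
```
            </pre><p>
               A file with mode <code class="literal">644</code> passes the
               default check but is reported as readable by group or others
               when <code class="literal">confidential=True</code>; mode
               <code class="literal">600</code> passes both.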
            </p></div></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="idp60238336"></a>Step 7: Validate the Cedar Backup configuration file.</h3></div></div></div><p>
            Use the command <span class="command"><strong>cback validate</strong></span> to validate your
            configuration file. This command checks that the configuration file
            can be found and parsed, and also checks for typical configuration
            problems, such as invalid CD/DVD device entries.
         </p><p>
            Note: the most common cause of configuration problems is failing
            to close XML tags properly. Any XML tag that is
            <span class="quote">&#8220;<span class="quote">opened</span>&#8221;</span> must be <span class="quote">&#8220;<span class="quote">closed</span>&#8221;</span> appropriately.
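         </p><p>
            Unclosed tags can also be caught independently with Python's
            standard XML parser, as in the sketch below.  It checks
            well-formedness only and is no substitute for
            <span class="command"><strong>cback validate</strong></span>; the function
            name <code class="literal">xml_problem</code> is hypothetical.
         </p><pre class="programlisting">
```python
import xml.etree.ElementTree as ET


def xml_problem(path):
    """Return None if the file is well-formed XML, else the parser's message."""
    try:
        ET.parse(path)
    except ET.ParseError as error:
        return str(error)
    return None
```
         </pre><p>
            Calling <code class="literal">xml_problem("/etc/cback.conf")</code>
            returns <code class="literal">None</code> for a well-formed file and
            the parser's error message otherwise.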
         </p></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="idp60242016"></a>Step 8: Test your backup.</h3></div></div></div><p>
            Place a valid CD/DVD disc in your drive, and then use the
            command <span class="command"><strong>cback --full all</strong></span>.  You should execute
            this command as root.  If the command completes with no output,
            then the backup was run successfully.
         </p><p>
            Just to be sure that everything worked properly, check the logfile
            (<code class="filename">/var/log/cback.log</code>) for errors and also mount
            the CD/DVD disc to be sure it can be read.
         </p><p>
            <span class="emphasis"><em>If Cedar Backup ever completes <span class="quote">&#8220;<span class="quote">normally</span>&#8221;</span>
            but the disc that is created is not usable, please report this as a
            bug.
            <a href="#ftn.cedar-config-foot-bugs" class="footnote" name="cedar-config-foot-bugs"><sup class="footnote">[22]</sup></a>
            To be safe, always enable the consistency check option in the
            store configuration section.</em></span>
         </p></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="idp60248384"></a>Step 9: Modify the backup cron jobs.</h3></div></div></div><p>
            Since Cedar Backup should be run as root, one way to configure the
            cron job is to add a line like this to your
            <code class="filename">/etc/crontab</code> file:
         </p><pre class="programlisting">
30 00 * * * root  cback all
         </pre><p>
            Or, you can create an executable script containing just these lines
            and place that file in the <code class="filename">/etc/cron.daily</code>
            directory:
         </p><pre class="programlisting">
#!/bin/sh
cback all
         </pre><p>
            You should consider adding the <code class="option">--output</code> or
            <code class="option">-O</code> switch to your <span class="command"><strong>cback</strong></span>
            command-line in cron.  This will result in larger logs, but could
            help diagnose problems when commands like
            <span class="command"><strong>cdrecord</strong></span> or <span class="command"><strong>mkisofs</strong></span> fail
            mysteriously.
         </p><div class="note" style="margin-left: 0.5in; margin-right: 0.5in;"><h3 class="title">Note</h3><p>
               For general information about using cron, see the manpage for
               crontab(5).
            </p><p>
               On a Debian system, execution of daily backups is controlled by
               the file <code class="filename">/etc/cron.d/cedar-backup2</code>.  As
               installed, this file contains several different settings, all
               commented out.  Uncomment the <span class="quote">&#8220;<span class="quote">Single machine (pool of
               one)</span>&#8221;</span> entry in the file, and change the line so that the
               backup goes off when you want it to.
            </p></div></div></div><div class="sect1"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="cedar-config-client"></a>Setting up a Client Peer Node</h2></div></div></div><p>
         Cedar Backup has been designed to back up entire <span class="quote">&#8220;<span class="quote">pools</span>&#8221;</span>
         of machines.  In any given pool, there is one master and some number
         of clients.  Most of the work takes place on the master, so
         configuring a client is a little simpler than configuring a master.
      </p><p>
         Backups are designed to take place over an RSH or SSH connection.
         Because RSH is generally considered insecure, you are encouraged to
         use SSH rather than RSH. This document will only describe how to
         configure Cedar Backup to use SSH; if you want to use RSH, you're on
         your own. 
      </p><p>
         Once you complete all of these configuration steps, your backups will
         run as scheduled out of cron. Any errors that occur will be reported
         in daily emails to your root user (or the user that receives root's
         email). If you don't receive any emails, then you know your backup
         worked.
      </p><p>
         Note: all of these configuration steps should be run as the root user,
         unless otherwise indicated.
      </p><div class="note" style="margin-left: 0.5in; margin-right: 0.5in;"><h3 class="title">Note</h3><p>
            See <a class="xref" href="#cedar-securingssh" title="Appendix D. Securing Password-less SSH Connections">Appendix D, <i>Securing Password-less SSH Connections</i></a> for some important notes on
            how to optionally further secure password-less SSH connections to
            your clients.
         </p></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="idp60264304"></a>Step 1: Decide when you will run your backup.</h3></div></div></div><p>
            There are four parts to a Cedar Backup run: collect, stage, store
            and purge. The usual way of setting off these steps is through a
            set of cron jobs.  Although you won't create your cron jobs just
            yet, you should decide now when you will run your backup so you are
            prepared for later.
         </p><p>
            Backing up large directories and creating ISO filesystem images can be
            intensive operations, and could slow your computer down
            significantly. Choose a backup time that will not interfere with
            normal use of your computer.  Usually, you will want the backup to
            occur every day, but it is possible to configure cron to execute
            the backup only one day per week, three days per week, etc.
         </p><div class="warning" style="margin-left: 0.5in; margin-right: 0.5in;"><h3 class="title">Warning</h3><p>
               Because of the way Cedar Backup works, you must ensure that your
               backup <span class="emphasis"><em>always</em></span> runs on the first day of your
               configured week.  This is because Cedar Backup will only clear
               incremental backup information and re-initialize your media when
               running on the first day of the week.  If you skip running Cedar
               Backup on the first day of the week, your backups will likely be
               <span class="quote">&#8220;<span class="quote">confused</span>&#8221;</span> until the next week begins, or until you
               re-run the backup using the <code class="option">--full</code> flag.
            </p></div></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="idp60269536"></a>Step 2: Make sure email works.</h3></div></div></div><p>
            Cedar Backup relies on email for problem notification.  This
            notification works through the magic of cron.  Cron will email any
            output from each job it executes to the user associated with the
            job.  Since by default Cedar Backup only writes output to the
            terminal if errors occur, this neatly ensures that notification
            emails will only be sent out if errors occur.
         </p><p>
            In order to receive problem notifications, you must make sure that
            email works for the user which is running the Cedar Backup cron
            jobs (typically root).  Refer to your distribution's documentation
            for information on how to configure email on your system.  Note
            that you may prefer to configure root's email to forward to some
            other user, so you do not need to check the root user's mail in
            order to see Cedar Backup errors.
         </p></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="idp59575344"></a>Step 3: Configure the master in your backup pool.</h3></div></div></div><p>
            You will not be able to complete the client configuration until at
            least step 3 of the master's configuration has been completed. In
            particular, you will need to know the master's public SSH identity
            to fully configure a client.
         </p><p>
            To find the master's public SSH identity, log in as the backup
            user on the master and <span class="command"><strong>cat</strong></span> the public identity
            file <code class="filename">~/.ssh/id_rsa.pub</code>:
         </p><pre class="programlisting">
user@machine&gt; cat ~/.ssh/id_rsa.pub
ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAIEA0vOKjlfwohPg1oPRdrmwHk75l3mI9Tb/WRZfVnu2Pw69
uyphM9wBLRo6QfOC2T8vZCB8o/ZIgtAM3tkM0UgQHxKBXAZ+H36TOgg7BcI20I93iGtzpsMA/uXQy8kH
HgZooYqQ9pw+ZduXgmPcAAv2b5eTm07wRqFt/U84k6bhTzs= user@machine
         </pre></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="idp59579776"></a>Step 4: Configure your backup user.</h3></div></div></div><p>
             Choose a user to be used for backups. Some platforms may come with
             a "ready made" backup user. For other platforms, you may have to
             create a user yourself. You may choose any id you like, but a
             descriptive name such as <code class="literal">backup</code> or
             <code class="literal">cback</code> is a good choice.  See your
             distribution's documentation for information on how to add a user.
         </p><div class="note" style="margin-left: 0.5in; margin-right: 0.5in;"><h3 class="title">Note</h3><p>
               Standard Debian systems come with a user named
               <code class="literal">backup</code>.  You may choose to stay with this
               user or create another one.
            </p></div><p>
             Once you have created your backup user, you must create an SSH
             keypair for it. Log in as your backup user, and then run the
             command <span class="command"><strong>ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa</strong></span>:
         </p><pre class="programlisting">
user@machine&gt; ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa
Generating public/private rsa key pair.
Created directory '/home/user/.ssh'.
Your identification has been saved in /home/user/.ssh/id_rsa.
Your public key has been saved in /home/user/.ssh/id_rsa.pub.
The key fingerprint is:
11:3e:ad:72:95:fe:96:dc:1e:3b:f4:cc:2c:ff:15:9e user@machine
         </pre><p>
            The default permissions for this directory should be fine.
            However, if the directory existed before you ran
            <span class="command"><strong>ssh-keygen</strong></span>, then you may need to modify the
            permissions.  Make sure that the <code class="filename">~/.ssh</code>
            directory is readable only by the backup user (i.e. mode
            <code class="literal">700</code>), that the
            <code class="filename">~/.ssh/id_rsa</code> file is readable and
            writable only by the backup user (i.e. mode <code class="literal">600</code>)
            and that the <code class="filename">~/.ssh/id_rsa.pub</code> file is
            writable only by the backup user (i.e. mode <code class="literal">600</code>
            or mode <code class="literal">644</code>).
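         </p><p>
            If the permissions do need fixing, a sketch like the following
            applies the modes listed above.  The function name
            <code class="literal">secure_ssh_dir</code> is hypothetical, and
            the sketch assumes it is run as the backup user, who owns the
            files.
         </p><pre class="programlisting">
```python
import os

# Modes from the text above; 644 is the looser of the two options
# given for the public key.  The empty name stands for the directory itself.
SSH_MODES = {"": 0o700, "id_rsa": 0o600, "id_rsa.pub": 0o644}


def secure_ssh_dir(ssh_dir):
    """Apply the recommended modes to ssh_dir and the key files found in it."""
    changed = []
    for name, mode in SSH_MODES.items():
        path = os.path.join(ssh_dir, name) if name else ssh_dir
        if os.path.exists(path):
            os.chmod(path, mode)
            changed.append(path)
    return changed
```
         </pre><p>
            A typical call would be
            <code class="literal">secure_ssh_dir(os.path.expanduser("~/.ssh"))</code>.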
         </p><p>
            Finally, take the master's public SSH identity (which you found in
            step 3) and cut-and-paste it into the file
            <code class="filename">~/.ssh/authorized_keys</code>.  Make sure the
            identity value is pasted into the file <span class="emphasis"><em>all on one
            line</em></span>, and that the <code class="filename">authorized_keys</code>
            file is owned by your backup user and has permissions
            <code class="literal">600</code>.
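         </p><p>
            Because a key copied from a terminal is easily pasted with line
            breaks in it, the step can also be scripted.  The sketch below is
            one possible approach, with a hypothetical function name
            <code class="literal">install_public_key</code>; it assumes that
            any existing <code class="filename">authorized_keys</code> file
            ends with a newline and that it is run as the backup user.
         </p><pre class="programlisting">
```python
import os


def install_public_key(ssh_dir, key_text):
    """Append key_text to authorized_keys as a single line with mode 600."""
    # Remove line breaks introduced by terminal wrapping or pasting.
    one_line = "".join(key_text.splitlines()).strip()
    key_type, _, rest = one_line.partition(" ")
    if not rest.strip():
        raise ValueError("expected: key-type base64-data [comment]")
    path = os.path.join(ssh_dir, "authorized_keys")
    with open(path, "a") as handle:
        handle.write(one_line + "\n")
    os.chmod(path, 0o600)
    return one_line
```
         </pre><p>
            The function returns the single line it wrote, so the result can
            be compared against the master's
            <code class="filename">id_rsa.pub</code> file.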
         </p><p>
            If you have other preferences or standard ways of setting up your
            users' SSH configuration (e.g. a different key type), feel free
            to do things your way.  The important part is that the master must
            be able to SSH into a client <span class="emphasis"><em>with no password entry
            required</em></span>.
         </p></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="idp60301904"></a>Step 5: Create your backup tree.</h3></div></div></div><p>
            Cedar Backup requires a backup directory tree on disk. This
            directory tree must be roughly as big as the amount of data that
            will be backed up on a nightly basis (more if you elect not to
            purge it all every night).
         </p><p>
            You should create a collect directory and a working (temporary)
            directory. One recommended layout is this:
         </p><pre class="programlisting">
/opt/
     backup/
            collect/
            tmp/
         </pre><p>
            If you will be backing up sensitive information (e.g. password
            files), it is recommended that these directories be owned by the
            backup user (whatever you named it), with permissions
            <code class="literal">700</code>. 
         </p><div class="note" style="margin-left: 0.5in; margin-right: 0.5in;"><h3 class="title">Note</h3><p>
               You don't have to use <code class="filename">/opt</code> as the root of your
               directory structure.  Use anything you would like.  I use
               <code class="filename">/opt</code> because it is my <span class="quote">&#8220;<span class="quote">dumping
               ground</span>&#8221;</span> for filesystems that Debian does not manage.
            </p><p>
               Some users have requested that the Debian packages set up a more
               "standard" location for backups right out-of-the-box.  I have
               resisted doing this because it's difficult to choose an
               appropriate backup location from within the package.  If you
               would prefer, you can create the backup directory structure
               within some existing Debian directory such as
               <code class="filename">/var/backups</code> or
               <code class="filename">/var/tmp</code>.
            </p></div></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="idp60310656"></a>Step 6: Create the Cedar Backup configuration file.</h3></div></div></div><p>
            Following the instructions in <a class="xref" href="#cedar-config-configfile" title="Configuration File Format">the section called &#8220;Configuration File Format&#8221;</a> (above), create a configuration
            file for your machine.  Since you are working with a client, you
            must configure all action-specific sections for the collect and
            purge actions.
         </p><p>
            The usual location for the Cedar Backup config file is
            <code class="filename">/etc/cback.conf</code>.  If you change the location,
            make sure you edit your cronjobs (below) to point the
            <span class="command"><strong>cback</strong></span> script at the correct config file (using
            the <code class="option">--config</code> option).
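</p><p>
            As an illustration, a cron entry that points
            <span class="command"><strong>cback</strong></span> at a non-default file might look
            like this.  The path <code class="filename">/etc/backup/cback.conf</code>
            is only an assumed example location:
         </p><pre class="programlisting">
# Assumed example path; substitute the location you actually chose
30 00 * * * root  cback --config /etc/backup/cback.conf collect
         </pre><p>
            Every <span class="command"><strong>cback</strong></span> line in your crontab
            needs the same option, otherwise some actions will still read
            the default file.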
         </p><div class="warning" style="margin-left: 0.5in; margin-right: 0.5in;"><h3 class="title">Warning</h3><p>
               Configuration files should always be writable only by root
               (or by the file owner, if the owner is not root).
            </p><p>
               If you intend to place confidential information into the Cedar
               Backup configuration file, make sure that you set the filesystem
               permissions on the file appropriately.  For instance, if you
               configure any extensions that require passwords or other similar
               information, you should make the file readable only to root or
               to the file owner (if the owner is not root).
            </p></div></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="idp60317008"></a>Step 7: Validate the Cedar Backup configuration file.</h3></div></div></div><p>
            Use the command <span class="command"><strong>cback validate</strong></span> to validate your
            configuration file. This command checks that the configuration file
            can be found and parsed, and also checks for typical configuration
            problems.  This command <span class="emphasis"><em>only</em></span> validates
            configuration on the one client, not the master or any other
            clients in a pool.
         </p><p>
            Note: the most common cause of configuration problems is failing to
            close XML tags properly. Any XML tag that is
            <span class="quote">&#8220;<span class="quote">opened</span>&#8221;</span> must be <span class="quote">&#8220;<span class="quote">closed</span>&#8221;</span> appropriately.
         </p></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="idp60321184"></a>Step 8: Test your backup.</h3></div></div></div><p>
            Use the command <span class="command"><strong>cback --full collect purge</strong></span>.  If the 
            command completes with no output, then the backup was run successfully.
            Just to be sure that everything worked properly, check the logfile 
            (<code class="filename">/var/log/cback.log</code>) for errors.
         </p></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="idp60324128"></a>Step 9: Modify the backup cron jobs.</h3></div></div></div><p>
            Since Cedar Backup should be run as root, you should add a set of
            lines like this to your <code class="filename">/etc/crontab</code> file:
         </p><pre class="programlisting">
30 00 * * * root  cback collect
30 06 * * * root  cback purge
         </pre><p>
            You should consider adding the <code class="option">--output</code> or
            <code class="option">-O</code> switch to your <span class="command"><strong>cback</strong></span>
            command-line in cron.  This will result in larger logs, but could
            help diagnose problems when commands like
            <span class="command"><strong>cdrecord</strong></span> or <span class="command"><strong>mkisofs</strong></span> fail
            mysteriously.
         </p><p>
            You will need to coordinate the collect and purge actions on the
            client so that the collect action completes before the master
            attempts to stage, and so that the purge action does not begin
            until after the master has completed staging.  Usually, allowing an
            hour or two between steps should be sufficient.  <a href="#ftn.cedar-config-foot-coordinate" class="footnote" name="cedar-config-foot-coordinate"><sup class="footnote">[23]</sup></a>
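</p><p>
            To make the ordering concrete, here is a sketch of how the
            client entries above might line up against a master that stages
            at 02:30 and stores at 04:30.  The master times are assumptions
            chosen for illustration; use whatever your own master is
            configured to do:
         </p><pre class="programlisting">
# Client /etc/crontab
# Collect must finish before the master stages (assumed 02:30).
30 00 * * * root  cback collect
# Purge must not start until the master has finished staging.
30 06 * * * root  cback purge
         </pre><p>
            If a collect run on a large client regularly takes longer than the
            gap allows, move the client's collect earlier rather than
            shortening the gap.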
         </p><div class="note" style="margin-left: 0.5in; margin-right: 0.5in;"><h3 class="title">Note</h3><p>
               For general information about using cron, see the manpage for
               crontab(5).
            </p><p>
               On a Debian system, execution of daily backups is controlled by
               the file <code class="filename">/etc/cron.d/cedar-backup2</code>.  As
               installed, this file contains several different settings, all
               commented out.  Uncomment the <span class="quote">&#8220;<span class="quote">Client machine</span>&#8221;</span>
               entries in the file, and change the lines so that the backup
               goes off when you want it to.
            </p></div></div></div><div class="sect1"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="cedar-config-master"></a>Setting up a Master Peer Node</h2></div></div></div><p>
         Cedar Backup has been designed to back up entire <span class="quote">&#8220;<span class="quote">pools</span>&#8221;</span>
         of machines.  In any given pool, there is one master and some number
         of clients.  Most of the work takes place on the master, so
         configuring a master is somewhat more complicated than configuring a
         client.
      </p><p>
         Backups are designed to take place over an RSH or SSH connection.
         Because RSH is generally considered insecure, you are encouraged to
         use SSH rather than RSH. This document will only describe how to
         configure Cedar Backup to use SSH; if you want to use RSH, you're on
         your own. 
      </p><p>
         Once you complete all of these configuration steps, your backups will
         run as scheduled out of cron. Any errors that occur will be reported
         in daily emails to your root user (or whichever other user receives
         root's email). If you don't receive any emails, then you know your
         backup worked.
      </p><p>
         Note: all of these configuration steps should be run as the root user,
         unless otherwise indicated.
      </p><div class="tip" style="margin-left: 0.5in; margin-right: 0.5in;"><h3 class="title">Tip</h3><p>
            This setup procedure discusses how to set up Cedar Backup in the
            <span class="quote">&#8220;<span class="quote">normal case</span>&#8221;</span> for a master.  If you would like to
            modify the way Cedar Backup works (for instance, by ignoring the
            store stage and just letting your backup sit in a staging
            directory), you can do that.  You'll just have to modify the
            procedure below based on information in the remainder of the
            manual.
         </p></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="idp60341344"></a>Step 1: Decide when you will run your backup.</h3></div></div></div><p>
            There are four parts to a Cedar Backup run: collect, stage, store
            and purge. The usual way of setting off these steps is through a
            set of cron jobs.  Although you won't create your cron jobs just
            yet, you should decide now when you will run your backup so you are
            prepared for later.
         </p><p>
            Keep in mind that you do not necessarily have to run the collect
            action on the master.  See notes further below for more
            information.
         </p><p>
            Backing up large directories and creating ISO filesystem images can be
            intensive operations, and could slow your computer down
            significantly. Choose a backup time that will not interfere with
            normal use of your computer.  Usually, you will want the backup to
            occur every day, but it is possible to configure cron to execute
            the backup only one day per week, three days per week, etc.
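</p><p>
            For instance, a crontab entry that runs the collect action only
            on Mondays, Wednesdays and Fridays could look like the sketch
            below.  It assumes that your configured week starts on Monday,
            so that the first day of the week is always included:
         </p><pre class="programlisting">
# Day-of-week field: 1 = Monday, 3 = Wednesday, 5 = Friday
# Assumes the configured backup week starts on Monday.
30 00 * * 1,3,5 root  cback collect
         </pre><p>
            The other actions would need the same day-of-week field so
            that all parts of the backup run on the same days.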
         </p><div class="warning" style="margin-left: 0.5in; margin-right: 0.5in;"><h3 class="title">Warning</h3><p>
               Because of the way Cedar Backup works, you must ensure that your
               backup <span class="emphasis"><em>always</em></span> runs on the first day of your
               configured week.  This is because Cedar Backup will only clear
               incremental backup information and re-initialize your media when
               running on the first day of the week.  If you skip running Cedar
               Backup on the first day of the week, your backups will likely be
               <span class="quote">&#8220;<span class="quote">confused</span>&#8221;</span> until the next week begins, or until you
               re-run the backup using the <code class="option">--full</code> flag.
            </p></div></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="idp60347152"></a>Step 2: Make sure email works.</h3></div></div></div><p>
            Cedar Backup relies on email for problem notification.  This
            notification works through the magic of cron.  Cron will email any
            output from each job it executes to the user associated with the
            job.  Since by default Cedar Backup only writes output to the
            terminal if errors occur, this neatly ensures that notification
            emails will only be sent out if errors occur.
         </p><p>
            In order to receive problem notifications, you must make sure that
            email works for the user that runs the Cedar Backup cron
            jobs (typically root).  Refer to your distribution's documentation
            for information on how to configure email on your system.  Note
            that you may prefer to configure root's email to forward to some
            other user, so you do not need to check the root user's mail in
            order to see Cedar Backup errors.
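</p><p>
            On systems whose mail software reads
            <code class="filename">/etc/aliases</code>, forwarding root's
            mail can be as simple as the entry sketched below.  The
            recipient name <code class="literal">admin</code> is hypothetical,
            and your distribution may use a different mechanism:
         </p><pre class="programlisting">
# /etc/aliases (hypothetical recipient)
root: admin
         </pre><p>
            After editing the file you typically need to run
            <span class="command"><strong>newaliases</strong></span> for the change to take
            effect.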
         </p></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="idp60350016"></a>Step 3: Configure your writer device.</h3></div></div></div><p>
            Before using Cedar Backup, your writer device must be properly
            configured.  If you have configured your CD/DVD writer hardware to
            work through the normal filesystem device path, then you just need
            to know the path to the device on disk (something like
            <code class="filename">/dev/cdrw</code>).  Cedar Backup will use this
            device path both when talking to a command like
            <span class="command"><strong>cdrecord</strong></span> and when doing filesystem operations
            like running media validation.
         </p><p>
            Your other option is to configure your CD writer hardware like a SCSI
            device (either because it <span class="emphasis"><em>is</em></span> a SCSI device or
            because you are using some sort of interface that makes it look
            like one).  In this case, Cedar Backup will use the SCSI id when
            talking to <span class="command"><strong>cdrecord</strong></span> and the device path when
            running filesystem operations.
         </p><p>
            See <a class="xref" href="#cedar-config-writer" title="Configuring your Writer Device">the section called &#8220;Configuring your Writer Device&#8221;</a> for more information on
            writer devices and how they are configured.
         </p><div class="note" style="margin-left: 0.5in; margin-right: 0.5in;"><h3 class="title">Note</h3><p>
               There is no need to set up your CD/DVD device if you have
               decided not to execute the store action.
            </p><p>
                Due to the underlying utilities that Cedar Backup uses, the
                SCSI id may only be used for CD writers,
                <span class="emphasis"><em>not</em></span> DVD writers.
            </p></div></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="idp60358016"></a>Step 4: Configure your backup user.</h3></div></div></div><p>
             Choose a user to be used for backups. Some platforms may come with
             a <span class="quote">&#8220;<span class="quote">ready made</span>&#8221;</span> backup user. For other platforms, you
             may have to create a user yourself. You may choose any id you
             like, but a descriptive name such as <code class="literal">backup</code> or
             <code class="literal">cback</code> is a good choice.  See your
             distribution's documentation for information on how to add a user.
         </p><div class="note" style="margin-left: 0.5in; margin-right: 0.5in;"><h3 class="title">Note</h3><p>
               Standard Debian systems come with a user named
               <code class="literal">backup</code>.  You may choose to stay with this
               user or create another one.
            </p></div><p>
             Once you have created your backup user, you must create an SSH
             keypair for it. Log in as your backup user, and then run the
             command <span class="command"><strong>ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa</strong></span>:
         </p><pre class="programlisting">
user@machine&gt; ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa
Generating public/private rsa key pair.
Created directory '/home/user/.ssh'.
Your identification has been saved in /home/user/.ssh/id_rsa.
Your public key has been saved in /home/user/.ssh/id_rsa.pub.
The key fingerprint is:
11:3e:ad:72:95:fe:96:dc:1e:3b:f4:cc:2c:ff:15:9e user@machine
         </pre><p>
            The default permissions for the <code class="filename">~/.ssh</code> directory should be fine.
            However, if the directory existed before you ran
            <span class="command"><strong>ssh-keygen</strong></span>, then you may need to modify the
            permissions.  Make sure that the <code class="filename">~/.ssh</code>
            directory is readable only by the backup user (i.e. mode
            <code class="literal">700</code>), that the
            <code class="filename">~/.ssh/id_rsa</code> file is only readable and
            writable by the backup user (i.e. mode <code class="literal">600</code>) and
            that the <code class="filename">~/.ssh/id_rsa.pub</code> file is writable
            only by the backup user (i.e. mode <code class="literal">600</code> or mode
            <code class="literal">644</code>).
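</p><p>
            The permissions described above can be applied with a few
            <span class="command"><strong>chmod</strong></span> calls.  This is only a sketch;
            the function name <code class="literal">fix_ssh_perms</code> is
            hypothetical, and it assumes the key files use the default names
            shown in the example output:
         </p><pre class="programlisting">
# Hypothetical helper, not part of Cedar Backup or OpenSSH.
# Applies the modes described above to an existing SSH directory.
fix_ssh_perms() {
    ssh_dir="$1"
    chmod 700 "$ssh_dir"             # directory: backup user only
    chmod 600 "$ssh_dir/id_rsa"      # private key: backup user only
    chmod 644 "$ssh_dir/id_rsa.pub"  # public key: writable by owner only
}
         </pre><p>
            Run it as the backup user, for example
            <code class="literal">fix_ssh_perms ~/.ssh</code>.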
         </p><p>
            If you have other preferences or standard ways of setting up your
            users' SSH configuration (e.g. a different key type), feel free
            to do things your way.  The important part is that the master must
            be able to SSH into a client <span class="emphasis"><em>with no password entry
            required</em></span>.
         </p></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="idp60372064"></a>Step 5: Create your backup tree.</h3></div></div></div><p>
            Cedar Backup requires a backup directory tree on disk. This
            directory tree must be roughly large enough to hold twice as much data
            as will be backed up from the entire pool on a given night, plus
            space for whatever is collected on the master itself. This will
            allow for all three operations - collect, stage and store - to have
            enough space to complete. Note that if you elect not to purge the
            staging directory every night, you will need even more space.
         </p><p>
            You should create a collect directory, a staging directory and a
            working (temporary) directory. One recommended layout is this:
         </p><pre class="programlisting">
/opt/
     backup/
            collect/
            stage/
            tmp/
         </pre><p>
            If you will be backing up sensitive information (e.g. password
            files), it is recommended that these directories be owned by the
            backup user (whatever you named it), with permissions
            <code class="literal">700</code>. 
         </p><div class="note" style="margin-left: 0.5in; margin-right: 0.5in;"><h3 class="title">Note</h3><p>
               You don't have to use <code class="filename">/opt</code> as the root of your
               directory structure.  Use anything you would like.  I use
               <code class="filename">/opt</code> because it is my <span class="quote">&#8220;<span class="quote">dumping
               ground</span>&#8221;</span> for filesystems that Debian does not manage.
            </p><p>
               Some users have requested that the Debian packages set up a more
               <span class="quote">&#8220;<span class="quote">standard</span>&#8221;</span> location for backups right
               out-of-the-box.  I have resisted doing this because it's
               difficult to choose an appropriate backup location from within
               the package.  If you would prefer, you can create the backup
               directory structure within some existing Debian directory such
               as <code class="filename">/var/backups</code> or
               <code class="filename">/var/tmp</code>.
            </p></div></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="idp60381504"></a>Step 6: Create the Cedar Backup configuration file.</h3></div></div></div><p>
            Following the instructions in 
            <a class="xref" href="#cedar-config-configfile" title="Configuration File Format">the section called &#8220;Configuration File Format&#8221;</a> (above), create a
            configuration file for your machine.  Since you are working with a
            master machine, you would typically configure all four
            action-specific sections: collect, stage, store and purge.
         </p><div class="note" style="margin-left: 0.5in; margin-right: 0.5in;"><h3 class="title">Note</h3><p>
               Note that the master can treat itself as a <span class="quote">&#8220;<span class="quote">client</span>&#8221;</span>
               peer for certain actions.  As an example, if you run the collect
               action on the master, then you will stage that data by
               configuring a local peer representing the master.
            </p><p>
               Something else to keep in mind is that you do not really have to
               run the collect action on the master.  For instance, you may
               prefer to just use your master machine as a <span class="quote">&#8220;<span class="quote">consolidation
               point</span>&#8221;</span> machine that just collects data from the other
               client machines in a backup pool.  In that case, there is no
               need to collect data on the master itself.
            </p></div><p>
            The usual location for the Cedar Backup config file is
            <code class="filename">/etc/cback.conf</code>.  If you change the location,
            make sure you edit your cronjobs (below) to point the
            <span class="command"><strong>cback</strong></span> script at the correct config file (using
            the <code class="option">--config</code> option).
         </p><div class="warning" style="margin-left: 0.5in; margin-right: 0.5in;"><h3 class="title">Warning</h3><p>
               Configuration files should always be writable only by root
               (or by the file owner, if the owner is not root).
            </p><p>
               If you intend to place confidential information into the Cedar
               Backup configuration file, make sure that you set the filesystem
               permissions on the file appropriately.  For instance, if you
               configure any extensions that require passwords or other similar
               information, you should make the file readable only to root or
               to the file owner (if the owner is not root).
            </p></div></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="idp60390640"></a>Step 7: Validate the Cedar Backup configuration file.</h3></div></div></div><p>
            Use the command <span class="command"><strong>cback validate</strong></span> to validate your
            configuration file. This command checks that the configuration file
            can be found and parsed, and also checks for typical configuration
            problems, such as invalid CD/DVD device entries.  This command
            <span class="emphasis"><em>only</em></span> validates configuration on the master,
            not any clients that the master might be configured to connect to.
         </p><p>
            Note: the most common cause of configuration problems is failing to
            close XML tags properly. Any XML tag that is
            <span class="quote">&#8220;<span class="quote">opened</span>&#8221;</span> must be <span class="quote">&#8220;<span class="quote">closed</span>&#8221;</span> appropriately.
         </p></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="idp60394880"></a>Step 8: Test connectivity to client machines.</h3></div></div></div><p>
            This step must wait until after your client machines have been at
            least partially configured. Once the backup user(s) have been
            configured on the client machine(s) in a pool, attempt an SSH
            connection to each client. 
         </p><p>
            Log in as the backup user on the master, and then use the command
            <span class="command"><strong>ssh user@machine</strong></span> where
            <em class="replaceable"><code>user</code></em> is the name of the backup user
            <span class="emphasis"><em>on the client machine</em></span>, and
            <em class="replaceable"><code>machine</code></em> is the name of the client
            machine.
         </p><p>
            If you are able to log in successfully to each client without
            entering a password, then things have been configured properly.
            Otherwise, double-check that you followed the user setup
            instructions for the master and the clients.
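</p><p>
            If you have several clients, a loop can save some typing.  The
            sketch below is not part of Cedar Backup, and the function name
            <code class="literal">check_clients</code> is hypothetical.  It relies
            on the OpenSSH <code class="literal">BatchMode</code> option, which makes
            <span class="command"><strong>ssh</strong></span> fail rather than prompt for a
            password:
         </p><pre class="programlisting">
# Hypothetical helper: check_clients USER HOST [HOST...]
# Reports which clients accept a password-less SSH login.
check_clients() {
    user="$1"
    shift
    failed=0
    for host in "$@"; do
        # BatchMode=yes: never prompt, so a missing key is a failure
        if ssh -o BatchMode=yes "$user@$host" true; then
            echo "ok: $host"
        else
            echo "FAILED: $host"
            failed=1
        fi
    done
    return "$failed"
}
         </pre><p>
            Run it as the backup user on the master, for example
            <code class="literal">check_clients backup client1 client2</code>
            (the host names are placeholders).  A client whose host key has
            not been accepted yet will also be reported as failed, so connect
            to each client by hand once first.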
         </p></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="idp60399696"></a>Step 9: Test your backup.</h3></div></div></div><p>
             Make sure that you have configured all of the clients in your
             backup pool. On all of the clients, execute <span class="command"><strong>cback --full
             collect</strong></span>.  (You will probably have already tested this
             command on each of the clients, so it should succeed.)
         </p><p>
            When all of the client backups have completed, place a valid CD/DVD
            disc in your drive, and then use the command <span class="command"><strong>cback --full
            all</strong></span>.  You should execute this command as root.  If the
            command completes with no output, then the backup was run
            successfully. 
         </p><p>
            Just to be sure that everything worked properly, check the logfile
            (<code class="filename">/var/log/cback.log</code>) on the master and each of
            the clients, and also mount the CD/DVD disc on the master to
            be sure it can be read.
         </p><p>
            You may also want to run <span class="command"><strong>cback purge</strong></span> on the
            master and each client once you have finished validating that
            everything worked.
         </p><p>
            <span class="emphasis"><em>If Cedar Backup ever completes <span class="quote">&#8220;<span class="quote">normally</span>&#8221;</span>
            but the disc that is created is not usable, please report this as a
            bug.
            <a href="#ftn.cedar-config-foot-bugs" class="footnoteref"><sup class="footnoteref">[22]</sup></a>
            To be safe, always enable the consistency check option in the
            store configuration section.</em></span>
         </p></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="idp60407776"></a>Step 10: Modify the backup cron jobs.</h3></div></div></div><p>
            Since Cedar Backup should be run as root, you should add a set of
            lines like this to your <code class="filename">/etc/crontab</code> file:
         </p><pre class="programlisting">
30 00 * * * root  cback collect
30 02 * * * root  cback stage
30 04 * * * root  cback store
30 06 * * * root  cback purge
         </pre><p>
            You should consider adding the <code class="option">--output</code> or
            <code class="option">-O</code> switch to your <span class="command"><strong>cback</strong></span>
            command-line in cron.  This will result in larger logs, but could
            help diagnose problems when commands like
            <span class="command"><strong>cdrecord</strong></span> or <span class="command"><strong>mkisofs</strong></span> fail
            mysteriously.
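</p><p>
            For example, to capture the extra output only for the store
            action, which is where the disc-writing commands run, the
            corresponding line could be changed as sketched here:
         </p><pre class="programlisting">
30 04 * * * root  cback --output store
         </pre><p>
            You can add the switch to the other lines in the same way if
            you want the additional detail for every action.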
         </p><p>
            You will need to coordinate the collect and purge actions on
            clients so that their collect actions complete before the master
            attempts to stage, and so that their purge actions do not begin
            until after the master has completed staging.  Usually, allowing
            an hour or two between steps should be sufficient.
            <a href="#ftn.cedar-config-foot-coordinate" class="footnoteref"><sup class="footnoteref">[23]</sup></a>
         </p><div class="note" style="margin-left: 0.5in; margin-right: 0.5in;"><h3 class="title">Note</h3><p>
               For general information about using cron, see the manpage for
               crontab(5).
            </p><p>
               On a Debian system, execution of daily backups is controlled by
               the file <code class="filename">/etc/cron.d/cedar-backup2</code>.  As
               installed, this file contains several different settings, all
               commented out.  Uncomment the <span class="quote">&#8220;<span class="quote">Master machine</span>&#8221;</span>
               entries in the file, and change the lines so that the backup
               goes off when you want it to.
            </p></div></div></div><div class="sect1"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="cedar-config-writer"></a>Configuring your Writer Device</h2></div></div></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="idp60419328"></a>Device Types</h3></div></div></div><p>
            In order to execute the store action, you need to know how to
            identify your writer device.  Cedar Backup supports two kinds of
            device types: CD writers and DVD writers.  DVD writers are always
            referenced through a filesystem device name (e.g.
            <code class="filename">/dev/dvd</code>).  CD writers can be referenced
            either through a SCSI id, or through a filesystem device name.
            Which you use depends on your operating system and hardware.
         </p></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="idp60421696"></a>Devices identified by device name</h3></div></div></div><p>
            For all DVD writers, and for CD writers on certain platforms, you
            will configure your writer device using only a device name.  If
            your writer device works this way, you should just specify
            &lt;target_device&gt; in configuration.  You can either leave
            &lt;target_scsi_id&gt; blank or remove it completely.  The writer
            device will be used both to write to the device and for filesystem
            operations &#8212; for instance, when the media needs to be mounted
            to run the consistency check.
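</p><p>
            A sketch of what this might look like in the configuration file
            is shown below.  The enclosing <code class="literal">store</code>
            element and the device path are assumptions for illustration, and
            the other settings that the store section requires are omitted:
         </p><pre class="programlisting">
&lt;store&gt;
   &lt;!-- other store settings omitted --&gt;
   &lt;target_device&gt;/dev/dvd&lt;/target_device&gt;
&lt;/store&gt;
         </pre><p>
            Note that there is no &lt;target_scsi_id&gt; entry at all in
            this form.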
         </p></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="idp60422768"></a>Devices identified by SCSI id</h3></div></div></div><p>
            Cedar Backup can use devices identified by SCSI id only when
            configured to use the <code class="literal">cdwriter</code> device type.
         </p><p>
            In order to use a SCSI device with Cedar Backup, you must know both the
            SCSI id &lt;target_scsi_id&gt; and the device name
            &lt;target_device&gt;.  The SCSI id will be used to write to media
            using <span class="command"><strong>cdrecord</strong></span>; and the device name will be used
            for other filesystem operations.
         </p><p>
            A true SCSI device will always have an address
            <code class="literal">scsibus,target,lun</code> (e.g.
            <code class="literal">1,6,2</code>).  This should hold true on most UNIX-like
            systems including Linux and the various BSDs (although I do not
            have a BSD system to test with currently).  The SCSI address
            represents the location of your writer device on the one or more
            SCSI buses that you have available on your system.  
         </p><p>
            On some platforms, it is possible to reference non-SCSI writer
            devices (e.g. an IDE CD writer) using an emulated SCSI id.  If you
            have configured your non-SCSI writer device to have an emulated
            SCSI id, provide the filesystem device path in
            &lt;target_device&gt; and the SCSI id in &lt;target_scsi_id&gt;,
            just like for a real SCSI device.
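</p><p>
            As a sketch, a CD writer configured this way might have entries
            like the following.  The enclosing <code class="literal">store</code>
            element and the <code class="literal">device_type</code> element name
            are assumptions, and the device path and SCSI id are example
            values only:
         </p><pre class="programlisting">
&lt;store&gt;
   &lt;!-- other store settings omitted --&gt;
   &lt;device_type&gt;cdwriter&lt;/device_type&gt;
   &lt;target_device&gt;/dev/cdrw&lt;/target_device&gt;
   &lt;target_scsi_id&gt;0,0,0&lt;/target_scsi_id&gt;
&lt;/store&gt;
         </pre><p>
            Both values must refer to the same physical drive.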
         </p><p>
            You should note that in some cases, an emulated SCSI id takes the
            same form as a normal SCSI id, while in other cases you might see a
            method name prepended to the normal SCSI id (e.g.
            <span class="quote">&#8220;<span class="quote">ATA:1,1,1</span>&#8221;</span>).
         </p></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="idp60431648"></a>Linux Notes</h3></div></div></div><p>
            On a Linux system, IDE writer devices often have an emulated SCSI
            address, which allows SCSI-based software to access the device through
            an IDE-to-SCSI interface.  Under these circumstances, the first IDE
            writer device typically has an address <code class="literal">0,0,0</code>.  However,
            support for the IDE-to-SCSI interface has been deprecated and is not
            well-supported in newer kernels (kernel 2.6.x and later).
         </p><p>
            Newer Linux kernels can address <em class="firstterm">ATA</em> or
            <em class="firstterm">ATAPI</em> drives without SCSI emulation by
            prepending a <span class="quote">&#8220;<span class="quote">method</span>&#8221;</span> indicator to the emulated
            device address.  For instance, <code class="literal">ATA:0,0,0</code> or
            <code class="literal">ATAPI:0,0,0</code> are typical values.
         </p><p>
            However, even this interface is deprecated as of late 2006, so with
            relatively new kernels you may be better off using the filesystem
            device path directly rather than relying on any SCSI emulation.
         </p></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="idp60437936"></a>Finding your Linux CD Writer</h3></div></div></div><p>
            Here are some hints about how to find your Linux CD writer
            hardware.  First, try to reference your device using the filesystem
            device path:
         </p><pre class="screen">
cdrecord -prcap dev=/dev/cdrom
         </pre><p>
            Running this command on my hardware gives output that looks like
            this (just the top few lines):
         </p><pre class="screen">
Device type    : Removable CD-ROM
Version        : 0
Response Format: 2
Capabilities   : 
Vendor_info    : 'LITE-ON '
Identification : 'DVDRW SOHW-1673S'
Revision       : 'JS02'
Device seems to be: Generic mmc2 DVD-R/DVD-RW.

Drive capabilities, per MMC-3 page 2A:
         </pre><p>
            If this works, and the identifying information at the top of the
            output looks like your CD writer device, you've probably found a
            working configuration.  Place the device path into
            &lt;target_device&gt; and leave &lt;target_scsi_id&gt; blank.
         </p><p>
            If this doesn't work, you should try to find an ATA or ATAPI
            device:
         </p><pre class="screen">
cdrecord -scanbus dev=ATA
cdrecord -scanbus dev=ATAPI
         </pre><p>
            On my development system, I get a result that looks something like
            this for ATA:
         </p><pre class="screen">
scsibus1:
        1,0,0   100) 'LITE-ON ' 'DVDRW SOHW-1673S' 'JS02' Removable CD-ROM
        1,1,0   101) *
        1,2,0   102) *
        1,3,0   103) *
        1,4,0   104) *
        1,5,0   105) *
        1,6,0   106) *
        1,7,0   107) *
         </pre><p>
            Again, if you get a result that you recognize, you have
            probably found a working configuration.  Place the associated device
            path (in my case, <code class="literal">/dev/cdrom</code>) into
            &lt;target_device&gt; and put the emulated SCSI id
            (in this case, <code class="literal">ATA:1,0,0</code>) into &lt;target_scsi_id&gt;.
         </p><p>
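            As a rough sketch, the relevant part of the store configuration
            for the ATA example above might look like the fragment below.
            The device path and SCSI id are simply the values from the
            example output and will differ on other hardware; the remaining
            store configuration elements are omitted here.
         </p><pre class="programlisting">
&lt;target_device&gt;/dev/cdrom&lt;/target_device&gt;
&lt;target_scsi_id&gt;ATA:1,0,0&lt;/target_scsi_id&gt;
         </pre><p>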
            Any further discussion of how to configure your CD writer hardware
            is outside the scope of this document.  If you have tried the hints
            above and still can't get things working, you may want to consult
            the <em class="citetitle">Linux CDROM HOWTO</em> 
            (<a class="ulink" href="http://www.tldp.org/HOWTO/CDROM-HOWTO" target="_top">http://www.tldp.org/HOWTO/CDROM-HOWTO</a>) 
            or the <em class="citetitle">ATA RAID HOWTO</em> 
            (<a class="ulink" href="http://www.tldp.org/HOWTO/ATA-RAID-HOWTO/index.html" target="_top">http://www.tldp.org/HOWTO/ATA-RAID-HOWTO/index.html</a>) 
            for more information.
         </p></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="idp60448992"></a>Mac OS X Notes</h3></div></div></div><p>
            On a Mac OS X (darwin) system, things get strange.  Apple has
            abandoned traditional SCSI device identifiers in favor of a
            system-wide resource id.  So, on a Mac, your writer device will
            have a name something like <code class="literal">IOCompactDiscServices</code>
            (for a CD writer) or <code class="literal">IODVDServices</code> (for a DVD
            writer).  If you have multiple drives, the second drive probably
            has a number appended, e.g. <code class="literal">IODVDServices/2</code> for
            the second DVD writer.  You can try to figure out what the name of
            your device is by grepping through the output of the command
            <span class="command"><strong>ioreg -l</strong></span>.<a href="#ftn.idp60453328" class="footnote" name="idp60453328"><sup class="footnote">[24]</sup></a>
         </p><p>
            Unfortunately, even if you can figure out what device to use, I
            can't really support the store action on this platform.  In OS X,
            the <span class="quote">&#8220;<span class="quote">automount</span>&#8221;</span> function of the Finder interferes
            significantly with Cedar Backup's ability to mount and unmount
            media and write to the CD or DVD hardware.  The Cedar Backup writer
            and image functionality does work on this platform, but the effort
            required to fight the operating system about who owns the media and
            the device makes it nearly impossible to execute the store action
            successfully.
         </p><p>
            If you are interested in some of my notes about what works and what
            doesn't on this platform, check out the documentation in the
            <code class="filename">doc/osx</code> directory in the source distribution.
         </p></div></div><div class="sect1"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="cedar-config-blanking"></a>Optimized Blanking Strategy</h2></div></div></div><p>
         When the optimized blanking strategy has not been configured, Cedar
         Backup uses a simplistic approach: rewritable media is blanked at the
         beginning of every week, period.  
      </p><p>
         Since rewritable media can be blanked only a finite number of times
         before becoming unusable, some users &#8212; especially users of
         rewritable DVD media with its large capacity &#8212; may prefer to
         blank the media less often.
      </p><p>
         If the optimized blanking strategy is configured, Cedar Backup will
         use a blanking factor and attempt to determine whether future backups
         will fit on the current media.  If it looks like backups will fit,
         then the media will not be blanked.
      </p><p>
         This feature will only be useful (assuming a single disc is used for the
         whole week's backups) if the estimated total size of the weekly backup
         is considerably smaller than the capacity of the media (no more than
         50% of the total media capacity), and only if the size of the backup
         can be expected to remain fairly constant over time (no frequent rapid
         growth expected).
      </p><p>
         There are two blanking modes: daily and weekly.  If the weekly
         blanking mode is set, Cedar Backup will only estimate future capacity
         (and potentially blank the disc) once per week, on the starting day of
         the week.  If the daily blanking mode is set, Cedar Backup will
         estimate future capacity (and potentially blank the disc) every time
         it is run.  <span class="emphasis"><em>You should only use the daily blanking mode in
         conjunction with daily collect configuration; otherwise, you risk
         losing data.</em></span>
      </p><p>
         If you are using the daily blanking mode, you can typically set the
         blanking factor to 1.0.  This will cause Cedar Backup to blank the
         media whenever there is not enough space to store the current day's
         backup.
      </p><p>
         If you are using the weekly blanking mode, then finding the correct
         blanking factor will require some experimentation.  Cedar Backup
         estimates future capacity based on the configured blanking factor.
         The disc will be blanked if the following relationship is true:
      </p><pre class="screen">
bytes available / (1 + bytes required) &#8804; blanking factor
      </pre><p>
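         As a worked illustration with made-up numbers (assumptions, not
         measurements), suppose 4,000,000,000 bytes are available on the
         disc and 700,000,000 bytes are required:
      </p><pre class="screen">
4000000000 / (1 + 700000000) = 5.71 (approximately)

blanking factor 5.0:  5.71 &#8804; 5.0 is false, so the disc is not blanked
blanking factor 6.0:  5.71 &#8804; 6.0 is true, so the disc is blanked
      </pre><p>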
         Another way to look at this is to consider the blanking factor as a
         sort of (upper) backup growth estimate:
      </p><pre class="screen">
Total size of weekly backup / Full backup size at the start of the week
      </pre><p>
         This ratio can be estimated using a week or two of previous backups.
         For instance, take this example, where March 10 is the start of the
         week and March 4 through March 9 represent the incremental backups
         from the previous week:
      </p><pre class="screen">
/opt/backup/staging# du -s 2007/03/*
3040    2007/03/01
3044    2007/03/02
6812    2007/03/03
3044    2007/03/04
3152    2007/03/05
3056    2007/03/06
3060    2007/03/07
3056    2007/03/08
4776    2007/03/09
6812    2007/03/10
11824   2007/03/11
      </pre><p>
         In this case, the ratio is approximately 4:
      </p><pre class="screen">
(6812 + 3044 + 3152 + 3056 + 3060 + 3056 + 4776) / 6812 = 3.9571
      </pre><p>
         To be safe, you might choose to configure a factor of 5.0.  
      </p><p>
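         A configuration along these lines might then be used.  This is
         only a sketch: it assumes that the optimized blanking strategy is
         set through a <code class="literal">blanking_behavior</code> element with
         <code class="literal">mode</code> and <code class="literal">factor</code> children
         in the store configuration section, so check the description of that
         section for the authoritative element names.
      </p><pre class="programlisting">
&lt;blanking_behavior&gt;
   &lt;mode&gt;weekly&lt;/mode&gt;
   &lt;factor&gt;5.0&lt;/factor&gt;
&lt;/blanking_behavior&gt;
      </pre><p>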
         Setting a higher value reduces the risk of exceeding media capacity
         mid-week but might result in blanking the media more often than is necessary.  
      </p><p>
         If you run out of space mid-week, then the solution is to run the
         rebuild action.  If this happens frequently, a higher blanking factor
         value should be used.
      </p></div><div class="footnotes"><br><hr style="width:100; text-align:left;margin-left: 0"><div id="ftn.idp59626560" class="footnote"><p><a href="#idp59626560" class="para"><sup class="para">[19] </sup></a>See 
         <a class="ulink" href="http://www.xml.com/pub/a/98/10/guide0.html" target="_top">http://www.xml.com/pub/a/98/10/guide0.html</a>
         for a basic introduction to XML.</p></div><div id="ftn.idp59631744" class="footnote"><p><a href="#idp59631744" class="para"><sup class="para">[20] </sup></a>See <a class="xref" href="#cedar-basic-process" title="The Backup Process">the section called &#8220;The Backup Process&#8221;</a>, in <a class="xref" href="#cedar-basic" title="Chapter 2. Basic Concepts">Chapter 2, <i>Basic Concepts</i></a>.</p></div><div id="ftn.cedar-config-foot-regex" class="footnote"><p><a href="#cedar-config-foot-regex" class="para"><sup class="para">[21] </sup></a>See <a class="ulink" href="http://docs.python.org/lib/re-syntax.html" target="_top">http://docs.python.org/lib/re-syntax.html</a></p></div><div id="ftn.cedar-config-foot-bugs" class="footnote"><p><a href="#cedar-config-foot-bugs" class="para"><sup class="para">[22] </sup></a>
            See <a class="ulink" href="https://bitbucket.org/cedarsolutions/cedar-backup2/issues" target="_top">https://bitbucket.org/cedarsolutions/cedar-backup2/issues</a>.</p></div><div id="ftn.cedar-config-foot-coordinate" class="footnote"><p><a href="#cedar-config-foot-coordinate" class="para"><sup class="para">[23] </sup></a>See <a class="xref" href="#cedar-basic-coordinate" title="Coordination between Master and Clients">the section called &#8220;Coordination between Master and Clients&#8221;</a> in <a class="xref" href="#cedar-basic" title="Chapter 2. Basic Concepts">Chapter 2, <i>Basic Concepts</i></a>.</p></div><div id="ftn.idp60453328" class="footnote"><p><a href="#idp60453328" class="para"><sup class="para">[24] </sup></a>Thanks to the
            file README.macosX in the cdrtools-2.01+01a01 source tree
            for this information</p></div></div></div><div class="chapter"><div class="titlepage"><div><div><h1 class="title"><a name="cedar-extensions"></a>Chapter 6. Official Extensions</h1></div></div></div><div class="toc"><p><b>Table of Contents</b></p><dl class="toc"><dt><span class="sect1"><a href="#cedar-extensions-sysinfo">System Information Extension</a></span></dt><dt><span class="sect1"><a href="#cedar-extensions-amazons3">Amazon S3 Extension</a></span></dt><dt><span class="sect1"><a href="#cedar-extensions-subversion">Subversion Extension</a></span></dt><dt><span class="sect1"><a href="#cedar-extensions-mysql">MySQL Extension</a></span></dt><dt><span class="sect1"><a href="#cedar-extensions-postgresql">PostgreSQL Extension</a></span></dt><dt><span class="sect1"><a href="#cedar-extensions-mbox">Mbox Extension</a></span></dt><dt><span class="sect1"><a href="#cedar-extensions-encrypt">Encrypt Extension</a></span></dt><dt><span class="sect1"><a href="#cedar-extensions-split">Split Extension</a></span></dt><dt><span class="sect1"><a href="#cedar-extensions-capacity">Capacity Extension</a></span></dt></dl></div><div class="sect1"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="cedar-extensions-sysinfo"></a>System Information Extension</h2></div></div></div><p>
         The System Information Extension is a simple Cedar Backup extension
         used to save off important system recovery information that might be
         useful when reconstructing a <span class="quote">&#8220;<span class="quote">broken</span>&#8221;</span> system.  It is
         intended to be run either immediately before or immediately after the
         standard collect action.
      </p><p>
         This extension saves off the following information to the configured
         Cedar Backup collect directory.  Saved off data is always compressed
         using <span class="command"><strong>bzip2</strong></span>.
      </p><div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"><p>Currently-installed Debian packages via <span class="command"><strong>dpkg --get-selections</strong></span></p></li><li class="listitem"><p>Disk partition information via <span class="command"><strong>fdisk -l</strong></span></p></li><li class="listitem"><p>System-wide mounted filesystem contents, via <span class="command"><strong>ls -laR</strong></span></p></li></ul></div><p>
         The Debian-specific information is only collected on systems where
         <code class="filename">/usr/bin/dpkg</code> exists.
      </p><p>
         To enable this extension, add the following section to the Cedar Backup
         configuration file:
      </p><pre class="programlisting">
&lt;extensions&gt;
   &lt;action&gt;
      &lt;name&gt;sysinfo&lt;/name&gt;
      &lt;module&gt;CedarBackup2.extend.sysinfo&lt;/module&gt;
      &lt;function&gt;executeAction&lt;/function&gt;
      &lt;index&gt;99&lt;/index&gt;
   &lt;/action&gt;
&lt;/extensions&gt;
      </pre><p>
         This extension relies on the options and collect configuration
         sections in the standard Cedar Backup configuration file, but
         requires no new configuration of its own.  
      </p></div><div class="sect1"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="cedar-extensions-amazons3"></a>Amazon S3 Extension</h2></div></div></div><p>
         The Amazon S3 extension writes data to Amazon S3 cloud storage rather
         than to physical media.  It is intended to replace the store action,
         but you can also use it alongside the store action if you'd prefer to
         back up your data in more than one place.  This extension must be run
         after the stage action.
      </p><p>
         The underlying functionality relies on the 
         <a class="ulink" href="http://aws.amazon.com/documentation/cli/" target="_top">AWS CLI</a> toolset.
         Before you use this extension, you need to set up your Amazon S3
         account and configure AWS CLI as detailed in Amazon's 
         <a class="ulink" href="http://docs.aws.amazon.com/cli/latest/userguide/cli-chap-getting-set-up.html" target="_top">setup guide</a>.
         The extension assumes that the backup is being executed as root, and
         switches over to the configured backup user to run the
         <span class="command"><strong>aws</strong></span> program.  So, make sure you configure the AWS
         CLI tools as the backup user and not root.  (This is different than
         the amazons3 sync tool extension, which executes AWS CLI commands as
         the same user that is running the tool.)
      </p><p>
         When using physical media via the standard store action, there is an
         implicit limit to the size of a backup, since a backup must fit on a
         single disc.  Since there is no physical media, no such limit exists
         for Amazon S3 backups.  This leaves open the possibility that Cedar
         Backup might construct an unexpectedly-large backup that the
         administrator is not aware of.  Over time, this might become
         expensive, either in terms of network bandwidth or in terms of Amazon
         S3 storage and I/O charges.  To mitigate this risk, set a reasonable
         maximum size using the configuration elements shown below.  If the
         backup fails, you have a chance to review what made the backup larger
         than you expected, and you can either correct the problem (e.g. remove
         a large temporary directory that got inadvertently included in the
         backup) or change configuration to take into account the new "normal"
         maximum size.
      </p><p>
         You can optionally configure Cedar Backup to encrypt data before
         sending it to S3.  To do that, provide a complete command line using
         the <code class="literal">${input}</code> and <code class="literal">${output}</code>
         variables to represent the original input file and the encrypted
         output file.  This command will be executed as the backup user.
      </p><p>
         For instance, you can use something like this with GPG:
      </p><pre class="programlisting">
/usr/bin/gpg -c --no-use-agent --batch --yes --passphrase-file /home/backup/.passphrase -o ${output} ${input}
      </pre><p>
         The GPG mechanism depends on a strong passphrase for security.  One way to
         generate a strong passphrase is to use your system random number generator, for example:
      </p><pre class="programlisting">
dd if=/dev/urandom count=20 bs=1 | xxd -ps
      </pre><p>
         (See <a class="ulink" href="http://security.stackexchange.com/questions/14867/gpg-encryption-security" target="_top">StackExchange</a>
         for more details about that advice.) If you decide to use encryption, make sure you
         save off the passphrase in a safe place, so you can get at your backup data
         later if you need to.  And obviously, make sure to set permissions on the
         passphrase file so it can only be read by the backup user.
      </p><p>
         To enable this extension, add the following section to the Cedar Backup
         configuration file:
      </p><pre class="programlisting">
&lt;extensions&gt;
   &lt;action&gt;
      &lt;name&gt;amazons3&lt;/name&gt;
      &lt;module&gt;CedarBackup2.extend.amazons3&lt;/module&gt;
      &lt;function&gt;executeAction&lt;/function&gt;
      &lt;index&gt;201&lt;/index&gt; &lt;!-- just after stage --&gt;
   &lt;/action&gt;
&lt;/extensions&gt;
      </pre><p>
         This extension relies on the options and staging configuration sections
         in the standard Cedar Backup configuration file, and then also
         requires its own <code class="literal">amazons3</code> configuration section.
         This is an example configuration section with encryption disabled:
      </p><pre class="programlisting">
&lt;amazons3&gt;
   &lt;s3_bucket&gt;example.com-backup/staging&lt;/s3_bucket&gt;
&lt;/amazons3&gt;
      </pre><p>
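         By way of comparison, a configuration section that enables
         encryption and size limits might look something like the sketch
         below.  The bucket name, the size limits and the passphrase file
         location are illustrative assumptions; the encrypt command is the
         GPG example from above.
      </p><pre class="programlisting">
&lt;amazons3&gt;
   &lt;warn_midnite&gt;Y&lt;/warn_midnite&gt;
   &lt;s3_bucket&gt;example.com-backup/staging&lt;/s3_bucket&gt;
   &lt;encrypt&gt;/usr/bin/gpg -c --no-use-agent --batch --yes --passphrase-file /home/backup/.passphrase -o ${output} ${input}&lt;/encrypt&gt;
   &lt;full_size_limit&gt;1.1 GB&lt;/full_size_limit&gt;
   &lt;incr_size_limit&gt;250 MB&lt;/incr_size_limit&gt;
&lt;/amazons3&gt;
      </pre><p>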
         The following elements are part of the Amazon S3 configuration section:
      </p><div class="variablelist"><dl class="variablelist"><dt><span class="term"><code class="literal">warn_midnite</code></span></dt><dd><p>Whether to generate warnings for crossing midnight.</p><p>
                   This field indicates whether warnings should be generated
                   if the Amazon S3 operation has to cross a midnight boundary in
                   order to find data to write to the cloud.  For instance, a
                   warning would be generated if valid data was only
                   found in the day before or day after the current day.
                </p><p>
                   Configuration for some users is such that the amazons3
                   operation will always cross a midnight boundary, so they
                   will not care about this warning.  Other users will expect
                   to never cross a boundary, and want to be notified that
                   something <span class="quote">&#8220;<span class="quote">strange</span>&#8221;</span> might have happened.
               </p><p>
                  This field is optional.  If it doesn't exist, then
                  <code class="literal">N</code> will be assumed.
               </p><p>
                  <span class="emphasis"><em>Restrictions:</em></span> Must be a boolean (<code class="literal">Y</code> or <code class="literal">N</code>).
               </p></dd><dt><span class="term"><code class="literal">s3_bucket</code></span></dt><dd><p>The name of the Amazon S3 bucket that data will be written to.</p><p>
                  This field configures the S3 bucket that your data will be
                  written to.  In S3, buckets are named globally.  For
                  uniqueness, you would typically use the name of your domain
                  followed by some suffix, such as <code class="literal">example.com-backup</code>.  
                  If you want, you can specify a subdirectory within the bucket,
                  such as <code class="literal">example.com-backup/staging</code>.
               </p><p>
                  <span class="emphasis"><em>Restrictions:</em></span> Must be non-empty.
                </p></dd><dt><span class="term"><code class="literal">encrypt</code></span></dt><dd><p>Command used to encrypt backup data before upload to S3.</p><p>
                  If this field is provided, then data will be encrypted before
                  it is uploaded to Amazon S3.  You must provide the entire
                  command used to encrypt a file, including the
                  <code class="literal">${input}</code> and <code class="literal">${output}</code>
                  variables.  An example GPG command is shown above, but you
                  can use any mechanism you choose.  The command will be run as
                  the configured backup user.
               </p><p>
                  <span class="emphasis"><em>Restrictions:</em></span> If provided, must be non-empty.
                </p></dd><dt><span class="term"><code class="literal">full_size_limit</code></span></dt><dd><p>Maximum size of a full backup.</p><p>
                  If this field is provided, then a size limit will be applied
                  to full backups.  If the total size of the selected staging
                  directory is greater than the limit, then the backup will
                  fail.  
               </p><p>
                  You can enter this value in two different forms.  It can
                  either be a simple number, in which case the value is assumed
                  to be in bytes; or it can be a number followed by a unit
                  (KB, MB, GB).  
               </p><p>
                  Valid examples are <span class="quote">&#8220;<span class="quote">10240</span>&#8221;</span>, <span class="quote">&#8220;<span class="quote">250
                  MB</span>&#8221;</span> or <span class="quote">&#8220;<span class="quote">1.1 GB</span>&#8221;</span>.
               </p><p>
                  <span class="emphasis"><em>Restrictions:</em></span> Must be a value as described above, greater than zero.
                </p></dd><dt><span class="term"><code class="literal">incr_size_limit</code></span></dt><dd><p>Maximum size of an incremental backup.</p><p>
                  If this field is provided, then a size limit will be applied
                  to incremental backups.  If the total size of the selected
                  staging directory is greater than the limit, then the backup
                  will fail.  
               </p><p>
                  You can enter this value in two different forms.  It can
                  either be a simple number, in which case the value is assumed
                  to be in bytes; or it can be a number followed by a unit
                  (KB, MB, GB).  
               </p><p>
                  Valid examples are <span class="quote">&#8220;<span class="quote">10240</span>&#8221;</span>, <span class="quote">&#8220;<span class="quote">250
                  MB</span>&#8221;</span> or <span class="quote">&#8220;<span class="quote">1.1 GB</span>&#8221;</span>.
               </p><p>
                  <span class="emphasis"><em>Restrictions:</em></span> Must be a value as described above, greater than zero.
               </p></dd></dl></div></div><div class="sect1"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="cedar-extensions-subversion"></a>Subversion Extension</h2></div></div></div><p>
         The Subversion Extension is a Cedar Backup extension
         used to back up Subversion 
         <a href="#ftn.idp61469168" class="footnote" name="idp61469168"><sup class="footnote">[25]</sup></a>
         version control repositories via the Cedar Backup command line.
         It is intended to be run either immediately before or immediately
         after the standard collect action.
      </p><p>
         Each configured Subversion repository can be backed up using the same
         collect modes allowed for filesystems in the standard Cedar Backup
         collect action (weekly, daily, incremental) and the output can be
         compressed using either <span class="command"><strong>gzip</strong></span> or
         <span class="command"><strong>bzip2</strong></span>.
      </p><p>
         There are two different kinds of Subversion repositories at this
         writing: BDB (Berkeley Database) and FSFS (a "filesystem within a
         filesystem").  This extension backs up both kinds of repositories in
         the same way, using <span class="command"><strong>svnadmin dump</strong></span> in an incremental
         mode.  
      </p><p>
         It turns out that FSFS repositories can also be backed up just like
         any other filesystem directory.  If you would rather do the backup
         that way, then use the normal collect action rather than this
         extension.  If you decide to do that, be sure to consult the
         Subversion documentation and make sure you understand the limitations
         of this kind of backup. 
      </p><p>
         To enable this extension, add the following section to the Cedar Backup
         configuration file:
      </p><pre class="programlisting">
&lt;extensions&gt;
   &lt;action&gt;
      &lt;name&gt;subversion&lt;/name&gt;
      &lt;module&gt;CedarBackup2.extend.subversion&lt;/module&gt;
      &lt;function&gt;executeAction&lt;/function&gt;
      &lt;index&gt;99&lt;/index&gt;
   &lt;/action&gt;
&lt;/extensions&gt;
      </pre><p>
         This extension relies on the options and collect configuration
         sections in the standard Cedar Backup configuration file, and then
         also requires its own <code class="literal">subversion</code> configuration
         section.  This is an example Subversion configuration section:
      </p><pre class="programlisting">
&lt;subversion&gt;
   &lt;collect_mode&gt;incr&lt;/collect_mode&gt;
   &lt;compress_mode&gt;bzip2&lt;/compress_mode&gt;
   &lt;repository&gt;
      &lt;abs_path&gt;/opt/public/svn/docs&lt;/abs_path&gt;
   &lt;/repository&gt;
   &lt;repository&gt;
      &lt;abs_path&gt;/opt/public/svn/web&lt;/abs_path&gt;
      &lt;compress_mode&gt;gzip&lt;/compress_mode&gt;
   &lt;/repository&gt;
   &lt;repository_dir&gt;
      &lt;abs_path&gt;/opt/private/svn&lt;/abs_path&gt;
      &lt;collect_mode&gt;daily&lt;/collect_mode&gt;
   &lt;/repository_dir&gt;
&lt;/subversion&gt;
      </pre><p>
         The following elements are part of the Subversion configuration section:
      </p><div class="variablelist"><dl class="variablelist"><dt><span class="term"><code class="literal">collect_mode</code></span></dt><dd><p>Default collect mode.</p><p>
                  The collect mode describes how frequently a Subversion
                  repository is backed up.  The Subversion extension
                  recognizes the same collect modes as the standard Cedar
                  Backup collect action (see <a class="xref" href="#cedar-basic" title="Chapter 2. Basic Concepts">Chapter 2, <i>Basic Concepts</i></a>).
               </p><p>
                  This value is the collect mode that will be used by
                  default during the backup process.  Individual
                  repositories (below) may override this value.  If
                  <span class="emphasis"><em>all</em></span> individual repositories provide
                  their own value, then this default value may be omitted
                  from configuration.
               </p><p>
                   Note: if your backup device does not support multisession
                  discs, then you should probably use the
                  <code class="literal">daily</code> collect mode to avoid losing
                  data.
               </p><p>
                  <span class="emphasis"><em>Restrictions:</em></span> Must be one of
                  <code class="literal">daily</code>, <code class="literal">weekly</code> or
                  <code class="literal">incr</code>.
               </p></dd><dt><span class="term"><code class="literal">compress_mode</code></span></dt><dd><p>Default compress mode.</p><p>
                   Subversion repository backups are just
                  specially-formatted text files, and often compress quite
                  well using <span class="command"><strong>gzip</strong></span> or
                  <span class="command"><strong>bzip2</strong></span>.  The compress mode describes how
                  the backed-up data will be compressed, if at all.  
               </p><p>
                  This value is the compress mode that will be used by
                  default during the backup process.  Individual
                  repositories (below) may override this value.  If
                  <span class="emphasis"><em>all</em></span> individual repositories provide
                  their own value, then this default value may be omitted
                  from configuration.
               </p><p>
                  <span class="emphasis"><em>Restrictions:</em></span> Must be one of
                  <code class="literal">none</code>, <code class="literal">gzip</code> or
                  <code class="literal">bzip2</code>.
                </p></dd><dt><span class="term"><code class="literal">repository</code></span></dt><dd><p>A Subversion repository to be collected.</p><p>
                  This is a subsection which contains information about
                  a specific Subversion repository to be backed up.
               </p><p>
                  This section can be repeated as many times as is necessary.
                  At least one repository or repository directory must be
                  configured.
               </p><p>
                  The <code class="literal">repository</code> subsection contains the
                  following fields:
               </p><div class="variablelist"><dl class="variablelist"><dt><span class="term"><code class="literal">collect_mode</code></span></dt><dd><p>Collect mode for this repository.</p><p>
                           This field is optional.  If it doesn't exist, the backup
                           will use the default collect mode.
                        </p><p>
                           <span class="emphasis"><em>Restrictions:</em></span> Must be one of
                           <code class="literal">daily</code>, <code class="literal">weekly</code> or
                           <code class="literal">incr</code>.
                        </p></dd><dt><span class="term"><code class="literal">compress_mode</code></span></dt><dd><p>Compress mode for this repository.</p><p>
                           This field is optional.  If it doesn't exist, the backup
                           will use the default compress mode.
                        </p><p>
                           <span class="emphasis"><em>Restrictions:</em></span> Must be one of
                           <code class="literal">none</code>, <code class="literal">gzip</code> or
                           <code class="literal">bzip2</code>.
                        </p></dd><dt><span class="term"><code class="literal">abs_path</code></span></dt><dd><p>
                           Absolute path of the Subversion repository to
                           back up.
                        </p><p>
                           <span class="emphasis"><em>Restrictions:</em></span> Must be an absolute path.
                        </p></dd></dl></div></dd><dt><span class="term"><code class="literal">repository_dir</code></span></dt><dd><p>A Subversion parent repository directory to be collected.</p><p>
                  This is a subsection which contains information about a
                  Subversion parent repository directory to be backed up.  Any
                  subdirectory immediately within this directory is assumed to
                  be a Subversion repository, and will be backed up.
               </p><p>
                  This section can be repeated as many times as is necessary.
                  At least one repository or repository directory must be
                  configured.
               </p><p>
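                  For illustration, a parent directory at
                  <code class="filename">/opt/svn</code> that skips the
                  <code class="filename">software</code> repository and anything
                  matching a pattern might look like the sketch below.  The
                  paths and the pattern value are placeholders, not recommended
                  settings.
               </p><pre class="programlisting">
&lt;repository_dir&gt;
   &lt;abs_path&gt;/opt/svn&lt;/abs_path&gt;
   &lt;collect_mode&gt;incr&lt;/collect_mode&gt;
   &lt;exclude&gt;
      &lt;rel_path&gt;software&lt;/rel_path&gt;
      &lt;pattern&gt;.*scratch.*&lt;/pattern&gt;
   &lt;/exclude&gt;
&lt;/repository_dir&gt;
               </pre><p>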
                  The <code class="literal">repository_dir</code> subsection contains the
                  following fields:
               </p><div class="variablelist"><dl class="variablelist"><dt><span class="term"><code class="literal">collect_mode</code></span></dt><dd><p>Collect mode for this repository directory.</p><p>
                           This field is optional.  If it doesn't exist, the backup
                           will use the default collect mode.
                        </p><p>
                           <span class="emphasis"><em>Restrictions:</em></span> Must be one of
                           <code class="literal">daily</code>, <code class="literal">weekly</code> or
                           <code class="literal">incr</code>.
                        </p></dd><dt><span class="term"><code class="literal">compress_mode</code></span></dt><dd><p>Compress mode for this repository directory.</p><p>
                           This field is optional.  If it doesn't exist, the backup
                           will use the default compress mode.
                        </p><p>
                           <span class="emphasis"><em>Restrictions:</em></span> Must be one of
                           <code class="literal">none</code>, <code class="literal">gzip</code> or
                           <code class="literal">bzip2</code>.
                        </p></dd><dt><span class="term"><code class="literal">abs_path</code></span></dt><dd><p>
                           Absolute path of the Subversion parent repository
                           directory to back up.
                        </p><p>
                           <span class="emphasis"><em>Restrictions:</em></span> Must be an absolute path.
                        </p></dd><dt><span class="term"><code class="literal">exclude</code></span></dt><dd><p>List of paths or patterns to exclude from the backup.</p><p>
                           This is a subsection which contains a set of paths
                           and patterns to be excluded within this Subversion
                           parent directory.
                        </p><p>
                           This section is entirely optional, and if it exists
                           can also be empty.  
                        </p><p>
                           The exclude subsection can contain one or more of each of
                           the following fields:
                        </p><div class="variablelist"><dl class="variablelist"><dt><span class="term"><code class="literal">rel_path</code></span></dt><dd><p>
                                    A relative path to be excluded from the
                                    backup.
                                 </p><p>
                                    The path is assumed to be relative to the
                                    Subversion parent directory itself.  For instance, if
                                    the configured Subversion parent directory is
                                    <code class="filename">/opt/svn</code>, a
                                    configured relative path of
                                    <code class="filename">software</code> would exclude the
                                    path <code class="filename">/opt/svn/software</code>.
                                 </p><p>
                                    This field can be repeated as many times as
                                    is necessary.
                                 </p><p>
                                    <span class="emphasis"><em>Restrictions:</em></span> Must be non-empty.
                                 </p></dd><dt><span class="term"><code class="literal">pattern</code></span></dt><dd><p>
                                    A pattern to be excluded from the backup.
                                 </p><p>
                                    The pattern must be a Python regular
                                    expression.  <a href="#ftn.cedar-config-foot-regex" class="footnoteref"><sup class="footnoteref">[21]</sup></a> 
                                    It is assumed to be bounded at front and
                                    back by the beginning and end of the
                                    string (i.e. it is treated as if it
                                    begins with <code class="literal">^</code> and
                                    ends with <code class="literal">$</code>).
                                 </p><p>
                                    This field can be repeated as many times as
                                    is necessary.
                                 </p><p>
                                    <span class="emphasis"><em>Restrictions:</em></span> Must be non-empty.
                                 </p></dd></dl></div></dd></dl></div></dd></dl></div></div><div class="sect1"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="cedar-extensions-mysql"></a>MySQL Extension</h2></div></div></div><p>
         The MySQL Extension is a Cedar Backup extension used to back up MySQL
         <a href="#ftn.idp61553712" class="footnote" name="idp61553712"><sup class="footnote">[26]</sup></a> 
         databases via the Cedar Backup command line.  It is intended to be run
         either immediately before or immediately after the standard collect
         action.
      </p><div class="note" style="margin-left: 0.5in; margin-right: 0.5in;"><h3 class="title">Note</h3><p>
            This extension always produces a full backup.  There is currently
            no facility for making incremental backups.  If/when someone has a
            need for this and can describe how to do it, I will update this
            extension or provide another.
         </p></div><p>
         The backup is done via the <span class="command"><strong>mysqldump</strong></span> command
         included with the MySQL product.  Output can be compressed using
         <span class="command"><strong>gzip</strong></span> or <span class="command"><strong>bzip2</strong></span>.  Administrators
         can configure the extension either to back up all databases or to back
         up only specific databases.  
      </p><p>
         The extension assumes that all configured databases can be backed up
         by a single user.  Often, the <span class="quote">&#8220;<span class="quote">root</span>&#8221;</span> database user will
         be used.  An alternative is to create a separate MySQL
         <span class="quote">&#8220;<span class="quote">backup</span>&#8221;</span> user and grant that user rights to read (but not
         write) various databases as needed.  This second option is probably
         your best choice.
      </p><div class="warning" style="margin-left: 0.5in; margin-right: 0.5in;"><h3 class="title">Warning</h3><p>
            The extension accepts a username and password in configuration.
            However, you probably do not want to list those values in Cedar
            Backup configuration.  This is because Cedar Backup will provide these
            values to <span class="command"><strong>mysqldump</strong></span> via the command-line
            <code class="option">--user</code> and <code class="option">--password</code> switches,
            which will be visible to other users in the process listing.
         </p><p>
            Instead, you should configure the username and password in one of MySQL's
            configuration files.  Typically, that would be done by putting a stanza like
            this in <code class="filename">/root/.my.cnf</code>:
         </p><pre class="programlisting">
[mysqldump]
user     = root
password = &lt;secret&gt;
         </pre><p>
            Of course, if you are executing the backup as a user other than root, then
            you would create the file in that user's home directory instead.
         </p><p>
            As a side note, it is also possible to configure <code class="filename">.my.cnf</code>
            such that Cedar Backup can back up a remote database server:
         </p><pre class="programlisting">
[mysqldump]
host = remote.host
         </pre><p>
            For this to work, you will also need to grant privileges properly
            for the user which is executing the backup.  See your MySQL documentation 
            for more information about how this can be done.
         </p><p>
            Regardless of whether you are using <code class="filename">~/.my.cnf</code> or
            <code class="filename">/etc/cback.conf</code> to store database login and
            password information, you should be careful about who is allowed to
            view that information.  Typically, this means locking down permissions
            so that only the file owner can read the file contents (i.e. use mode
            <code class="literal">0600</code>).
         </p></div><p>
         To enable this extension, add the following section to the Cedar Backup
         configuration file:
      </p><pre class="programlisting">
&lt;extensions&gt;
   &lt;action&gt;
      &lt;name&gt;mysql&lt;/name&gt;
      &lt;module&gt;CedarBackup2.extend.mysql&lt;/module&gt;
      &lt;function&gt;executeAction&lt;/function&gt;
      &lt;index&gt;99&lt;/index&gt;
   &lt;/action&gt;
&lt;/extensions&gt;
      </pre><p>
         This extension relies on the options and collect configuration
         sections in the standard Cedar Backup configuration file, and then
         also requires its own <code class="literal">mysql</code> configuration section.
         This is an example MySQL configuration section:
      </p><pre class="programlisting">
&lt;mysql&gt;
   &lt;compress_mode&gt;bzip2&lt;/compress_mode&gt;
   &lt;all&gt;Y&lt;/all&gt;
&lt;/mysql&gt;
      </pre><p>
         If you have decided to configure login information in Cedar Backup
         rather than using MySQL configuration, then you would add the username
         and password fields to configuration:
      </p><pre class="programlisting">
&lt;mysql&gt;
   &lt;user&gt;root&lt;/user&gt;
   &lt;password&gt;password&lt;/password&gt;
   &lt;compress_mode&gt;bzip2&lt;/compress_mode&gt;
   &lt;all&gt;Y&lt;/all&gt;
&lt;/mysql&gt;
      </pre><p>
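         If you decide to back up specific databases instead, a
         configuration along these lines should work, based on the
         <code class="literal">all</code> and <code class="literal">database</code>
         fields described below.  The names <code class="literal">db1</code> and
         <code class="literal">db2</code> are placeholders:
      </p><pre class="programlisting">
&lt;mysql&gt;
   &lt;compress_mode&gt;bzip2&lt;/compress_mode&gt;
   &lt;all&gt;N&lt;/all&gt;
   &lt;database&gt;db1&lt;/database&gt;
   &lt;database&gt;db2&lt;/database&gt;
&lt;/mysql&gt;
      </pre><p>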
         The following elements are part of the MySQL configuration section:
      </p><div class="variablelist"><dl class="variablelist"><dt><span class="term"><code class="literal">user</code></span></dt><dd><p>Database user.</p><p>
                  The database user that the backup should be executed as.
                  Even if you list more than one database (below) all backups
                  must be done as the same user.  Typically, this would be
                  <code class="literal">root</code> (i.e. the database root user, not the
                  system root user).
               </p><p>
                  This value is optional.  You should probably configure the
                  username and password in MySQL configuration instead, as
                  discussed above.
               </p><p>
                  <span class="emphasis"><em>Restrictions:</em></span> If provided, must be
                  non-empty.
               </p></dd><dt><span class="term"><code class="literal">password</code></span></dt><dd><p>Password associated with the database user.</p><p>
                  This value is optional.  You should probably configure the
                  username and password in MySQL configuration instead, as
                  discussed above.
               </p><p>
                  <span class="emphasis"><em>Restrictions:</em></span> If provided, must be
                  non-empty.
               </p></dd><dt><span class="term"><code class="literal">compress_mode</code></span></dt><dd><p>Compress mode.</p><p>
                  MySQL database dumps are just
                  specially-formatted text files, and often compress quite
                  well using <span class="command"><strong>gzip</strong></span> or
                  <span class="command"><strong>bzip2</strong></span>.  The compress mode describes how
                  the backed-up data will be compressed, if at all.  
               </p><p>
                  <span class="emphasis"><em>Restrictions:</em></span> Must be one of
                  <code class="literal">none</code>, <code class="literal">gzip</code> or
                  <code class="literal">bzip2</code>.
               </p></dd><dt><span class="term"><code class="literal">all</code></span></dt><dd><p>Indicates whether to back up all databases.</p><p>
                  If this value is <code class="literal">Y</code>, then all MySQL
                  databases will be backed up.  If this value is
                  <code class="literal">N</code>, then one or more specific databases
                  must be specified (see below).
               </p><p>
                  If you choose this option, the entire database backup will go
                  into one big dump file.  
               </p><p>
                  <span class="emphasis"><em>Restrictions:</em></span> Must be a boolean
                  (<code class="literal">Y</code> or <code class="literal">N</code>).
               </p></dd><dt><span class="term"><code class="literal">database</code></span></dt><dd><p>Named database to be backed up.</p><p>
                  If you choose to specify individual databases rather than all
                  databases, then each database will be backed up into its own
                  dump file.
               </p><p>
                  This field can be repeated as many times as is necessary.  At
                  least one database must be configured if the all option
                  (above) is set to <code class="literal">N</code>.  You may not
                  configure any individual databases if the all option is set
                  to <code class="literal">Y</code>.
               </p><p>
                  <span class="emphasis"><em>Restrictions:</em></span> Must be non-empty.
               </p></dd></dl></div></div><div class="sect1"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="cedar-extensions-postgresql"></a>PostgreSQL Extension</h2></div></div></div><div class="sidebar"><div class="titlepage"><div><div><p class="title"><b>Community-contributed Extension</b></p></div></div></div><p>
            This is a community-contributed extension provided by Antoine
            Beaupre ("The Anarcat").  I have added regression tests around
            the configuration parsing code and I will maintain this section
            in the user manual based on his source code documentation.
         </p><p>
            Unfortunately, I don't have any PostgreSQL databases with which to
            test the functional code.  While I have code-reviewed the code and
            it looks both sensible and safe, I have to rely on the author to
            ensure that it works properly.
         </p></div><p>
         The PostgreSQL Extension is a Cedar Backup extension used to back up
         PostgreSQL <a href="#ftn.idp61608000" class="footnote" name="idp61608000"><sup class="footnote">[27]</sup></a> databases via the
         Cedar Backup command line.  It is intended to be run either
         immediately before or immediately after the standard collect action. 
      </p><p>
         The backup is done via the <span class="command"><strong>pg_dump</strong></span> or
         <span class="command"><strong>pg_dumpall</strong></span> commands included with the PostgreSQL
         product.  Output can be compressed using <span class="command"><strong>gzip</strong></span> or
         <span class="command"><strong>bzip2</strong></span>.  Administrators can configure the extension
         either to back up all databases or to back up only specific databases.  
      </p><p>
         The extension assumes that the current user has passwordless access to
         the database since there is no easy way to pass a password to the
         <span class="command"><strong>pg_dump</strong></span> client. This can be accomplished using
         appropriate configuration in the <span class="command"><strong>pg_hba.conf</strong></span> file.
      </p><p>
         This extension always produces a full backup.  There is currently
         no facility for making incremental backups.
      </p><div class="warning" style="margin-left: 0.5in; margin-right: 0.5in;"><h3 class="title">Warning</h3><p>
            Once you place PostgreSQL configuration into the Cedar Backup
            configuration file, you should be careful about who is allowed to
            see that information.  This is because PostgreSQL configuration
            will contain information about available PostgreSQL databases and
            usernames.  Typically, you might want to lock down permissions so
            that only the file owner can read the file contents (i.e. use mode
            <code class="literal">0600</code>).
         </p></div><p>
         To enable this extension, add the following section to the Cedar Backup
         configuration file:
      </p><pre class="programlisting">
&lt;extensions&gt;
   &lt;action&gt;
      &lt;name&gt;postgresql&lt;/name&gt;
      &lt;module&gt;CedarBackup2.extend.postgresql&lt;/module&gt;
      &lt;function&gt;executeAction&lt;/function&gt;
      &lt;index&gt;99&lt;/index&gt;
   &lt;/action&gt;
&lt;/extensions&gt;
      </pre><p>
         This extension relies on the options and collect configuration
         sections in the standard Cedar Backup configuration file, and then
         also requires its own <code class="literal">postgresql</code> configuration
         section.  This is an example PostgreSQL configuration section:
      </p><pre class="programlisting">
&lt;postgresql&gt;
   &lt;compress_mode&gt;bzip2&lt;/compress_mode&gt;
   &lt;user&gt;username&lt;/user&gt;
   &lt;all&gt;Y&lt;/all&gt;
&lt;/postgresql&gt;
      </pre><p>
         If you decide to back up specific databases, then you would list them
         individually, like this:
      </p><pre class="programlisting">
&lt;postgresql&gt;
   &lt;compress_mode&gt;bzip2&lt;/compress_mode&gt;
   &lt;user&gt;username&lt;/user&gt;
   &lt;all&gt;N&lt;/all&gt;
   &lt;database&gt;db1&lt;/database&gt;
   &lt;database&gt;db2&lt;/database&gt;
&lt;/postgresql&gt;
      </pre><p>
         The following elements are part of the PostgreSQL configuration section:
      </p><div class="variablelist"><dl class="variablelist"><dt><span class="term"><code class="literal">user</code></span></dt><dd><p>Database user.</p><p>
                  The database user that the backup should be executed as.
                  Even if you list more than one database (below) all backups
                  must be done as the same user.
               </p><p>
                  This value is optional.
               </p><p>
                  Consult your PostgreSQL documentation for information on how
                  to configure a default database user outside of Cedar Backup,
                  and for information on how to specify a database password
                  when you configure a user within Cedar Backup.  You will
                  probably want to modify <span class="command"><strong>pg_hba.conf</strong></span>.
               </p><p>
                  <span class="emphasis"><em>Restrictions:</em></span> If provided, must be
                  non-empty.
               </p></dd><dt><span class="term"><code class="literal">compress_mode</code></span></dt><dd><p>Compress mode.</p><p>
                  PostgreSQL database dumps are just
                  specially-formatted text files, and often compress quite
                  well using <span class="command"><strong>gzip</strong></span> or
                  <span class="command"><strong>bzip2</strong></span>.  The compress mode describes how
                  the backed-up data will be compressed, if at all.  
               </p><p>
                  <span class="emphasis"><em>Restrictions:</em></span> Must be one of
                  <code class="literal">none</code>, <code class="literal">gzip</code> or
                  <code class="literal">bzip2</code>.
               </p></dd><dt><span class="term"><code class="literal">all</code></span></dt><dd><p>Indicates whether to back up all databases.</p><p>
                  If this value is <code class="literal">Y</code>, then all PostgreSQL
                  databases will be backed up.  If this value is
                  <code class="literal">N</code>, then one or more specific databases
                  must be specified (see below).
               </p><p>
                  If you choose this option, the entire database backup will go
                  into one big dump file.  
               </p><p>
                  <span class="emphasis"><em>Restrictions:</em></span> Must be a boolean
                  (<code class="literal">Y</code> or <code class="literal">N</code>).
               </p></dd><dt><span class="term"><code class="literal">database</code></span></dt><dd><p>Named database to be backed up.</p><p>
                  If you choose to specify individual databases rather than all
                  databases, then each database will be backed up into its own
                  dump file.
               </p><p>
                  This field can be repeated as many times as is necessary.  At
                  least one database must be configured if the all option
                  (above) is set to <code class="literal">N</code>.  You may not
                  configure any individual databases if the all option is set
                  to <code class="literal">Y</code>.
               </p><p>
                  <span class="emphasis"><em>Restrictions:</em></span> Must be non-empty.
               </p></dd></dl></div></div><div class="sect1"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="cedar-extensions-mbox"></a>Mbox Extension</h2></div></div></div><p>
         The Mbox Extension is a Cedar Backup extension used to incrementally
         back up UNIX-style <span class="quote">&#8220;<span class="quote">mbox</span>&#8221;</span> mail folders via the Cedar
         Backup command line.  It is intended to be run either immediately
         before or immediately after the standard collect action.
      </p><p>
         Mbox mail folders are not well-suited to being backed up by the normal
         Cedar Backup incremental backup process.  This is because active
         folders are typically appended to on a daily basis.  This forces the
         incremental backup process to back them up every day in order to avoid
         losing data.  This can result in quite a bit of wasted space when
         backing up large mail folders.
      </p><p>
         The mbox extension uses the
         <span class="command"><strong>grepmail</strong></span> utility to back up only email messages
         which have been received since the last incremental backup.  This way,
         even if a folder receives new mail every day, only the recently-added
         messages are backed up.  This can potentially save a lot of space.
      </p><p>
         Each configured mbox file or directory can be backed up using the same
         collect modes allowed for filesystems in the standard Cedar Backup
         collect action (weekly, daily, incremental) and the output can be
         compressed using either <span class="command"><strong>gzip</strong></span> or
         <span class="command"><strong>bzip2</strong></span>.
      </p><p>
         To enable this extension, add the following section to the Cedar Backup
         configuration file:
      </p><pre class="programlisting">
&lt;extensions&gt;
   &lt;action&gt;
      &lt;name&gt;mbox&lt;/name&gt;
      &lt;module&gt;CedarBackup2.extend.mbox&lt;/module&gt;
      &lt;function&gt;executeAction&lt;/function&gt;
      &lt;index&gt;99&lt;/index&gt;
   &lt;/action&gt;
&lt;/extensions&gt;
      </pre><p>
         This extension relies on the options and collect configuration
         sections in the standard Cedar Backup configuration file, and then
         also requires its own <code class="literal">mbox</code> configuration
         section.  This is an example mbox configuration section:
      </p><pre class="programlisting">
&lt;mbox&gt;
   &lt;collect_mode&gt;incr&lt;/collect_mode&gt;
   &lt;compress_mode&gt;gzip&lt;/compress_mode&gt;
   &lt;file&gt;
      &lt;abs_path&gt;/home/user1/mail/greylist&lt;/abs_path&gt;
      &lt;collect_mode&gt;daily&lt;/collect_mode&gt;
   &lt;/file&gt;
   &lt;dir&gt;
      &lt;abs_path&gt;/home/user2/mail&lt;/abs_path&gt;
   &lt;/dir&gt;
   &lt;dir&gt;
      &lt;abs_path&gt;/home/user3/mail&lt;/abs_path&gt;
      &lt;exclude&gt;
         &lt;rel_path&gt;spam&lt;/rel_path&gt;
         &lt;pattern&gt;.*debian.*&lt;/pattern&gt;
      &lt;/exclude&gt;
   &lt;/dir&gt;
&lt;/mbox&gt;
      </pre><p>
         Configuration is much like the standard collect action.  Differences
         come from the fact that mbox directories are <span class="emphasis"><em>not</em></span>
         collected recursively.
      </p><p>
         Unlike collect configuration, exclusion information can only be
         configured at the mbox directory level (there are no global
         exclusions).  Another difference is that no absolute exclusion
         paths are allowed &#8212; only relative path exclusions and patterns.
      </p><p>
         The following elements are part of the mbox configuration section:
      </p><div class="variablelist"><dl class="variablelist"><dt><span class="term"><code class="literal">collect_mode</code></span></dt><dd><p>Default collect mode.</p><p>
                  The collect mode describes how frequently an mbox file
                  or directory is backed up.  The mbox extension
                  recognizes the same collect modes as the standard Cedar
                  Backup collect action (see <a class="xref" href="#cedar-basic" title="Chapter 2. Basic Concepts">Chapter 2, <i>Basic Concepts</i></a>).
               </p><p>
                  This value is the collect mode that will be used by default
                  during the backup process.  Individual files or directories
                  (below) may override this value.  If <span class="emphasis"><em>all</em></span>
                  individual files or directories provide their own value, then
                  this default value may be omitted from configuration.
               </p><p>
                  Note: if your backup device does not support multisession
                  discs, then you should probably use the
                  <code class="literal">daily</code> collect mode to avoid losing
                  data.
               </p><p>
                  <span class="emphasis"><em>Restrictions:</em></span> Must be one of
                  <code class="literal">daily</code>, <code class="literal">weekly</code> or
                  <code class="literal">incr</code>.
               </p></dd><dt><span class="term"><code class="literal">compress_mode</code></span></dt><dd><p>Default compress mode.</p><p>
                  Mbox file or directory backups are just text, and often compress
                  quite well using <span class="command"><strong>gzip</strong></span> or
                  <span class="command"><strong>bzip2</strong></span>.  The compress mode describes how
                  the backed-up data will be compressed, if at all.  
               </p><p>
                  This value is the compress mode that will be used by default
                  during the backup process.  Individual files or directories
                  (below) may override this value.  If <span class="emphasis"><em>all</em></span>
                  individual files or directories provide their own value, then
                  this default value may be omitted from configuration.
               </p><p>
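                  As a sketch of that last case, the section below omits both
                  defaults because its only entry supplies its own collect and
                  compress modes.  The path is a placeholder chosen for
                  illustration.
               </p><pre class="programlisting">
&lt;mbox&gt;
   &lt;file&gt;
      &lt;abs_path&gt;/home/user1/mail/inbox&lt;/abs_path&gt;
      &lt;collect_mode&gt;daily&lt;/collect_mode&gt;
      &lt;compress_mode&gt;gzip&lt;/compress_mode&gt;
   &lt;/file&gt;
&lt;/mbox&gt;
               </pre><p>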
                  <span class="emphasis"><em>Restrictions:</em></span> Must be one of
                  <code class="literal">none</code>, <code class="literal">gzip</code> or
                  <code class="literal">bzip2</code>.
               </p></dd><dt><span class="term"><code class="literal">file</code></span></dt><dd><p>An individual mbox file to be collected.</p><p>
                  This is a subsection which contains information about
                  an individual mbox file to be backed up.
               </p><p>
                  This section can be repeated as many times as is necessary.
                  At least one mbox file or directory must be configured.
               </p><p>
                  The file subsection contains the following fields:
               </p><div class="variablelist"><dl class="variablelist"><dt><span class="term"><code class="literal">collect_mode</code></span></dt><dd><p>Collect mode for this file.</p><p>
                           This field is optional.  If it doesn't exist, the backup
                           will use the default collect mode.
                        </p><p>
                           <span class="emphasis"><em>Restrictions:</em></span> Must be one of
                           <code class="literal">daily</code>, <code class="literal">weekly</code> or
                           <code class="literal">incr</code>.
                        </p></dd><dt><span class="term"><code class="literal">compress_mode</code></span></dt><dd><p>Compress mode for this file.</p><p>
                           This field is optional.  If it doesn't exist, the backup
                           will use the default compress mode.
                        </p><p>
                           <span class="emphasis"><em>Restrictions:</em></span> Must be one of
                           <code class="literal">none</code>, <code class="literal">gzip</code> or
                           <code class="literal">bzip2</code>.
                        </p></dd><dt><span class="term"><code class="literal">abs_path</code></span></dt><dd><p>
                           Absolute path of the mbox file to back up.
                        </p><p>
                           <span class="emphasis"><em>Restrictions:</em></span> Must be an absolute path.
                        </p></dd></dl></div></dd><dt><span class="term"><code class="literal">dir</code></span></dt><dd><p>An mbox directory to be collected.</p><p>
                  This is a subsection which contains information about an mbox
                  directory to be backed up.  An mbox directory is a directory
                  containing mbox files.  Every file in an mbox directory is
                  assumed to be an mbox file.  Mbox directories are
                  <span class="emphasis"><em>not</em></span> collected recursively.  Only the
                  files immediately within the configured directory will be
                  backed up; any subdirectories will be ignored.
               </p><p>
                  This section can be repeated as many times as is necessary.
                  At least one mbox file or directory must be configured.
               </p><p>
                  The dir subsection contains the following fields:
               </p><div class="variablelist"><dl class="variablelist"><dt><span class="term"><code class="literal">collect_mode</code></span></dt><dd><p>Collect mode for this directory.</p><p>
                           This field is optional.  If it doesn't exist, the backup
                           will use the default collect mode.
                        </p><p>
                           <span class="emphasis"><em>Restrictions:</em></span> Must be one of
                           <code class="literal">daily</code>, <code class="literal">weekly</code> or
                           <code class="literal">incr</code>.
                        </p></dd><dt><span class="term"><code class="literal">compress_mode</code></span></dt><dd><p>Compress mode for this directory.</p><p>
                           This field is optional.  If it doesn't exist, the backup
                           will use the default compress mode.
                        </p><p>
                           <span class="emphasis"><em>Restrictions:</em></span> Must be one of
                           <code class="literal">none</code>, <code class="literal">gzip</code> or
                           <code class="literal">bzip2</code>.
                        </p></dd><dt><span class="term"><code class="literal">abs_path</code></span></dt><dd><p>
                           Absolute path of the mbox directory to back up.
                        </p><p>
                           <span class="emphasis"><em>Restrictions:</em></span> Must be an absolute path.
                        </p></dd><dt><span class="term"><code class="literal">exclude</code></span></dt><dd><p>List of paths or patterns to exclude from the backup.</p><p>
                           This is a subsection which contains a set of paths
                           and patterns to be excluded within this mbox
                           directory.
                        </p><p>
                           This section is entirely optional, and if it exists can
                           also be empty.  
                        </p><p>
                           The exclude subsection can contain one or more of each of
                           the following fields:
                        </p><div class="variablelist"><dl class="variablelist"><dt><span class="term"><code class="literal">rel_path</code></span></dt><dd><p>
                                    A relative path to be excluded from the
                                    backup.
                                 </p><p>
                                    The path is assumed to be relative to the
                                    mbox directory itself.  For instance, if
                                    the configured mbox directory is
                                    <code class="filename">/home/user2/mail</code> a
                                    configured relative path of
                                    <code class="filename">SPAM</code> would exclude the
                                    path <code class="filename">/home/user2/mail/SPAM</code>.
                                 </p><p>
                                    This field can be repeated as many times as
                                    is necessary.
                                 </p><p>
                                    <span class="emphasis"><em>Restrictions:</em></span> Must be non-empty.
                                 </p></dd><dt><span class="term"><code class="literal">pattern</code></span></dt><dd><p>
                                    A pattern to be excluded from the backup.
                                 </p><p>
                                    The pattern must be a Python regular
                                    expression.  <a href="#ftn.cedar-config-foot-regex" class="footnoteref"><sup class="footnoteref">[21]</sup></a> 
                                    It is assumed to be bounded at front and
                                    back by the beginning and end of the
                                    string (i.e. it is treated as if it
                                    begins with <code class="literal">^</code> and
                                    ends with <code class="literal">$</code>).
                                 </p><p>
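                                    As a rough illustration of the bounding rule
                                    above, the following sketch wraps a pattern in
                                    <code class="literal">^</code> and
                                    <code class="literal">$</code> before matching.
                                    The helper name
                                    <code class="literal">matches_exclude_pattern</code>
                                    is hypothetical and is not part of Cedar Backup;
                                    the sketch only shows why a bare substring does
                                    not match a longer string.
                                 </p><pre class="programlisting">
```python
import re


def matches_exclude_pattern(pattern, candidate):
    """Hypothetical helper: True if the whole candidate matches the pattern."""
    # Wrapping in ^(?:...)$ mimics the implicit bounding described above;
    # the non-capturing group keeps alternations such as a|b bounded too.
    anchored = re.compile("^(?:" + pattern + ")$")
    return anchored.match(candidate) is not None


# A bare substring does not match a longer string...
print(matches_exclude_pattern("SPAM", "SPAM-2007"))
# ...but a pattern with an explicit wildcard does.
print(matches_exclude_pattern("SPAM.*", "SPAM-2007"))
```
                                 </pre><p>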
                                    This field can be repeated as many times as
                                    is necessary.
                                 </p><p>
                                    <span class="emphasis"><em>Restrictions:</em></span> Must be non-empty.
                                 </p></dd></dl></div></dd></dl></div></dd></dl></div></div><div class="sect1"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="cedar-extensions-encrypt"></a>Encrypt Extension</h2></div></div></div><p>
         The Encrypt Extension is a Cedar Backup extension used to encrypt
         backups.  It does this by encrypting the contents of a master's
         staging directory each day after the stage action is run.  This way,
         backed-up data is encrypted both when sitting on the master and when
         written to disc.  This extension must be run before the standard store
         action, otherwise unencrypted data will be written to disc.
      </p><p>
         There are several different ways encryption could have been built in
         to or layered on to Cedar Backup.  I asked the mailing list for
         opinions on the subject in January 2007 and did not get a lot of
         feedback, so I chose the option that was simplest to understand and
         simplest to implement.  If other encryption use cases make themselves
         known in the future, this extension can be enhanced or replaced.
      </p><p>
         Currently, this extension supports only GPG.  However, it would be
         straightforward to support other public-key encryption mechanisms,
         such as OpenSSL.
      </p><div class="warning" style="margin-left: 0.5in; margin-right: 0.5in;"><h3 class="title">Warning</h3><p>
            If you decide to encrypt your backups, be <span class="emphasis"><em>absolutely
            sure</em></span> that you have your GPG secret key saved off
            someplace safe &#8212; someplace other than on your backup disc.
            If you lose your secret key, your backup will be useless.
         </p><p>
            I suggest that before you rely on this extension, you should
            execute a dry run and make sure you can successfully decrypt the
            backup that is written to disc.
         </p></div><p>
         Before configuring the Encrypt extension, you must configure GPG.
         Either create a new keypair or use an existing one.  Determine which
         user will execute your backup (typically root) and have that user
         import <span class="emphasis"><em>and lsign</em></span> the public half of the keypair.
         Then, save off the secret half of the keypair someplace safe, apart
         from your backup (e.g. on a floppy disk or USB drive).  Make sure you
         know the recipient name associated with the public key because you'll
         need it to configure Cedar Backup.  (If you can run 
         <span class="command"><strong>gpg -e -r "Recipient Name" file.txt</strong></span> 
         and it executes cleanly with no user interaction required, you should
         be OK.)
      </p><p>
         An encrypted backup has the same file structure as a normal backup, so
         all of the instructions in <a class="xref" href="#cedar-recovering" title="Appendix C. Data Recovery">Appendix C, <i>Data Recovery</i></a> apply.
         The only difference is that encrypted files will have an additional
         <code class="filename">.gpg</code> extension (so
         for instance <code class="filename">file.tar.gz</code> becomes
         <code class="filename">file.tar.gz.gpg</code>).  To recover encrypted data, simply
         log on as a user who has access to the secret key and decrypt the
         <code class="filename">.gpg</code> file that you are interested in.  Then, recover
         the data as usual.
      </p><p>
         Note: I am being intentionally vague about how to configure and use GPG,
         because I do not want to encourage neophytes to blindly use this
         extension.  If you do not already understand GPG well enough to follow
         the two paragraphs above, <span class="emphasis"><em>do not use this
         extension</em></span>.  Instead, before encrypting your backups, check
         out the excellent GNU Privacy Handbook at 
         <a class="ulink" href="http://www.gnupg.org/gph/en/manual.html" target="_top">http://www.gnupg.org/gph/en/manual.html</a> 
         and gain an understanding of how encryption can help you or hurt you.
      </p><p>
         To enable this extension, add the following section to the Cedar Backup
         configuration file:
      </p><pre class="programlisting">
&lt;extensions&gt;
   &lt;action&gt;
      &lt;name&gt;encrypt&lt;/name&gt;
      &lt;module&gt;CedarBackup2.extend.encrypt&lt;/module&gt;
      &lt;function&gt;executeAction&lt;/function&gt;
      &lt;index&gt;301&lt;/index&gt;
   &lt;/action&gt;
&lt;/extensions&gt;
      </pre><p>
         This extension relies on the options and staging configuration
         sections in the standard Cedar Backup configuration file, and then
         also requires its own <code class="literal">encrypt</code> configuration
         section.  This is an example Encrypt configuration section:
      </p><pre class="programlisting">
&lt;encrypt&gt;
   &lt;encrypt_mode&gt;gpg&lt;/encrypt_mode&gt;
   &lt;encrypt_target&gt;Backup User&lt;/encrypt_target&gt;
&lt;/encrypt&gt;
      </pre><p>
         The following elements are part of the Encrypt configuration section:
      </p><div class="variablelist"><dl class="variablelist"><dt><span class="term"><code class="literal">encrypt_mode</code></span></dt><dd><p>Encryption mode.</p><p>
                  This value specifies which encryption mechanism will be used
                  by the extension.
               </p><p>
                  Currently, only the GPG public-key encryption mechanism is
                  supported.
               </p><p>
                  <span class="emphasis"><em>Restrictions:</em></span> Must be <code class="literal">gpg</code>.
               </p></dd><dt><span class="term"><code class="literal">encrypt_target</code></span></dt><dd><p>Encryption target.</p><p>
                  The value in this field is dependent on the encryption mode.
                  For the <code class="literal">gpg</code> mode, this is the name of the
                  recipient whose public key will be used to encrypt the backup
                  data, i.e. the value accepted by <span class="command"><strong>gpg -r</strong></span>.
               </p></dd></dl></div></div><div class="sect1"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="cedar-extensions-split"></a>Split Extension</h2></div></div></div><p>
         The Split Extension is a Cedar Backup extension used to split up
         large files within staging directories.  It is probably only useful
         in combination with the <span class="command"><strong>cback-span</strong></span> command, which
         requires individual files within staging directories to each be
         smaller than a single disc.
      </p><p>
         You would normally run this action immediately after the standard
         stage action, but you could also choose to run it by hand immediately
         before running <span class="command"><strong>cback-span</strong></span>.
      </p><p>
         The split extension uses the standard UNIX <span class="command"><strong>split</strong></span>
         tool to split the large files up.  This tool simply splits the files
         on byte-size boundaries.  It has no knowledge of file formats.  
      </p><p>
         <span class="emphasis"><em>Note: this means that in order to recover the data in your
         original large file, you must have every file that the original file
         was split into.</em></span>  Think carefully about whether this is what
         you want.  It doesn't sound like a huge limitation.  However,
         <span class="command"><strong>cback-span</strong></span> might put an individual file on
         <span class="emphasis"><em>any</em></span> disc in a set &#8212; the files split from
         one larger file will not necessarily be together.  That means you will
         probably need every disc in your backup set in order to recover any
         data from the backup set.
      </p><p>
         To enable this extension, add the following section to the Cedar Backup
         configuration file:
      </p><pre class="programlisting">
&lt;extensions&gt; 
   &lt;action&gt;
      &lt;name&gt;split&lt;/name&gt;
      &lt;module&gt;CedarBackup2.extend.split&lt;/module&gt;
      &lt;function&gt;executeAction&lt;/function&gt;
      &lt;index&gt;299&lt;/index&gt;
   &lt;/action&gt;
&lt;/extensions&gt;
      </pre><p>
         This extension relies on the options and staging configuration
         sections in the standard Cedar Backup configuration file, and then
         also requires its own <code class="literal">split</code> configuration
         section.  This is an example Split configuration section:
      </p><pre class="programlisting">
&lt;split&gt;
   &lt;size_limit&gt;250 MB&lt;/size_limit&gt;
   &lt;split_size&gt;100 MB&lt;/split_size&gt;
&lt;/split&gt;
      </pre><p>
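         To make the example above concrete, here is a minimal sketch of how
         the two values interact, using the size format and the
         <span class="quote">&#8220;<span class="quote">strictly larger</span>&#8221;</span> rule described for the fields
         below.  The function names are hypothetical, and treating 1 KB as
         1024 bytes is an assumption rather than something this manual
         states.  The extension itself delegates the real work to
         <span class="command"><strong>split</strong></span>; this only models the arithmetic.
      </p><pre class="programlisting">
```python
# Assumption: units are binary multiples (1 KB = 1024 bytes).
UNITS = {"KB": 1024.0, "MB": 1024.0 ** 2, "GB": 1024.0 ** 3}


def parse_size(text):
    """Convert '10240', '250 MB' or '1.1 GB' into a number of bytes."""
    parts = text.split()
    if len(parts) == 1:
        return float(parts[0])
    if len(parts) == 2 and parts[1] in UNITS:
        return float(parts[0]) * UNITS[parts[1]]
    raise ValueError("Not a valid size: %r" % text)


def chunk_sizes(file_size, size_limit, split_size):
    """Return the chunk sizes for one file, or [file_size] if it is not split."""
    if not split_size > 0:
        raise ValueError("split_size must be positive")
    # Only files strictly larger than the limit are split.
    if size_limit >= file_size:
        return [file_size]
    chunks = []
    remaining = file_size
    while remaining > 0:
        chunk = min(split_size, remaining)
        chunks.append(chunk)
        remaining -= chunk
    return chunks


# A 260 MB file with the example settings: two full chunks and a smaller last one.
print(chunk_sizes(parse_size("260 MB"), parse_size("250 MB"), parse_size("100 MB")))
```
      </pre><p>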
         The following elements are part of the Split configuration section:
      </p><div class="variablelist"><dl class="variablelist"><dt><span class="term"><code class="literal">size_limit</code></span></dt><dd><p>Size limit.</p><p>
                  Files with a size strictly larger than this limit will
                  be split by the extension.
               </p><p>
                  You can enter this value in two different forms.  It can
                  either be a simple number, in which case the value is assumed
                  to be in bytes; or it can be a number followed by a unit
                  (KB, MB, GB).  
               </p><p>
                  Valid examples are <span class="quote">&#8220;<span class="quote">10240</span>&#8221;</span>, <span class="quote">&#8220;<span class="quote">250
                  MB</span>&#8221;</span> or <span class="quote">&#8220;<span class="quote">1.1 GB</span>&#8221;</span>.
               </p><p>
                  <span class="emphasis"><em>Restrictions:</em></span> Must be a size as described above.
               </p></dd><dt><span class="term"><code class="literal">split_size</code></span></dt><dd><p>Split size.</p><p>
                  This is the size of the chunks that a large file will be split into.
                  The final chunk may be smaller if the split size doesn't divide
                  evenly into the file size.
               </p><p>
                  You can enter this value in two different forms.  It can
                  either be a simple number, in which case the value is assumed
                  to be in bytes; or it can be a number followed by a unit
                  (KB, MB, GB).  
               </p><p>
                  Valid examples are <span class="quote">&#8220;<span class="quote">10240</span>&#8221;</span>, <span class="quote">&#8220;<span class="quote">250
                  MB</span>&#8221;</span> or <span class="quote">&#8220;<span class="quote">1.1 GB</span>&#8221;</span>.
               </p><p>
                  <span class="emphasis"><em>Restrictions:</em></span> Must be a size as described above.
               </p></dd></dl></div></div><div class="sect1"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="cedar-extensions-capacity"></a>Capacity Extension</h2></div></div></div><p>
         The capacity extension checks the current capacity of the media in the
         writer and prints a warning if the media exceeds an indicated
         capacity.  The capacity is indicated either by a maximum percentage
         utilized or by a minimum number of bytes that must remain unused.
      </p><p>
         This action can be run at any time, but is probably best run as the
         last action on any given day, so you get as much notice as possible
         that your media is full and needs to be replaced.
      </p><p>
         To enable this extension, add the following section to the Cedar Backup
         configuration file:
      </p><pre class="programlisting">
&lt;extensions&gt;
   &lt;action&gt;
      &lt;name&gt;capacity&lt;/name&gt;
      &lt;module&gt;CedarBackup2.extend.capacity&lt;/module&gt;
      &lt;function&gt;executeAction&lt;/function&gt;
      &lt;index&gt;299&lt;/index&gt;
   &lt;/action&gt;
&lt;/extensions&gt;
      </pre><p>
         This extension relies on the options and store configuration sections
         in the standard Cedar Backup configuration file, and then also
         requires its own <code class="literal">capacity</code> configuration section.
         This is an example Capacity configuration section that configures the
         extension to warn if the media is more than 95.5% full:
      </p><pre class="programlisting">
&lt;capacity&gt;
   &lt;max_percentage&gt;95.5&lt;/max_percentage&gt;
&lt;/capacity&gt;
      </pre><p>
         This example configures the extension to warn if the media has fewer
         than 16 MB free:
      </p><pre class="programlisting">
&lt;capacity&gt;
   &lt;min_bytes&gt;16 MB&lt;/min_bytes&gt;
&lt;/capacity&gt;
      </pre><p>
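         The two checks in the examples above can be sketched as follows.
         This is an illustration only: the function and parameter names are
         hypothetical, the byte counts are assumed to be already known, and
         reading <span class="quote">&#8220;<span class="quote">either/or</span>&#8221;</span> as
         <span class="quote">&#8220;<span class="quote">exactly one</span>&#8221;</span> is an assumption.  It does not
         show how the extension queries the media.
      </p><pre class="programlisting">
```python
def exceeds_capacity(used_bytes, total_bytes, max_percentage=None, min_bytes=None):
    """Hypothetical check: True if a warning would be due under the given limit."""
    # Assumption: exactly one of the two limits is configured.
    if (max_percentage is None) == (min_bytes is None):
        raise ValueError("Provide exactly one of max_percentage or min_bytes")
    if max_percentage is not None:
        utilized = 100.0 * used_bytes / total_bytes
        return utilized > max_percentage
    free_bytes = total_bytes - used_bytes
    return min_bytes > free_bytes


MB = 1024 * 1024  # assumed unit size, for the example values only

# 630 MB used on a 650 MB disc is more than 95.5% full...
print(exceeds_capacity(630 * MB, 650 * MB, max_percentage=95.5))
# ...but still leaves more than 16 MB free.
print(exceeds_capacity(630 * MB, 650 * MB, min_bytes=16 * MB))
```
      </pre><p>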
         The following elements are part of the Capacity configuration section:
      </p><div class="variablelist"><dl class="variablelist"><dt><span class="term"><code class="literal">max_percentage</code></span></dt><dd><p>Maximum percentage of the media that may be utilized.</p><p>
                  You must provide either this value <span class="emphasis"><em>or</em></span> the
                  <code class="literal">min_bytes</code> value.
               </p><p>
                  <span class="emphasis"><em>Restrictions:</em></span> Must be a floating point
                  number between 0.0 and 100.0.
               </p></dd><dt><span class="term"><code class="literal">min_bytes</code></span></dt><dd><p>Minimum number of free bytes that must be available.</p><p>
                  You can enter this value in two different forms.  It can
                  either be a simple number, in which case the value is assumed
                  to be in bytes; or it can be a number followed by a unit
                  (KB, MB, GB).  
               </p><p>
                  Valid examples are <span class="quote">&#8220;<span class="quote">10240</span>&#8221;</span>, <span class="quote">&#8220;<span class="quote">250
                  MB</span>&#8221;</span> or <span class="quote">&#8220;<span class="quote">1.1 GB</span>&#8221;</span>.
               </p><p>
                  You must provide either this value <span class="emphasis"><em>or</em></span> the
                  <code class="literal">max_percentage</code> value.
               </p><p>
                  <span class="emphasis"><em>Restrictions:</em></span> Must be a byte quantity as
                  described above.
               </p></dd></dl></div></div><div class="footnotes"><br><hr style="width:100; text-align:left;margin-left: 0"><div id="ftn.idp61469168" class="footnote"><p><a href="#idp61469168" class="para"><sup class="para">[25] </sup></a>See <a class="ulink" href="http://subversion.org" target="_top">http://subversion.org</a></p></div><div id="ftn.idp61553712" class="footnote"><p><a href="#idp61553712" class="para"><sup class="para">[26] </sup></a>See <a class="ulink" href="http://www.mysql.com" target="_top">http://www.mysql.com</a></p></div><div id="ftn.idp61608000" class="footnote"><p><a href="#idp61608000" class="para"><sup class="para">[27] </sup></a>See <a class="ulink" href="http://www.postgresql.org/" target="_top">http://www.postgresql.org/</a></p></div></div></div><div class="appendix"><div class="titlepage"><div><div><h1 class="title"><a name="cedar-extenspec"></a>Appendix A. Extension Architecture Interface</h1></div></div></div><div class="simplesect"><div class="titlepage"></div><p>
         The Cedar Backup <em class="firstterm">Extension Architecture
         Interface</em> is the application programming interface used by
         third-party developers to write Cedar Backup extensions.  This
         appendix briefly specifies the interface in enough detail for
         someone to successfully implement an extension.
      </p><p>
         You will recall that Cedar Backup extensions are third-party pieces of
         code which extend Cedar Backup's functionality.  Extensions can be
         invoked from the Cedar Backup command line and are allowed to place
         their configuration in Cedar Backup's configuration file.
      </p><p>
         There is a one-to-one mapping between a command-line extended action
         and an extension function.  The mapping is configured in the Cedar
         Backup configuration file using a section something like this:
      </p><pre class="programlisting">
&lt;extensions&gt;
   &lt;action&gt;
      &lt;name&gt;database&lt;/name&gt;
      &lt;module&gt;foo&lt;/module&gt;
      &lt;function&gt;bar&lt;/function&gt;
      &lt;index&gt;101&lt;/index&gt;
   &lt;/action&gt; 
&lt;/extensions&gt;
      </pre><p>
         In this case, the action <span class="quote">&#8220;<span class="quote">database</span>&#8221;</span> has been mapped to
         the extension function <code class="literal">foo.bar()</code>.  
      </p><p>
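         As a sketch of what such a mapping amounts to, a module name and a
         function name can be resolved to a callable with the standard
         <code class="literal">importlib</code> module.  The helper
         <code class="literal">resolve_extension</code> is hypothetical and is
         not necessarily how Cedar Backup performs the lookup; because
         <code class="literal">foo.bar</code> does not exist, the example
         resolves a standard-library function instead.
      </p><pre class="programlisting">
```python
import importlib


def resolve_extension(module_name, function_name):
    """Hypothetical helper: import the module and return the named callable."""
    module = importlib.import_module(module_name)
    function = getattr(module, function_name)
    if not callable(function):
        raise TypeError("%s.%s is not callable" % (module_name, function_name))
    return function


# Stand-in lookup, since the foo.bar mapping above is only an example.
basename = resolve_extension("os.path", "basename")
print(basename("/path/to/repo1"))
```
      </pre><p>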
         Extension functions can take any actions they would like to once they
         have been invoked, but must abide by these rules:
      </p><div class="orderedlist"><ol class="orderedlist" type="1"><li class="listitem"><p>
               Extensions may not write to <code class="filename">stdout</code> or
               <code class="filename">stderr</code> using functions such as
               <code class="literal">print</code> or <code class="literal">sys.stdout.write</code>.
            </p></li><li class="listitem"><p>
               All logging must take place using the Python logging
               facility.  Flow-of-control logging should happen on the
               <code class="literal">CedarBackup2.log</code> topic.  Authors can assume
               that ERROR will always go to the terminal, that INFO and WARN
               will always be logged, and that DEBUG will be ignored unless
               debugging is enabled.
            </p></li><li class="listitem"><p>
               Any time an extension invokes a command-line utility, it must be
               done through the <code class="literal">CedarBackup2.util.executeCommand</code>
               function.  This will help keep Cedar Backup safer from
               format-string attacks, and will make it easier to consistently
               log command-line process output.
            </p></li><li class="listitem"><p>
               Extensions may not return any value.  
            </p></li><li class="listitem"><p>
               Extensions must throw a Python exception containing a
               descriptive message if processing fails.  Extension authors can
               use their judgement as to what constitutes failure; however, any
               problems during execution should result in either a thrown
               exception or a logged message.
            </p></li><li class="listitem"><p>
               Extensions may rely only on Cedar Backup functionality that is
               advertised as being part of the public interface.  This means
               that extensions cannot directly make use of methods, functions
               or values starting with the <code class="literal">_</code> character.
               Furthermore, extensions should only rely on parts of the public
               interface that are documented in the online Epydoc
               documentation.
            </p></li><li class="listitem"><p>
               Extension authors are encouraged to extend the Cedar Backup
               public interface through normal methods of inheritance.
               However, no extension is allowed to directly change Cedar Backup
               code in a way that would affect how Cedar Backup itself executes
               when the extension has not been invoked.  For instance,
               extensions would not be allowed to add new command-line options
               or new writer types.
            </p></li><li class="listitem"><p>
               Extensions must be written to assume an empty locale set (no
               <code class="literal">$LC_*</code> settings) and
               <code class="literal">$LANG=C</code>.  For the typical open-source
               software project, this would imply writing output-parsing code
               against the English localization (if any).  The
               <code class="literal">executeCommand</code> function does sanitize the
               environment to enforce this configuration.
            </p></li></ol></div><p>
         Extension functions take three arguments: the path to configuration on
         disk, a <code class="literal">CedarBackup2.cli.Options</code> object
         representing the command-line options in effect, and a
         <code class="literal">CedarBackup2.config.Config</code> object representing
         parsed standard configuration.   
      </p><pre class="programlisting">
def function(configPath, options, config):
   """Sample extension function."""
   pass
      </pre><p>
         This interface is structured so that simple extensions can use
         standard configuration without having to parse it for themselves, but
         more complicated extensions can get at the configuration file on disk
         and parse it again as needed.
      </p><p>
         The interface to the <code class="literal">CedarBackup2.cli.Options</code> and
         <code class="literal">CedarBackup2.config.Config</code> classes has been
         thoroughly documented using Epydoc, and the documentation is available
         on the Cedar Backup website.  The interface is guaranteed to change
         only in backwards-compatible ways unless the Cedar Backup major
         version number is bumped (i.e. from 2 to 3).
      </p><p>
         If an extension needs to add its own configuration information to the
         Cedar Backup configuration file, this extra configuration must be
         added in a new configuration section using a name that does not
         conflict with standard configuration or other known extensions.
      </p><p>
         For instance, our hypothetical database extension might require
         configuration indicating the path to some repositories to back up.
         This information might go into a section something like this:
      </p><pre class="programlisting">
&lt;database&gt;
   &lt;repository&gt;/path/to/repo1&lt;/repository&gt;
   &lt;repository&gt;/path/to/repo2&lt;/repository&gt;
&lt;/database&gt;
      </pre><p>
         In order to read this new configuration, the extension code can either
         inherit from the <code class="literal">Config</code> object and create a
         subclass that knows how to parse the new <code class="literal">database</code>
         config section, or can write its own code to parse whatever it needs
         out of the file.  Either way, the resulting code is completely
         independent of the standard Cedar Backup functionality.
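      </p><p>
         The second approach might look something like the sketch below,
         which reads the hypothetical <code class="literal">database</code>
         section straight from the file on disk.  It assumes the section is
         a direct child of the document's root element, and the logger name
         is an assumed child of the <code class="literal">CedarBackup2.log</code>
         topic; neither detail is specified in this appendix.  The function
         logs instead of printing, raises an exception on failure, and
         returns nothing, in line with the rules above.
      </p><pre class="programlisting">
```python
import logging
import xml.etree.ElementTree as ET

# Assumed logger name under the documented CedarBackup2.log topic.
logger = logging.getLogger("CedarBackup2.log.extend.database")


def read_repositories(configPath):
    """Return the repository paths listed in the database section."""
    root = ET.parse(configPath).getroot()
    # Assumption: the database section sits directly under the root element.
    section = root.find("database")
    if section is None:
        raise ValueError("No database section found in %s" % configPath)
    return [node.text.strip() for node in section.findall("repository")
            if node.text]


def executeAction(configPath, options, config):
    """Hypothetical extension function; options and config are unused here."""
    repositories = read_repositories(configPath)
    if not repositories:
        raise ValueError("The database section lists no repositories")
    for path in repositories:
        # A real extension would do its backup work here.
        logger.info("Found repository [%s]", path)
```
      </pre><p>
         Nothing in this sketch imports Cedar Backup code, which is what
         keeps it independent of the standard functionality; an extension
         that subclasses <code class="literal">Config</code> instead would
         trade that independence for reuse of the standard parsing.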
      </p></div></div><div class="appendix"><div class="titlepage"><div><div><h1 class="title"><a name="cedar-depends"></a>Appendix B. Dependencies</h1></div></div></div><div class="simplesect"><div class="titlepage"></div><div class="variablelist"><dl class="variablelist"><dt><span class="term">Python 2.7</span></dt><dd><p>
                  Cedar Backup is written in Python 2 and requires version 2.7 or
                  greater of the language.  Python 2.7 was originally released on
                  4 Jul 2010, and is the last supported release of Python 2. As
                  of this writing, all current Linux and BSD distributions
                  include it.
               </p><div class="informaltable"><table border="1"><colgroup><col><col></colgroup><thead><tr><th>Source</th><th>URL</th></tr></thead><tbody><tr><td>upstream</td><td><a class="ulink" href="http://www.python.org" target="_top">http://www.python.org</a></td></tr><tr><td>Debian</td><td><a class="ulink" href="http://packages.debian.org/stable/python/python2.7" target="_top">http://packages.debian.org/stable/python/python2.7</a></td></tr><tr><td>RPM</td><td><a class="ulink" href="http://rpmfind.net/linux/rpm2html/search.php?query=python" target="_top">http://rpmfind.net/linux/rpm2html/search.php?query=python</a></td></tr></tbody></table></div><p>
                  If you can't find a package for your system, install from the package
                  source, using the <span class="quote">&#8220;<span class="quote">upstream</span>&#8221;</span> link.
               </p></dd><dt><span class="term">RSH Server and Client</span></dt><dd><p>
                  Although Cedar Backup will technically work with any RSH-compatible
                  server and client pair (such as the classic <span class="quote">&#8220;<span class="quote">rsh</span>&#8221;</span> client),
                  most users should only use an SSH (secure shell) server and client.
               </p><p>
                  The de facto standard today is OpenSSH.  Some systems package the server
                  and the client together, and others package the server and the client
                  separately.  Note that <em class="firstterm">master</em> nodes need an
                  SSH client, and <em class="firstterm">client</em> nodes need to run an
                  SSH server.
               </p><div class="informaltable"><table border="1"><colgroup><col><col></colgroup><thead><tr><th>Source</th><th>URL</th></tr></thead><tbody><tr><td>upstream</td><td><a class="ulink" href="http://www.openssh.com/" target="_top">http://www.openssh.com/</a></td></tr><tr><td>Debian</td><td><a class="ulink" href="http://packages.debian.org/stable/net/ssh" target="_top">http://packages.debian.org/stable/net/ssh</a></td></tr><tr><td>RPM</td><td><a class="ulink" href="http://rpmfind.net/linux/rpm2html/search.php?query=openssh" target="_top">http://rpmfind.net/linux/rpm2html/search.php?query=openssh</a></td></tr></tbody></table></div><p>
                  If you can't find SSH client or server packages for your
                  system, install from the package source, using the
                  <span class="quote">&#8220;<span class="quote">upstream</span>&#8221;</span> link.
               </p></dd><dt><span class="term"><span class="command"><strong>mkisofs</strong></span></span></dt><dd><p>
                  The <span class="command"><strong>mkisofs</strong></span> command is used to create ISO filesystem
                  images that can later be written to backup media.  
               </p><p>
                  On Debian platforms, <span class="command"><strong>mkisofs</strong></span> is not
                  distributed and <span class="command"><strong>genisoimage</strong></span> is used
                  instead.  The Debian package takes care of this for you.
               </p><div class="informaltable"><table border="1"><colgroup><col><col></colgroup><thead><tr><th>Source</th><th>URL</th></tr></thead><tbody><tr><td>upstream</td><td><a class="ulink" href="https://en.wikipedia.org/wiki/Cdrtools" target="_top">https://en.wikipedia.org/wiki/Cdrtools</a></td></tr><tr><td>RPM</td><td><a class="ulink" href="http://rpmfind.net/linux/rpm2html/search.php?query=mkisofs" target="_top">http://rpmfind.net/linux/rpm2html/search.php?query=mkisofs</a></td></tr></tbody></table></div><p>
                  If you can't find a package for your system, install from the package
                  source, using the <span class="quote">&#8220;<span class="quote">upstream</span>&#8221;</span> link.
               </p></dd><dt><span class="term"><span class="command"><strong>cdrecord</strong></span></span></dt><dd><p>
                  The <span class="command"><strong>cdrecord</strong></span> command is used to write
                  ISO images to CD media in a backup device.  
               </p><p>
                  On Debian platforms, <span class="command"><strong>cdrecord</strong></span> is not
                  distributed and <span class="command"><strong>wodim</strong></span> is used
                  instead.  The Debian package takes care of this for you.
               </p><div class="informaltable"><table border="1"><colgroup><col><col></colgroup><thead><tr><th>Source</th><th>URL</th></tr></thead><tbody><tr><td>upstream</td><td><a class="ulink" href="https://en.wikipedia.org/wiki/Cdrtools" target="_top">https://en.wikipedia.org/wiki/Cdrtools</a></td></tr><tr><td>RPM</td><td><a class="ulink" href="http://rpmfind.net/linux/rpm2html/search.php?query=cdrecord" target="_top">http://rpmfind.net/linux/rpm2html/search.php?query=cdrecord</a></td></tr></tbody></table></div><p>
                  If you can't find a package for your system, install from the package
                  source, using the <span class="quote">&#8220;<span class="quote">upstream</span>&#8221;</span> link.
               </p></dd><dt><span class="term"><span class="command"><strong>dvd+rw-tools</strong></span></span></dt><dd><p>
                  The dvd+rw-tools package provides the
                  <span class="command"><strong>growisofs</strong></span> utility, which is used to write
                  ISO images to DVD media in a backup device.
               </p><div class="informaltable"><table border="1"><colgroup><col><col></colgroup><thead><tr><th>Source</th><th>URL</th></tr></thead><tbody><tr><td>upstream</td><td><a class="ulink" href="http://fy.chalmers.se/~appro/linux/DVD+RW/" target="_top">http://fy.chalmers.se/~appro/linux/DVD+RW/</a></td></tr><tr><td>Debian</td><td><a class="ulink" href="http://packages.debian.org/stable/utils/dvd+rw-tools" target="_top">http://packages.debian.org/stable/utils/dvd+rw-tools</a></td></tr><tr><td>RPM</td><td><a class="ulink" href="http://rpmfind.net/linux/rpm2html/search.php?query=dvd+rw-tools" target="_top">http://rpmfind.net/linux/rpm2html/search.php?query=dvd+rw-tools</a></td></tr></tbody></table></div><p>
                  If you can't find a package for your system, install from the package
                  source, using the <span class="quote">&#8220;<span class="quote">upstream</span>&#8221;</span> link.
               </p></dd><dt><span class="term"><span class="command"><strong>eject</strong></span> and <span class="command"><strong>volname</strong></span></span></dt><dd><p>
                  The <span class="command"><strong>eject</strong></span> command is used to open and
                  close the tray on a backup device (if the backup device has a
                  tray).  Sometimes, the tray must be opened and closed in
                  order to "reset" the device so it notices recent changes to a
                  disc.  
               </p><p>
                  The <span class="command"><strong>volname</strong></span> command is used to determine
                  the volume name of media in a backup device.  
               </p><div class="informaltable"><table border="1"><colgroup><col><col></colgroup><thead><tr><th>Source</th><th>URL</th></tr></thead><tbody><tr><td>upstream</td><td><a class="ulink" href="http://sourceforge.net/projects/eject" target="_top">http://sourceforge.net/projects/eject</a></td></tr><tr><td>Debian</td><td><a class="ulink" href="http://packages.debian.org/stable/utils/eject" target="_top">http://packages.debian.org/stable/utils/eject</a></td></tr><tr><td>RPM</td><td><a class="ulink" href="http://rpmfind.net/linux/rpm2html/search.php?query=eject" target="_top">http://rpmfind.net/linux/rpm2html/search.php?query=eject</a></td></tr></tbody></table></div><p>
                  If you can't find a package for your system, install from the package
                  source, using the <span class="quote">&#8220;<span class="quote">upstream</span>&#8221;</span> link.
               </p></dd><dt><span class="term"><span class="command"><strong>mount</strong></span> and <span class="command"><strong>umount</strong></span></span></dt><dd><p>
                  The <span class="command"><strong>mount</strong></span> and <span class="command"><strong>umount</strong></span>
                  commands are used to mount and unmount CD/DVD media after it has
                  been written, in order to run a consistency check.
               </p><div class="informaltable"><table border="1"><colgroup><col><col></colgroup><thead><tr><th>Source</th><th>URL</th></tr></thead><tbody><tr><td>upstream</td><td><a class="ulink" href="https://www.kernel.org/pub/linux/utils/util-linux/" target="_top">https://www.kernel.org/pub/linux/utils/util-linux/</a></td></tr><tr><td>Debian</td><td><a class="ulink" href="http://packages.debian.org/stable/base/mount" target="_top">http://packages.debian.org/stable/base/mount</a></td></tr><tr><td>RPM</td><td><a class="ulink" href="http://rpmfind.net/linux/rpm2html/search.php?query=mount" target="_top">http://rpmfind.net/linux/rpm2html/search.php?query=mount</a></td></tr></tbody></table></div><p>
                  If you can't find a package for your system, install from the package
                  source, using the <span class="quote">&#8220;<span class="quote">upstream</span>&#8221;</span> link.
               </p></dd><dt><span class="term"><span class="command"><strong>grepmail</strong></span></span></dt><dd><p>
                  The <span class="command"><strong>grepmail</strong></span> command is used by the mbox
                  extension to pull out only recent messages from mbox mail
                  folders. 
               </p><div class="informaltable"><table border="1"><colgroup><col><col></colgroup><thead><tr><th>Source</th><th>URL</th></tr></thead><tbody><tr><td>upstream</td><td><a class="ulink" href="http://sourceforge.net/projects/grepmail/" target="_top">http://sourceforge.net/projects/grepmail/</a></td></tr><tr><td>Debian</td><td><a class="ulink" href="http://packages.debian.org/stable/mail/grepmail" target="_top">http://packages.debian.org/stable/mail/grepmail</a></td></tr><tr><td>RPM</td><td><a class="ulink" href="http://rpmfind.net/linux/rpm2html/search.php?query=grepmail" target="_top">http://rpmfind.net/linux/rpm2html/search.php?query=grepmail</a></td></tr></tbody></table></div><p>
                  If you can't find a package for your system, install from the package
                  source, using the <span class="quote">&#8220;<span class="quote">upstream</span>&#8221;</span> link.
               </p></dd><dt><span class="term"><span class="command"><strong>gpg</strong></span></span></dt><dd><p>
                  The <span class="command"><strong>gpg</strong></span> command is used by the encrypt
                  extension to encrypt files.
               </p><div class="informaltable"><table border="1"><colgroup><col><col></colgroup><thead><tr><th>Source</th><th>URL</th></tr></thead><tbody><tr><td>upstream</td><td><a class="ulink" href="https://www.gnupg.org/" target="_top">https://www.gnupg.org/</a></td></tr><tr><td>Debian</td><td><a class="ulink" href="http://packages.debian.org/stable/utils/gnupg" target="_top">http://packages.debian.org/stable/utils/gnupg</a></td></tr><tr><td>RPM</td><td><a class="ulink" href="http://rpmfind.net/linux/rpm2html/search.php?query=gnupg" target="_top">http://rpmfind.net/linux/rpm2html/search.php?query=gnupg</a></td></tr></tbody></table></div><p>
                  If you can't find a package for your system, install from the package
                  source, using the <span class="quote">&#8220;<span class="quote">upstream</span>&#8221;</span> link.
               </p></dd><dt><span class="term"><span class="command"><strong>split</strong></span></span></dt><dd><p>
                  The <span class="command"><strong>split</strong></span> command is used by the split
                  extension to split up large files.
               </p><p>
                  This command is typically part of the core operating system
                  install and is not distributed in a separate package.
               </p></dd><dt><span class="term"><span class="command"><strong>AWS CLI</strong></span></span></dt><dd><p>
                  AWS CLI is Amazon's official command-line tool for interacting
                  with the Amazon Web Services infrastructure.  Cedar Backup uses
                  AWS CLI to copy backup data up to Amazon S3 cloud storage.
               </p><p>
                  After you install AWS CLI, you need to configure your connection
                  to AWS with an appropriate access id and access key. Amazon provides a good 
                  <a class="ulink" href="http://docs.aws.amazon.com/cli/latest/userguide/cli-chap-getting-set-up.html" target="_top">setup guide</a>.
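               </p><p>
                  As a rough sketch, assuming AWS CLI is already installed
                  and that you run these commands as the user that will run
                  Cedar Backup, the initial setup and a quick connectivity
                  check might look like this:
               </p><pre class="screen">
# Prompts for the access key id, secret access key, default region
# and default output format, and stores them for the current user.
aws configure

# Lists the S3 buckets visible to the configured credentials.
aws s3 ls
               </pre><p>
                  If the second command fails, revisit the credentials before
                  relying on the backup configuration; Amazon's guide remains
                  the authoritative reference for this step.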
               </p><div class="informaltable"><table border="1"><colgroup><col><col></colgroup><thead><tr><th>Source</th><th>URL</th></tr></thead><tbody><tr><td>upstream</td><td><a class="ulink" href="http://aws.amazon.com/documentation/cli/" target="_top">http://aws.amazon.com/documentation/cli/</a></td></tr><tr><td>Debian</td><td><a class="ulink" href="https://packages.debian.org/stable/awscli" target="_top">https://packages.debian.org/stable/awscli</a></td></tr></tbody></table></div><p>
                  The initial implementation of the amazons3 extension was written
                  using AWS CLI 1.4.  As of this writing, not all Linux distributions
                  include a package for this version.  On these platforms, the
                  easiest way to install it is via pip: <code class="literal">apt-get install python-pip</code>,
                  and then <code class="literal">pip install awscli</code>.  The Debian package includes
                  an appropriate dependency starting with the jessie release.
               </p></dd><dt><span class="term"><span class="command"><strong>Chardet</strong></span></span></dt><dd><p>
                  The <span class="command"><strong>cback-amazons3-sync</strong></span> command relies on the
                  Chardet Python package to check filename encoding.  You only need
                  this package if you are going to use the sync tool.
               </p><div class="informaltable"><table border="1"><colgroup><col><col></colgroup><thead><tr><th>Source</th><th>URL</th></tr></thead><tbody><tr><td>upstream</td><td><a class="ulink" href="https://github.com/chardet/chardet" target="_top">https://github.com/chardet/chardet</a></td></tr><tr><td>Debian</td><td><a class="ulink" href="https://packages.debian.org/stable/python-chardet" target="_top">https://packages.debian.org/stable/python-chardet</a></td></tr></tbody></table></div></dd></dl></div></div></div><div class="appendix"><div class="titlepage"><div><div><h1 class="title"><a name="cedar-recovering"></a>Appendix C. Data Recovery</h1></div></div></div><div class="toc"><p><b>Table of Contents</b></p><dl class="toc"><dt><span class="sect1"><a href="#cedar-recovering-finding">Finding your Data</a></span></dt><dt><span class="sect1"><a href="#cedar-recovering-filesystem">Recovering Filesystem Data</a></span></dt><dd><dl><dt><span class="sect2"><a href="#cedar-recovering-filesystem-full">Full Restore</a></span></dt><dt><span class="sect2"><a href="#cedar-recovering-filesystem-partial">Partial Restore</a></span></dt></dl></dd><dt><span class="sect1"><a href="#cedar-recovering-mysql">Recovering MySQL Data</a></span></dt><dt><span class="sect1"><a href="#cedar-recovering-subversion">Recovering Subversion Data</a></span></dt><dt><span class="sect1"><a href="#cedar-recovering-mbox">Recovering Mailbox Data</a></span></dt><dt><span class="sect1"><a href="#cedar-recovering-split">Recovering Data split by the Split Extension</a></span></dt></dl></div><div class="sect1"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="cedar-recovering-finding"></a>Finding your Data</h2></div></div></div><p>
         The first step in data recovery is finding the data that you want to
         recover.  You need to decide whether you are going to restore off
         backup media, or out of some existing staging data that has not yet
         been purged.  The only difference is, if you purge staging data less
         frequently than once per week, you might have some data available in
         the staging directories which would not be found on your backup media,
         depending on how you rotate your media.  (And of course, if your
         system is trashed or stolen, you probably will not have access to your
         old staging data in any case.)
      </p><p>
         Regardless of the data source you choose, you will find the data
         organized in the same way.  The remainder of these examples will work
         off an example backup disc, but the contents of the staging directory
         will look pretty much like the contents of the disc, with data
         organized first by date and then by backup peer name.
      </p><p>
         This is the root directory of my example disc:
      </p><pre class="screen">
root:/mnt/cdrw# ls -l
total 4
drwxr-x---  3 backup backup 4096 Sep 01 06:30 2005/
      </pre><p>
         In this root directory is one subdirectory for each year represented
         in the backup.  In this example, the backup represents data entirely
         from the year 2005.  If your configured backup week happens to span a
         year boundary, there would be two subdirectories here (for example,
         one for 2005 and one for 2006).
      </p><p>
         Within each year directory is one subdirectory for each month
         represented in the backup.  
      </p><pre class="screen">
root:/mnt/cdrw/2005# ls -l
total 2
dr-xr-xr-x  6 root root 2048 Sep 11 05:30 09/
      </pre><p>
         In this example, the backup represents data entirely from the month of
         September, 2005.  If your configured backup week happens to span a month
         boundary, there would be two subdirectories here (for example, one for
         August 2005 and one for September 2005).  
      </p><p>
         Within each month directory is one subdirectory for each day represented
         in the backup.
      </p><pre class="screen">
root:/mnt/cdrw/2005/09# ls -l
total 8
dr-xr-xr-x  5 root root 2048 Sep  7 05:30 07/
dr-xr-xr-x  5 root root 2048 Sep  8 05:30 08/
dr-xr-xr-x  5 root root 2048 Sep  9 05:30 09/
dr-xr-xr-x  5 root root 2048 Sep 11 05:30 11/
      </pre><p>
         Depending on how far into the week your backup media was written, you might
         have as few as one daily directory in here, or as many as seven.  
      </p><p>
         Within each daily directory is a stage indicator (indicating when
         the directory was staged) and one directory for each peer configured
         in the backup:
      </p><pre class="screen">
root:/mnt/cdrw/2005/09/07# ls -l
total 10
dr-xr-xr-x  2 root root 2048 Sep  7 02:31 host1/
-r--r--r--  1 root root    0 Sep  7 03:27 cback.stage
dr-xr-xr-x  2 root root 4096 Sep  7 02:30 host2/
dr-xr-xr-x  2 root root 4096 Sep  7 03:23 host3/
      </pre><p>
         In this case, you can see that my backup includes three machines, and
         that the backup data was staged on September 7, 2005 at 03:27.
      </p><p>
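         The layout above can be walked with a short shell loop.  The
         following is only a sketch: <code class="literal">list_days</code>
         is a hypothetical helper, not part of Cedar Backup, and it assumes
         a media or staging root organized by year, month and day as in the
         listings above.  It reports whether each daily directory contains
         the <code class="filename">cback.stage</code> indicator.
      </p><pre class="screen">
# Hypothetical helper (not part of Cedar Backup): list every daily
# directory under a media or staging root, noting whether the
# cback.stage indicator file is present.
list_days() {
   root="$1"                          # e.g. /mnt/cdrw
   for day in "$root"/*/*/*/; do      # year/month/day
      [ -d "$day" ] || continue       # skip an unmatched pattern
      if [ -f "${day}cback.stage" ]; then
         echo "${day} staged"
      else
         echo "${day} not staged"
      fi
   done
}
      </pre><p>
         For the example disc, <code class="literal">list_days /mnt/cdrw</code>
         would print one line for each daily directory on the media.
      </p><p>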
         Within the directory for a given host are all of the files collected
         on that host.  This might just include tarfiles from a normal Cedar
         Backup collect run, and might also include files
         <span class="quote">&#8220;<span class="quote">collected</span>&#8221;</span> from Cedar Backup extensions or by other
         third-party processes on your system.
      </p><pre class="screen">
root:/mnt/cdrw/2005/09/07/host1# ls -l
total 157976
-r--r--r--  1 root root 11206159 Sep  7 02:30 boot.tar.bz2
-r--r--r--  1 root root        0 Sep  7 02:30 cback.collect
-r--r--r--  1 root root     3199 Sep  7 02:30 dpkg-selections.txt.bz2
-r--r--r--  1 root root   908325 Sep  7 02:30 etc.tar.bz2
-r--r--r--  1 root root      389 Sep  7 02:30 fdisk-l.txt.bz2
-r--r--r--  1 root root  1003100 Sep  7 02:30 ls-laR.txt.bz2
-r--r--r--  1 root root    19800 Sep  7 02:30 mysqldump.txt.bz2
-r--r--r--  1 root root  4133372 Sep  7 02:30 opt-local.tar.bz2
-r--r--r--  1 root root 44794124 Sep  8 23:34 opt-public.tar.bz2
-r--r--r--  1 root root 30028057 Sep  7 02:30 root.tar.bz2
-r--r--r--  1 root root  4747070 Sep  7 02:30 svndump-0:782-opt-svn-repo1.txt.bz2
-r--r--r--  1 root root   603863 Sep  7 02:30 svndump-0:136-opt-svn-repo2.txt.bz2
-r--r--r--  1 root root   113484 Sep  7 02:30 var-lib-jspwiki.tar.bz2
-r--r--r--  1 root root 19556660 Sep  7 02:30 var-log.tar.bz2
-r--r--r--  1 root root 14753855 Sep  7 02:30 var-mail.tar.bz2
         </pre><p>
         As you can see, I back up a variety of different things on host1.  I run
         the normal collect action, as well as the sysinfo, mysql and
         subversion extensions.   The resulting backup files are named in a way
         that makes it easy to determine what they represent.
      </p><p>
         Files of the form <code class="filename">*.tar.bz2</code> represent directories
         backed up by the collect action.  The first part of the name (before
         <span class="quote">&#8220;<span class="quote">.tar.bz2</span>&#8221;</span>) represents the path to the directory.  For
         example, <code class="filename">boot.tar.bz2</code> contains data from
         <code class="filename">/boot</code>, and
         <code class="filename">var-lib-jspwiki.tar.bz2</code> contains data from
         <code class="filename">/var/lib/jspwiki</code>.
      </p><p>
         The <code class="filename">fdisk-l.txt.bz2</code>,
         <code class="filename">ls-laR.txt.bz2</code> and
         <code class="filename">dpkg-selections.txt.bz2</code> files are produced by the
         sysinfo extension.
      </p><p>
         The <code class="filename">mysqldump.txt.bz2</code> file is produced by the
         mysql extension.  It represents a system-wide database dump, because I
         use the <span class="quote">&#8220;<span class="quote">all</span>&#8221;</span> flag in configuration.  If I were to
         configure Cedar Backup to dump individual databases, then the filename
         would contain the database name (something like
         <code class="filename">mysqldump-bugs.txt.bz2</code>).
      </p><p>
         Finally, the files of the form <code class="filename">svndump-*.txt.bz2</code>
         are produced by the subversion extension.  There is one dump file for
         each configured repository, and the dump file name represents the name
         of the repository and the revisions in that dump.  So, the file
         <code class="filename">svndump-0:782-opt-svn-repo1.txt.bz2</code>
         represents revisions 0-782 of the repository at
         <code class="filename">/opt/svn/repo1</code>.  You can tell that this
         file contains a full backup of the repository to this point, because
         the starting revision is zero.  Later incremental backups would have a
         non-zero starting revision, for example 783-785, followed by 786-800,
         etc.
      </p></div><div class="sect1"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="cedar-recovering-filesystem"></a>Recovering Filesystem Data</h2></div></div></div><p>
         Filesystem data is gathered by the standard Cedar Backup collect
         action.  This data is placed into files of the form
         <code class="filename">*.tar</code>.  The first part of the name (before
         <span class="quote">&#8220;<span class="quote">.tar</span>&#8221;</span>) represents the path to the directory.  For
         example, <code class="filename">boot.tar</code> would contain data from
         <code class="filename">/boot</code>, and
         <code class="filename">var-lib-jspwiki.tar</code> would contain data from
         <code class="filename">/var/lib/jspwiki</code>.  (As a special case, data from
         the root directory would be placed in <code class="filename">-.tar</code>).
         Remember that your tarfile might have a bzip2
         (<code class="filename">.bz2</code>) or gzip (<code class="filename">.gz</code>)
         extension, depending on what compression you specified in
         configuration.
      </p><p>
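         The naming rule above can be reversed with a few lines of shell.
         This is only a sketch: <code class="literal">archive_to_path</code>
         is a hypothetical helper, not something Cedar Backup provides, and
         it assumes that every hyphen in the archive name stands for a path
         separator.
      </p><pre class="screen">
# Hypothetical helper (not part of Cedar Backup): guess the source
# directory from the name of a collect archive.
archive_to_path() {
   base="${1%%.tar*}"      # strip .tar, .tar.gz or .tar.bz2
   if [ "$base" = "-" ]; then
      echo "/"             # special case: the root directory
   else
      # Assumption: each hyphen stands for a path separator.
      echo "/$(printf '%s\n' "$base" | sed 's|-|/|g')"
   fi
}
      </pre><p>
         For example, <code class="literal">archive_to_path var-lib-jspwiki.tar.bz2</code>
         prints <code class="filename">/var/lib/jspwiki</code>.  The guess is
         wrong for directories whose own names contain a hyphen, so treat
         the result as a hint and check it against your configuration.
      </p><p>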
         If you are using full backups every day, the latest backup data is
         always within the latest daily directory stored on your backup media or
         within your staging directory.  If you have some or all of your
         directories configured to do incremental backups, then the first day
         of the week holds the full backups and the other days represent
         incremental differences relative to that first day of the week.
      </p><div class="sidebar"><div class="titlepage"><div><div><p class="title"><b>Where to extract your backup</b></p></div></div></div><p>
            If you are restoring a home directory or some other non-system
            directory as part of a full restore, it is probably fine to extract
            the backup directly into the filesystem.  
         </p><p>
            If you are restoring a system directory like
            <code class="filename">/etc</code> as part of a full restore, extracting
            directly into the filesystem is likely to break things, especially
            if you re-installed a newer version of your operating system than
            the one you originally backed up.  It's better to extract
            directories like this to a temporary location and pick out only the
            files you find you need.
         </p><p>
            When doing a partial restore, I suggest <span class="emphasis"><em>always</em></span>
            extracting to a temporary location.  Doing it this way gives you
            more control over what you restore, and helps you avoid compounding
            your original problem with another one (like overwriting the wrong
            file, oops).
         </p></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="cedar-recovering-filesystem-full"></a>Full Restore</h3></div></div></div><p>
            To do a full system restore, find the newest applicable full backup
            and extract it.  If you have some incremental backups, extract them
            into the same place as the full backup, one by one starting from
            oldest to newest.  (This way, if a file changed every day you will
            always get the latest one.)
         </p><p>
            All of the backed-up files are stored in the tar file in a relative
            fashion, so you can extract from the tar file either directly into
            the filesystem, or into a temporary location.   
         </p><p>
            For example, to restore <code class="filename">boot.tar.bz2</code> directly
            into <code class="filename">/boot</code>, execute <span class="command"><strong>tar</strong></span>
            from your root directory (<code class="filename">/</code>):
         </p><pre class="screen">
root:/# bzcat boot.tar.bz2 | tar xvf -
         </pre><p>
            Of course, use <span class="command"><strong>zcat</strong></span> or just <span class="command"><strong>cat</strong></span>,
            depending on what kind of compression is in use.
         </p><p>
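            If your backups mix compression types, the choice of command
            can be made from the file extension.  The following is a sketch
            only; <code class="literal">cat_backup</code> is a hypothetical
            helper name, and it assumes the <code class="filename">.bz2</code>
            and <code class="filename">.gz</code> extensions described
            earlier.
         </p><pre class="screen">
# Hypothetical helper (not part of Cedar Backup): write a backup file
# to standard output, choosing the decompressor from the extension.
cat_backup() {
   case "$1" in
      *.bz2) bzcat "$1" ;;
      *.gz)  zcat "$1" ;;
      *)     cat "$1" ;;     # uncompressed
   esac
}
         </pre><p>
            With this in place, <code class="literal">cat_backup boot.tar.bz2 | tar xvf -</code>
            behaves like the command shown above, whatever compression was
            configured.
         </p><p>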
            If you want to extract <code class="filename">boot.tar.bz2</code> into a
            temporary location like <code class="filename">/tmp/boot</code> instead,
            just change directories first.  In this case, you'd execute the
            <span class="command"><strong>tar</strong></span> command from within
            <code class="filename">/tmp</code> instead of <code class="filename">/</code>.
         </p><pre class="screen">
root:/tmp# bzcat boot.tar.bz2 | tar xvf -
         </pre><p>
            Again, use <span class="command"><strong>zcat</strong></span> or just <span class="command"><strong>cat</strong></span> as
            appropriate.
         </p><p>
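            When incremental backups are involved, the oldest-to-newest
            extraction described at the start of this section can be
            scripted.  This is a sketch under stated assumptions:
            <code class="literal">restore_month</code> is a hypothetical
            helper, the archives are bzip2-compressed, and the month
            directory contains only the days of the backup week you are
            restoring (as on the example disc).
         </p><pre class="screen">
# Hypothetical helper (not part of Cedar Backup): extract one archive
# from every daily directory of a month, oldest first, into the
# current directory.  Assumes bzip2 compression.
restore_month() {
   month_dir="$1"     # e.g. /mnt/cdrw/2005/09
   peer="$2"          # e.g. host1
   archive="$3"       # e.g. etc.tar.bz2
   # The shell expands the pattern in sorted order, so the days are
   # visited as 07, 08, 09, 11: oldest to newest.
   for day in "$month_dir"/*/; do
      file="${day}${peer}/${archive}"
      if [ -f "$file" ]; then
         echo "extracting $file"
         bzcat "$file" | tar xf -
      fi
   done
}
         </pre><p>
            Change into the target directory first, then run something
            like <code class="literal">restore_month /mnt/cdrw/2005/09 host1 etc.tar.bz2</code>.
            If the backup week spans a month boundary, run it once per
            month directory, older month first.
         </p><p>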
            For more information, you might want to check out the manpage or
            GNU info documentation for the <span class="command"><strong>tar</strong></span> command.
         </p></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a name="cedar-recovering-filesystem-partial"></a>Partial Restore</h3></div></div></div><p>
            Most users will need to do a partial restore much more frequently
            than a full restore.  Perhaps you accidentally removed your home
            directory, or forgot to check in some version of a file before
            deleting it.  Or, perhaps the person who packaged Apache for your
            system blew away your web server configuration on upgrade (it
            happens).  The solution to these and other kinds of problems is a
            partial restore (assuming you've backed up the proper things).
         </p><p>
            The procedure is similar to a full restore.  The specific steps
            depend on how much information you have about the file you are
            looking for.  Whereas with a full restore you can confidently
            extract the full backup followed by each of the incremental
            backups, this might not be what you want when doing a partial
            restore.  You may need to take more care in finding the right
            version of a file &#8212; since the same file, if changed frequently,
            would appear in more than one backup.
         </p><p>
            Start by finding the backup media that contains the file you are
            looking for.  If you rotate your backup media, and your last known
            <span class="quote">&#8220;<span class="quote">contact</span>&#8221;</span> with the file was a while ago, you may need
            to look on older media to find it.  This may take some effort if
            you are not sure when the change you are trying to correct took
            place.
         </p><p>
            Once you have decided to look at a particular piece of backup media, 
            find the correct peer (host), and look for the file in the full backup:
         </p><pre class="screen">
root:/tmp# bzcat boot.tar.bz2 | tar tvf - path/to/file
         </pre><p>
            Of course, use <span class="command"><strong>zcat</strong></span> or just <span class="command"><strong>cat</strong></span>,
            depending on what kind of compression is in use.
         </p><p>
            The <code class="option">tvf</code> tells <span class="command"><strong>tar</strong></span> to search for
            the file in question and just list the results rather than
            extracting the file.  Note that the filename is relative (with no
            starting <code class="literal">/</code>).  Alternately, you can omit the
            <code class="filename">path/to/file</code> and search through the output
            using <span class="command"><strong>more</strong></span> or <span class="command"><strong>less</strong></span>.
         </p><p>
            If you haven't found what you are looking for, work your way through the
            incremental files for the directory in question.  One of them may also
            have the file if it changed during the course of the backup.  Or, move
            to older or newer media and see if you can find the file there.
         </p><p>
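            Checking each daily archive by hand gets tedious, so the search
            can be wrapped in a loop.  The sketch below makes the same
            assumptions as the examples above (bzip2 compression, relative
            paths inside the archive); <code class="literal">find_in_backups</code>
            is a hypothetical helper name, not a Cedar Backup command.
         </p><pre class="screen">
# Hypothetical helper (not part of Cedar Backup): print the daily
# archives under a month directory that contain a given path.
find_in_backups() {
   month_dir="$1"     # e.g. /mnt/cdrw/2005/09
   peer="$2"          # e.g. host1
   archive="$3"       # e.g. etc.tar.bz2
   wanted="$4"        # relative path, e.g. etc/hosts
   for day in "$month_dir"/*/; do
      file="${day}${peer}/${archive}"
      [ -f "$file" ] || continue
      # -x -F: match the whole listing line as a literal string
      if bzcat "$file" | tar tf - | grep -q -x -F "$wanted"; then
         echo "$file"
      fi
   done
}
         </pre><p>
            Each line of output names an archive holding some version of
            the file; later days generally hold later versions.
         </p><p>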
            Once you have found your file, extract it using <code class="option">xvf</code>:
         </p><pre class="screen">
root:/tmp# bzcat boot.tar.bz2 | tar xvf - path/to/file
         </pre><p>
            Again, use <span class="command"><strong>zcat</strong></span> or just <span class="command"><strong>cat</strong></span> as
            appropriate.
         </p><p>
            Inspect the file and make sure it's what you're looking for.
            Again, you may need to move to older or newer media to find the
            exact version of your file.
         </p><p>
            For more information, you might want to check out the manpage or
            GNU info documentation for the <span class="command"><strong>tar</strong></span> command.
         </p></div></div><div class="sect1"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="cedar-recovering-mysql"></a>Recovering MySQL Data</h2></div></div></div><p>
         MySQL data is gathered by the Cedar Backup mysql extension.  This
         extension always creates a full backup each time it runs.  This wastes
         some space, but makes it easy to restore database data.  The following
         procedure describes how to restore your MySQL database from the
         backup.
      </p><div class="warning" style="margin-left: 0.5in; margin-right: 0.5in;"><h3 class="title">Warning</h3><p>
            I am not a MySQL expert.  I am providing this information for
            reference.  I have tested these procedures on my own MySQL
            installation; however, I only have a single database for use by
            Bugzilla, and I may have misunderstood something with regard to
            restoring individual databases as a user other than root.  If you
            have any doubts, test the procedure below before relying on it!
         </p><p>
            MySQL experts and/or knowledgeable Cedar Backup users: feel free to
            write me and correct any part of this procedure.
         </p></div><p>
         First, find the backup you are interested in.  If you have specified
         <span class="quote">&#8220;<span class="quote">all databases</span>&#8221;</span> in configuration, you will have a single
         backup file, called <code class="filename">mysqldump.txt</code>.  If you have
         specified individual databases in configuration, then you will have
         files with names like <code class="filename">mysqldump-database.txt</code>
         instead.  In either case, your file might have a
         <code class="filename">.gz</code> or <code class="filename">.bz2</code> extension
         depending on what kind of compression you specified in configuration.
      </p><p>
         If you are restoring an <span class="quote">&#8220;<span class="quote">all databases</span>&#8221;</span> backup, make sure
         that you have correctly created the root user and know its password.
         Then, execute:
      </p><pre class="screen">
daystrom:/# bzcat mysqldump.txt.bz2 | mysql -p -u root
      </pre><p>
         Of course, use <span class="command"><strong>zcat</strong></span> or just <span class="command"><strong>cat</strong></span>,
         depending on what kind of compression is in use.
      </p><p>
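         The choice among <span class="command"><strong>bzcat</strong></span>, <span class="command"><strong>zcat</strong></span>
         and <span class="command"><strong>cat</strong></span> can be made from the file
         extension.  The following is only a sketch: the function name
         <code class="literal">pick_cat</code> is hypothetical and is not part of
         Cedar Backup.
      </p>

```shell
#!/bin/sh
# Hypothetical helper (not part of Cedar Backup): print the command that
# streams a backup file to stdout, chosen by the file's extension.
pick_cat() {
    case "$1" in
        *.bz2) echo bzcat ;;
        *.gz)  echo zcat ;;
        *)     echo cat ;;
    esac
}

# Example use (not executed here):
#   f=mysqldump.txt.bz2
#   "$(pick_cat "$f")" "$f" | mysql -p -u root
```

<p>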
         Because the database backup includes <code class="literal">CREATE
         DATABASE</code> SQL statements, this command should take care of
         creating all of the databases within the backup, as well as populating
         them.  
      </p><p>
         If you are restoring a backup for a specific database, you have two
         choices.  If you have a root login, you can use the same command
         as above:
      </p><pre class="screen">
daystrom:/# bzcat mysqldump-database.txt.bz2 | mysql -p -u root
      </pre><p>
         Otherwise, you can create the database and its login first (or have
         someone create it) and then use a database-specific login to execute
         the restore:
      </p><pre class="screen">
daystrom:/# bzcat mysqldump-database.txt.bz2 | mysql -p -u user database
      </pre><p>
         Again, use <span class="command"><strong>zcat</strong></span> or just <span class="command"><strong>cat</strong></span> as
         appropriate.
      </p><p>
         For more information on using MySQL, see the documentation on the
         MySQL web site, <a class="ulink" href="http://mysql.org/" target="_top">http://mysql.org/</a>, or the manpages
         for the <span class="command"><strong>mysql</strong></span> and <span class="command"><strong>mysqldump</strong></span>
         commands.
      </p></div><div class="sect1"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="cedar-recovering-subversion"></a>Recovering Subversion Data</h2></div></div></div><p>
         Subversion data is gathered by the Cedar Backup subversion extension.
         Cedar Backup will create either full or incremental backups, but the
         procedure for restoring is the same for both.  Subversion backups
         are always taken on a per-repository basis.  If you need to restore
         more than one repository, follow the procedures below for each repository
         you are interested in.
      </p><p>
         First, find the backup or backups you are interested in.  Typically,
         you will need the full backup from the first day of the week and each
         incremental backup from the other days of the week.  
      </p><p>
         The subversion extension creates files of the form
         <code class="filename">svndump-*.txt</code>.  These files might have a
         <code class="filename">.gz</code> or <code class="filename">.bz2</code> extension
         depending on what kind of compression you specified in configuration.
         There is one dump file for each configured repository, and the dump
         file name represents the name of the repository and the revisions in
         that dump.  So, the file
         <code class="filename">svndump-0:782-opt-svn-repo1.txt.bz2</code>
         represents revisions 0-782 of the repository at
         <code class="filename">/opt/svn/repo1</code>.  You can tell that this
         file contains a full backup of the repository to this point, because
         the starting revision is zero.  Later incremental backups would have a
         non-zero starting revision, for example 783-785, followed by 786-800,
         etc.
      </p><p>
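         As an illustration of the naming convention just described, the
         revision range can be pulled out of a dump file name with plain
         shell parameter expansion.  This is a sketch only: the function name
         <code class="literal">parse_svndump</code> is hypothetical, and the
         repository part is printed as it appears in the file name rather than
         converted back into a path.
      </p>

```shell
#!/bin/sh
# Hypothetical helper: split a name such as
#   svndump-0:782-opt-svn-repo1.txt.bz2
# into "start end repo-part".
parse_svndump() {
    name=${1##*/}            # drop any leading directory
    name=${name#svndump-}    # drop the fixed prefix
    range=${name%%-*}        # e.g. "0:782"
    rest=${name#*-}          # e.g. "opt-svn-repo1.txt.bz2"
    start=${range%%:*}       # first revision in the dump
    end=${range#*:}          # last revision in the dump
    repo=${rest%%.txt*}      # strip ".txt" and any compression suffix
    echo "$start $end $repo"
}

# A starting revision of 0 indicates a full backup.
parse_svndump svndump-0:782-opt-svn-repo1.txt.bz2
```

<p>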
         Next, if you still have the old Subversion repository around, you
         might want to just move it off (rename the top-level directory) before
         executing the restore.  Or, you can restore into a temporary directory
         and rename it later to its real name once you've checked it out.  That
         is what my example below will show.
      </p><p>
         Next, you need to create a new Subversion repository to hold the
         restored data.  This example shows an FSFS repository, but that is an
         arbitrary choice. You can restore from an FSFS backup into an FSFS
         repository or a BDB repository.  The Subversion dump format is
         <span class="quote">&#8220;<span class="quote">backend-agnostic</span>&#8221;</span>.
      </p><pre class="screen">
root:/tmp# svnadmin create --fs-type=fsfs testrepo
      </pre><p>
         Next, load the full backup into the repository:
      </p><pre class="screen">
root:/tmp# bzcat svndump-0:782-opt-svn-repo1.txt.bz2 | svnadmin load testrepo
      </pre><p>
         Of course, use <span class="command"><strong>zcat</strong></span> or just <span class="command"><strong>cat</strong></span>,
         depending on what kind of compression is in use.
      </p><p>
         Follow that with loads for each of the incremental backups:
      </p><pre class="screen">
root:/tmp# bzcat svndump-783:785-opt-svn-repo1.txt.bz2 | svnadmin load testrepo
root:/tmp# bzcat svndump-786:800-opt-svn-repo1.txt.bz2 | svnadmin load testrepo
      </pre><p>
         Again, use <span class="command"><strong>zcat</strong></span> or just <span class="command"><strong>cat</strong></span> as
         appropriate.
      </p><p>
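         The loads have to run in revision order, full backup first.  When
         there are many incremental files, the order can be derived from the
         starting revision in each file name.  The sketch below assumes file
         names of the form shown above and contain no tabs or newlines; the
         function name <code class="literal">order_dumps</code> is hypothetical.
      </p>

```shell
#!/bin/sh
# Hypothetical helper: print the given svndump-* files one per line,
# sorted numerically by the starting revision in each name.
order_dumps() {
    for f in "$@"; do
        n=${f##*/}           # drop any leading directory
        n=${n#svndump-}      # drop the fixed prefix
        # Emit "start-revision <TAB> filename" so sort can order by number.
        printf '%s\t%s\n' "${n%%:*}" "$f"
    done | sort -n | cut -f2-
}

# Example use (not executed here):
#   order_dumps svndump-*-opt-svn-repo1.txt.bz2 | while read -r f; do
#       bzcat "$f" | svnadmin load testrepo
#   done
```

<p>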
         When this is done, your repository will be restored to the point of
         the last commit indicated in the svndump file (in this case, to
         revision 800). 
      </p><div class="note" style="margin-left: 0.5in; margin-right: 0.5in;"><h3 class="title">Note</h3><p>
            Don't be surprised if, when you test this, the restored directory
            doesn't have exactly the same contents as the original directory.  I can't
            explain why this happens, but if you execute <span class="command"><strong>svnadmin dump</strong></span>
            on both old and new repositories, the results are identical.  This means that
            the repositories do contain the same content.
         </p></div><p>
         For more information on using Subversion, see the book
         <em class="citetitle">Version Control with Subversion</em> 
         (<a class="ulink" href="http://svnbook.red-bean.com/" target="_top">http://svnbook.red-bean.com/</a>) or the 
         <em class="citetitle">Subversion FAQ</em> 
         (<a class="ulink" href="http://subversion.tigris.org/faq.html" target="_top">http://subversion.tigris.org/faq.html</a>).
      </p></div><div class="sect1"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="cedar-recovering-mbox"></a>Recovering Mailbox Data</h2></div></div></div><p>
         Mailbox data is gathered by the Cedar Backup mbox extension.
         Cedar Backup will create either full or incremental backups, but
         both kinds of backups are treated identically when restoring. 
      </p><p>
         Individual mbox files and mbox directories are treated a little
         differently, since individual files are just compressed, but
         directories are collected into a tar archive.
      </p><p>
         First, find the backup or backups you are interested in.  Typically,
         you will need the full backup from the first day of the week and each
         incremental backup from the other days of the week.  
      </p><p>
         The mbox extension creates files of the form
         <code class="filename">mbox-*</code>.  Backup files for individual mbox files might have a
         <code class="filename">.gz</code> or <code class="filename">.bz2</code> extension
         depending on what kind of compression you specified in configuration.
         Backup files for mbox directories will have a <code class="filename">.tar</code>,
         <code class="filename">.tar.gz</code> or <code class="filename">.tar.bz2</code> extension,
         again depending on what kind of compression you specified in configuration.
      </p><p>
         There is one backup file for each configured mbox file or directory.
         The backup file name represents the name of the file or directory and
         the date it was backed up.  So, the file
         <code class="filename">mbox-20060624-home-user-mail-greylist</code> represents
         the backup for <code class="filename">/home/user/mail/greylist</code> run on 24
         Jun 2006.  Likewise,
         <code class="filename">mbox-20060624-home-user-mail.tar</code> represents the
         backup for the <code class="filename">/home/user/mail</code> directory run on
         that same date.
      </p><p>
         Once you have found the files you are looking for, the restoration
         procedure is fairly simple.  First, concatenate all of the backup
         files together.  Then, use grepmail to eliminate duplicate messages
         (if any).  
      </p><p>
         Here is an example for a single backed-up file:
      </p><pre class="screen">
root:/tmp# rm -f restore.mbox # make sure it's not left over
root:/tmp# cat mbox-20060624-home-user-mail-greylist &gt;&gt; restore.mbox
root:/tmp# cat mbox-20060625-home-user-mail-greylist &gt;&gt; restore.mbox
root:/tmp# cat mbox-20060626-home-user-mail-greylist &gt;&gt; restore.mbox
root:/tmp# grepmail -a -u restore.mbox &gt; nodups.mbox
      </pre><p>
         At this point, <code class="filename">nodups.mbox</code> contains all of the
         backed-up messages from <code class="filename">/home/user/mail/greylist</code>.
      </p><p>
         Of course, if your backups are compressed, you'll have to use
         <span class="command"><strong>zcat</strong></span> or <span class="command"><strong>bzcat</strong></span> rather than just
         <span class="command"><strong>cat</strong></span>.
      </p><p>
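         The concatenation step can also be written as a loop that picks the
         right command for each file.  This is a sketch under the naming
         assumptions above; the function name
         <code class="literal">concat_backups</code> is hypothetical.  Because
         the date in each file name is written as year, month, day, a shell
         glob such as <code class="filename">mbox-*-home-user-mail-greylist*</code>
         lists the files in chronological order.
      </p>

```shell
#!/bin/sh
# Hypothetical helper: concat_backups OUTPUT FILE...
# Writes the (decompressed) contents of each FILE to OUTPUT, in the
# order given, starting from an empty OUTPUT file.
concat_backups() {
    out=$1
    shift
    : > "$out"                      # truncate or create the output file
    for f in "$@"; do
        case "$f" in
            *.bz2) bzcat "$f" ;;
            *.gz)  zcat "$f" ;;
            *)     cat "$f" ;;
        esac >> "$out" || return 1  # stop at the first unreadable file
    done
}

# Example use (not executed here):
#   concat_backups restore.mbox mbox-*-home-user-mail-greylist*
#   grepmail -a -u restore.mbox > nodups.mbox
```

<p>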
         If you are backing up mbox directories rather than individual files,
         see the filesystem instructions for notes on how to extract the
         individual files from inside tar archives.  Extract the files you are
         interested in, and then concatenate them together just as shown
         above for the individual case.
      </p></div><div class="sect1"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="cedar-recovering-split"></a>Recovering Data split by the Split Extension</h2></div></div></div><p>
         The Split extension takes large files and splits them up into smaller
         files. Typically, it would be used in conjunction with the
         <span class="command"><strong>cback-span</strong></span> command.
      </p><p>
         The split-up files are not difficult to work with.  Simply find
         all of the files &#8212; which could be split between multiple
         discs &#8212; and concatenate them together.  
      </p><pre class="screen">
root:/tmp# rm -f usr-src-software.tar.gz  # make sure it's not there
root:/tmp# cat usr-src-software.tar.gz_00001 &gt;&gt; usr-src-software.tar.gz
root:/tmp# cat usr-src-software.tar.gz_00002 &gt;&gt; usr-src-software.tar.gz
root:/tmp# cat usr-src-software.tar.gz_00003 &gt;&gt; usr-src-software.tar.gz
      </pre><p>
         Then, use the resulting file as usual.
      </p><p>
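         The concatenation shown above can be wrapped in a loop that also
         notices a gap in the chunk numbering.  This is only a sketch: it
         assumes the five-digit <code class="filename">_00001</code> suffix
         style from the example, and the function name
         <code class="literal">join_chunks</code> is hypothetical.  It cannot
         detect that the final chunk or chunks are missing, since nothing in
         the file names records how many chunks there should be.
      </p>

```shell
#!/bin/sh
# Hypothetical helper: join_chunks BASE
# Concatenates BASE_00001, BASE_00002, ... into BASE and fails if a
# chunk with a higher number exists beyond the first missing one.
join_chunks() {
    base=$1
    i=1
    : > "$base"                          # start from an empty file
    while :; do
        chunk=$(printf '%s_%05d' "$base" "$i")
        [ -f "$chunk" ] || break         # stop at the first missing number
        cat "$chunk" >> "$base" || return 1
        i=$((i + 1))
    done
    # Count every chunk present; it must equal the number joined.
    total=0
    for f in "$base"_[0-9]*; do
        [ -f "$f" ] && total=$((total + 1))
    done
    [ "$total" -eq $((i - 1)) ]
}

# Example use (not executed here):
#   join_chunks usr-src-software.tar.gz || echo "chunk sequence has a gap"
```

<p>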
         Remember, you need to have <span class="emphasis"><em>all</em></span> of the files that
         the original large file was split into before this will work.  If you
         are missing a file, the result of the concatenation step will be
         either a corrupt file or a truncated file (depending on which chunks
         you did not include).
      </p></div></div><div class="appendix"><div class="titlepage"><div><div><h1 class="title"><a name="cedar-securingssh"></a>Appendix D. Securing Password-less SSH Connections</h1></div></div></div><div class="simplesect"><div class="titlepage"></div><p>
         Cedar Backup relies on password-less public key SSH connections to
         make various parts of its backup process work.  Password-less
         <span class="command"><strong>scp</strong></span> is used to stage files from remote clients to
         the master, and password-less <span class="command"><strong>ssh</strong></span> is used to
         execute actions on managed clients.  
      </p><p>
         Normally, it is a good idea to avoid password-less SSH connections in
         favor of using an SSH agent.  The SSH agent manages your SSH
         connections so that you don't need to type your passphrase over and
         over.  You get most of the benefits of a password-less connection
         without the risk.  Unfortunately, because Cedar Backup has to execute
         without human involvement (through a cron job), use of an agent really
         isn't feasible.  We have to rely on true password-less public keys to
         give the master access to the client peers.
      </p><p>
         Traditionally, Cedar Backup has relied on a <span class="quote">&#8220;<span class="quote">segmenting</span>&#8221;</span>
         strategy to minimize the risk.  Although the backup typically runs as
         root &#8212; so that all parts of the filesystem can be backed up
         &#8212; we don't use the root user for network connections.  Instead,
         we use a dedicated backup user on the master to initiate network
         connections, and dedicated users on each of the remote peers to accept
         network connections.
      </p><p>
         With this strategy in place, an attacker with access to the backup
         user on the master (or even root access, really) can at best only get
         access to the backup user on the remote peers.  We still concede a
         local attack vector, but at least that vector is restricted to an
         unprivileged user.
      </p><p>
         Some Cedar Backup users may not be comfortable with this risk, and
         others may not be able to implement the segmentation strategy &#8212;
         they simply may not have a way to create a login which is only used
         for backups.
      </p><p>
         So, what are these users to do?  Fortunately there is a solution.
         The SSH authorized keys file supports a way to put a <span class="quote">&#8220;<span class="quote">filter</span>&#8221;</span>
         in place on an SSH connection.  This excerpt is from the AUTHORIZED_KEYS FILE FORMAT
         section of man 8 sshd:
      </p><pre class="screen">
command="command"
   Specifies that the command is executed whenever this key is used for
   authentication.  The command supplied by the user (if any) is ignored.  The
   command is run on a pty if the client requests a pty; otherwise it is run
   without a tty.  If an 8-bit clean channel is required, one must not request
   a pty or should specify no-pty.  A quote may be included in the command by
   quoting it with a backslash.  This option might be useful to restrict
   certain public keys to perform just a specific operation.  An example might
   be a key that permits remote backups but nothing else.  Note that the client
   may specify TCP and/or X11 forwarding unless they are explicitly prohibited.
   Note that this option applies to shell, command or subsystem execution.
      </pre><p>
         Essentially, this gives us a way to validate the commands that are
         being executed.  We can either accept or reject commands, and we can
         even provide a readable error message for commands we reject.  The
         filter is applied on the remote peer, to the key that provides the
         master access to the remote peer.  
      </p><p>
         So, let's imagine that we have two hosts: master
         <span class="quote">&#8220;<span class="quote">mickey</span>&#8221;</span>, and peer <span class="quote">&#8220;<span class="quote">minnie</span>&#8221;</span>.  Here is the
         original <code class="filename">~/.ssh/authorized_keys</code> file for the
         backup user on minnie (remember, this is all on one line in the file):
      </p><pre class="screen">
ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAIEAxw7EnqVULBFgPcut3WYp3MsSpVB9q9iZ+awek120391k;mm0c221=3=km
=m=askdalkS82mlF7SusBTcXiCk1BGsg7axZ2sclgK+FfWV1Jm0/I9yo9FtAZ9U+MmpL901231asdkl;ai1-923ma9s=9=
1-2341=-a0sd=-sa0=1z= backup@mickey
      </pre><p>
         This line is the public key that minnie can use to identify the backup
         user on mickey.  Assuming that there is no passphrase on the private
         key back on mickey, the backup user on mickey can get direct access to
         minnie.
      </p><p>
         To put the filter in place, we add a command option to the key,
         like this:
      </p><pre class="screen">
command="/opt/backup/validate-backup" ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAIEAxw7EnqVULBFgPcut3WYp
3MsSpVB9q9iZ+awek120391k;mm0c221=3=km=m=askdalkS82mlF7SusBTcXiCk1BGsg7axZ2sclgK+FfWV1Jm0/I9yo9F
tAZ9U+MmpL901231asdkl;ai1-923ma9s=9=1-2341=-a0sd=-sa0=1z= backup@mickey
      </pre><p>
         Basically, the command option says that whenever this key is used
         to successfully initiate a connection, the
         <span class="command"><strong>/opt/backup/validate-backup</strong></span> command will be run
         <span class="emphasis"><em>instead of</em></span> the real command that came over the
         SSH connection.  Fortunately, the interface gives the command access
         to certain shell variables that can be used to invoke the original
         command if you want to.
      </p><p>
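         The <span class="command"><strong>sshd</strong></span> excerpt above notes
         that forwarding stays available unless it is explicitly prohibited.
         As a sketch, the same entry can carry the standard
         <code class="filename">authorized_keys</code> options that switch
         forwarding and pty allocation off.  The key body is abbreviated with
         <code class="literal">...</code> here purely for readability; in the
         real file it is the full key, and the whole entry is still a single
         line.
      </p>

```text
command="/opt/backup/validate-backup",no-port-forwarding,no-X11-forwarding,no-agent-forwarding,no-pty ssh-rsa AAAAB3NzaC1yc2E... backup@mickey
```

<p>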
         A very basic <span class="command"><strong>validate-backup</strong></span> script might look
         something like this:
      </p><pre class="screen">
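#!/bin/bash
# Run the requested command only if it exactly matches the one we allow.
if [[ "${SSH_ORIGINAL_COMMAND}" == "ls -l" ]] ; then
    # Deliberately unquoted so the shell splits the command into words.
    ${SSH_ORIGINAL_COMMAND}
else
    echo "Security policy does not allow command [${SSH_ORIGINAL_COMMAND}]."
    exit 1
fi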
      </pre><p>
         This script allows exactly <span class="command"><strong>ls -l</strong></span> and nothing else.
         If the user attempts some other command, they get a nice error message
         telling them that their command has been disallowed.  
      </p><p>
         For remote commands executed over <span class="command"><strong>ssh</strong></span>, the original
         command is exactly what the caller attempted to invoke.  For remote
         copies, the commands are either <span class="command"><strong>scp -f file</strong></span> (copy
         <span class="emphasis"><em>from</em></span> the peer to the master) or <span class="command"><strong>scp -t
         file</strong></span> (copy <span class="emphasis"><em>to</em></span> the peer from the
         master).  
      </p><p>
         If you want, you can see what command SSH thinks it is executing by
         using <span class="command"><strong>ssh -v</strong></span> or <span class="command"><strong>scp -v</strong></span>.  The
         command will be right at the top, something like this:
      </p><pre class="screen">
Executing: program /usr/bin/ssh host mickey, user (unspecified), command scp -v -f .profile
OpenSSH_4.3p2 Debian-9, OpenSSL 0.9.8c 05 Sep 2006
debug1: Reading configuration data /home/backup/.ssh/config
debug1: Applying options for daystrom
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: Applying options for *
debug2: ssh_connect: needpriv 0
      </pre><p>
         Omit the <span class="command"><strong>-v</strong></span> and you have your command: <span class="command"><strong>scp
         -f .profile</strong></span>.
      </p><p>
         For a normal, non-managed setup, you need to allow the following
         commands, where <code class="filename">/path/to/collect/</code> is replaced
         with the real path to the collect directory on the remote peer:
      </p><pre class="screen">
scp -f /path/to/collect/cback.collect
scp -f /path/to/collect/*
scp -t /path/to/collect/cback.stage
      </pre><p>
         If you are configuring a managed client, then you also need to list
         the exact command lines that the master will be invoking on the
         managed client.  You are guaranteed that the master will invoke one
         action at a time, so if you list two lines per action (full and
         non-full) you should be fine.  Here's an example for the collect
         action:
      </p><pre class="screen">
/usr/bin/cback --full collect
/usr/bin/cback collect
      </pre><p>
         Of course, you would have to list the actual path to the
         <span class="command"><strong>cback</strong></span> executable &#8212; exactly the one listed in
         the &lt;cback_command&gt; configuration option for your managed peer.
      </p><p>
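         The command lists above can be turned into an exact-match check.
         What follows is a minimal sketch and not a complete script: the
         function name <code class="literal">is_allowed</code> and the two
         variables are hypothetical, the paths are the placeholders used
         above, and only the collect action is covered.
      </p>

```shell
#!/bin/sh
# Placeholders: substitute the real collect directory and cback path.
COLLECT_DIR=/path/to/collect
CBACK=/usr/bin/cback

# Hypothetical helper: succeed only if the argument is exactly one of
# the permitted command lines.  The patterns are quoted, so the "*"
# in the second pattern is a literal asterisk, not a wildcard.
is_allowed() {
    case "$1" in
        "scp -f $COLLECT_DIR/cback.collect") return 0 ;;
        "scp -f $COLLECT_DIR/*")             return 0 ;;
        "scp -t $COLLECT_DIR/cback.stage")   return 0 ;;
        "$CBACK --full collect")             return 0 ;;
        "$CBACK collect")                    return 0 ;;
        *)                                   return 1 ;;
    esac
}
```

<p>
         In a <span class="command"><strong>validate-backup</strong></span> script
         like the basic one shown earlier, the test on
         <code class="literal">ls -l</code> would be replaced by a call such as
         <code class="literal">is_allowed "${SSH_ORIGINAL_COMMAND}"</code>.  Other
         managed actions would each need their own pair of entries.
      </p>

<p>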
         I hope that there is enough information here for interested users to
         implement something that makes them comfortable.  I have resisted
         providing a complete example script, because I think everyone's setup
         will be different.  However, feel free to write if you are working
         through this and you have questions.
      </p></div></div><div class="appendix"><div class="titlepage"><div><div><h1 class="title"><a name="cedar-copyright"></a>Appendix E. Copyright</h1></div></div></div><div class="simplesect"><div class="titlepage"></div><pre class="programlisting">

Copyright (c) 2004-2011,2013-2015
Kenneth J. Pronovici

This work is free; you can redistribute it and/or modify it under
the terms of the GNU General Public License (the "GPL"), Version 2,
as published by the Free Software Foundation.

For the purposes of the GPL, the "preferred form of modification"
for this work is the original Docbook XML text files.  If you
choose to distribute this work in a compiled form (i.e. if you
distribute HTML, PDF or Postscript documents based on the original
Docbook XML text files), you must also consider image files to be
"source code" if those images are required in order to construct a
complete and readable compiled version of the work.

This work is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Copies of the GNU General Public License are available from
the Free Software Foundation website, http://www.gnu.org/.
You may also write the Free Software Foundation, Inc., 
51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA

====================================================================

		    GNU GENERAL PUBLIC LICENSE
		       Version 2, June 1991

 Copyright (C) 1989, 1991 Free Software Foundation, Inc.
     51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA
 Everyone is permitted to copy and distribute verbatim copies
 of this license document, but changing it is not allowed.

			    Preamble

  The licenses for most software are designed to take away your
freedom to share and change it.  By contrast, the GNU General Public
License is intended to guarantee your freedom to share and change free
software--to make sure the software is free for all its users.  This
General Public License applies to most of the Free Software
Foundation's software and to any other program whose authors commit to
using it.  (Some other Free Software Foundation software is covered by
the GNU Library General Public License instead.)  You can apply it to
your programs, too.

  When we speak of free software, we are referring to freedom, not
price.  Our General Public Licenses are designed to make sure that you
have the freedom to distribute copies of free software (and charge for
this service if you wish), that you receive source code or can get it
if you want it, that you can change the software or use pieces of it
in new free programs; and that you know you can do these things.

  To protect your rights, we need to make restrictions that forbid
anyone to deny you these rights or to ask you to surrender the rights.
These restrictions translate to certain responsibilities for you if you
distribute copies of the software, or if you modify it.

  For example, if you distribute copies of such a program, whether
gratis or for a fee, you must give the recipients all the rights that
you have.  You must make sure that they, too, receive or can get the
source code.  And you must show them these terms so they know their
rights.

  We protect your rights with two steps: (1) copyright the software, and
(2) offer you this license which gives you legal permission to copy,
distribute and/or modify the software.

  Also, for each author's protection and ours, we want to make certain
that everyone understands that there is no warranty for this free
software.  If the software is modified by someone else and passed on, we
want its recipients to know that what they have is not the original, so
that any problems introduced by others will not reflect on the original
authors' reputations.

  Finally, any free program is threatened constantly by software
patents.  We wish to avoid the danger that redistributors of a free
program will individually obtain patent licenses, in effect making the
program proprietary.  To prevent this, we have made it clear that any
patent must be licensed for everyone's free use or not licensed at all.

  The precise terms and conditions for copying, distribution and
modification follow.

		    GNU GENERAL PUBLIC LICENSE
   TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION

  0. This License applies to any program or other work which contains
a notice placed by the copyright holder saying it may be distributed
under the terms of this General Public License.  The "Program", below,
refers to any such program or work, and a "work based on the Program"
means either the Program or any derivative work under copyright law:
that is to say, a work containing the Program or a portion of it,
either verbatim or with modifications and/or translated into another
language.  (Hereinafter, translation is included without limitation in
the term "modification".)  Each licensee is addressed as "you".

Activities other than copying, distribution and modification are not
covered by this License; they are outside its scope.  The act of
running the Program is not restricted, and the output from the Program
is covered only if its contents constitute a work based on the
Program (independent of having been made by running the Program).
Whether that is true depends on what the Program does.

  1. You may copy and distribute verbatim copies of the Program's
source code as you receive it, in any medium, provided that you
conspicuously and appropriately publish on each copy an appropriate
copyright notice and disclaimer of warranty; keep intact all the
notices that refer to this License and to the absence of any warranty;
and give any other recipients of the Program a copy of this License
along with the Program.

You may charge a fee for the physical act of transferring a copy, and
you may at your option offer warranty protection in exchange for a fee.

  2. You may modify your copy or copies of the Program or any portion
of it, thus forming a work based on the Program, and copy and
distribute such modifications or work under the terms of Section 1
above, provided that you also meet all of these conditions:

    a) You must cause the modified files to carry prominent notices
    stating that you changed the files and the date of any change.

    b) You must cause any work that you distribute or publish, that in
    whole or in part contains or is derived from the Program or any
    part thereof, to be licensed as a whole at no charge to all third
    parties under the terms of this License.

    c) If the modified program normally reads commands interactively
    when run, you must cause it, when started running for such
    interactive use in the most ordinary way, to print or display an
    announcement including an appropriate copyright notice and a
    notice that there is no warranty (or else, saying that you provide
    a warranty) and that users may redistribute the program under
    these conditions, and telling the user how to view a copy of this
    License.  (Exception: if the Program itself is interactive but
    does not normally print such an announcement, your work based on
    the Program is not required to print an announcement.)

These requirements apply to the modified work as a whole.  If
identifiable sections of that work are not derived from the Program,
and can be reasonably considered independent and separate works in
themselves, then this License, and its terms, do not apply to those
sections when you distribute them as separate works.  But when you
distribute the same sections as part of a whole which is a work based
on the Program, the distribution of the whole must be on the terms of
this License, whose permissions for other licensees extend to the
entire whole, and thus to each and every part regardless of who wrote it.

Thus, it is not the intent of this section to claim rights or contest
your rights to work written entirely by you; rather, the intent is to
exercise the right to control the distribution of derivative or
collective works based on the Program.

In addition, mere aggregation of another work not based on the Program
with the Program (or with a work based on the Program) on a volume of
a storage or distribution medium does not bring the other work under
the scope of this License.

  3. You may copy and distribute the Program (or a work based on it,
under Section 2) in object code or executable form under the terms of
Sections 1 and 2 above provided that you also do one of the following:

    a) Accompany it with the complete corresponding machine-readable
    source code, which must be distributed under the terms of Sections
    1 and 2 above on a medium customarily used for software interchange; or,

    b) Accompany it with a written offer, valid for at least three
    years, to give any third party, for a charge no more than your
    cost of physically performing source distribution, a complete
    machine-readable copy of the corresponding source code, to be
    distributed under the terms of Sections 1 and 2 above on a medium
    customarily used for software interchange; or,

    c) Accompany it with the information you received as to the offer
    to distribute corresponding source code.  (This alternative is
    allowed only for noncommercial distribution and only if you
    received the program in object code or executable form with such
    an offer, in accord with Subsection b above.)

The source code for a work means the preferred form of the work for
making modifications to it.  For an executable work, complete source
code means all the source code for all modules it contains, plus any
associated interface definition files, plus the scripts used to
control compilation and installation of the executable.  However, as a
special exception, the source code distributed need not include
anything that is normally distributed (in either source or binary
form) with the major components (compiler, kernel, and so on) of the
operating system on which the executable runs, unless that component
itself accompanies the executable.

If distribution of executable or object code is made by offering
access to copy from a designated place, then offering equivalent
access to copy the source code from the same place counts as
distribution of the source code, even though third parties are not
compelled to copy the source along with the object code.

  4. You may not copy, modify, sublicense, or distribute the Program
except as expressly provided under this License.  Any attempt
otherwise to copy, modify, sublicense or distribute the Program is
void, and will automatically terminate your rights under this License.
However, parties who have received copies, or rights, from you under
this License will not have their licenses terminated so long as such
parties remain in full compliance.

  5. You are not required to accept this License, since you have not
signed it.  However, nothing else grants you permission to modify or
distribute the Program or its derivative works.  These actions are
prohibited by law if you do not accept this License.  Therefore, by
modifying or distributing the Program (or any work based on the
Program), you indicate your acceptance of this License to do so, and
all its terms and conditions for copying, distributing or modifying
the Program or works based on it.

  6. Each time you redistribute the Program (or any work based on the
Program), the recipient automatically receives a license from the
original licensor to copy, distribute or modify the Program subject to
these terms and conditions.  You may not impose any further
restrictions on the recipients' exercise of the rights granted herein.
You are not responsible for enforcing compliance by third parties to
this License.

  7. If, as a consequence of a court judgment or allegation of patent
infringement or for any other reason (not limited to patent issues),
conditions are imposed on you (whether by court order, agreement or
otherwise) that contradict the conditions of this License, they do not
excuse you from the conditions of this License.  If you cannot
distribute so as to satisfy simultaneously your obligations under this
License and any other pertinent obligations, then as a consequence you
may not distribute the Program at all.  For example, if a patent
license would not permit royalty-free redistribution of the Program by
all those who receive copies directly or indirectly through you, then
the only way you could satisfy both it and this License would be to
refrain entirely from distribution of the Program.

If any portion of this section is held invalid or unenforceable under
any particular circumstance, the balance of the section is intended to
apply and the section as a whole is intended to apply in other
circumstances.

It is not the purpose of this section to induce you to infringe any
patents or other property right claims or to contest validity of any
such claims; this section has the sole purpose of protecting the
integrity of the free software distribution system, which is
implemented by public license practices.  Many people have made
generous contributions to the wide range of software distributed
through that system in reliance on consistent application of that
system; it is up to the author/donor to decide if he or she is willing
to distribute software through any other system and a licensee cannot
impose that choice.

This section is intended to make thoroughly clear what is believed to
be a consequence of the rest of this License.

  8. If the distribution and/or use of the Program is restricted in
certain countries either by patents or by copyrighted interfaces, the
original copyright holder who places the Program under this License
may add an explicit geographical distribution limitation excluding
those countries, so that distribution is permitted only in or among
countries not thus excluded.  In such case, this License incorporates
the limitation as if written in the body of this License.

  9. The Free Software Foundation may publish revised and/or new versions
of the General Public License from time to time.  Such new versions will
be similar in spirit to the present version, but may differ in detail to
address new problems or concerns.

Each version is given a distinguishing version number.  If the Program
specifies a version number of this License which applies to it and "any
later version", you have the option of following the terms and conditions
either of that version or of any later version published by the Free
Software Foundation.  If the Program does not specify a version number of
this License, you may choose any version ever published by the Free Software
Foundation.

  10. If you wish to incorporate parts of the Program into other free
programs whose distribution conditions are different, write to the author
to ask for permission.  For software which is copyrighted by the Free
Software Foundation, write to the Free Software Foundation; we sometimes
make exceptions for this.  Our decision will be guided by the two goals
of preserving the free status of all derivatives of our free software and
of promoting the sharing and reuse of software generally.

			    NO WARRANTY

  11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW.  EXCEPT WHEN
OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.  THE ENTIRE RISK AS
TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU.  SHOULD THE
PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
REPAIR OR CORRECTION.

  12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
POSSIBILITY OF SUCH DAMAGES.

		     END OF TERMS AND CONDITIONS

====================================================================

      </pre></div></div></div></body></html>