This file is indexed.

/usr/share/doc/refdb/refdb-manual/ch04s03.html is in refdb-doc 1.0.2-3ubuntu1.

This file is owned by root:root, with mode 0o644.

The actual contents of the file can be viewed below.

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>Things to know before you start</title><link rel="stylesheet" type="text/css" href="manual.css" /><meta name="generator" content="DocBook XSL Stylesheets V1.79.1" /><link rel="home" href="index.html" title="RefDB handbook" /><link rel="up" href="ch04.html" title="Chapter 4. Installation" /><link rel="prev" href="ch04s02.html" title="Upgrading from an older version" /><link rel="next" href="ch04s04.html" title="Installation on Linux and other Unix variants" /></head><body><div class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="3" align="center">Things to know before you start</th></tr><tr><td width="20%" align="left"><a accesskey="p" href="ch04s02.html">Prev</a> </td><th width="60%" align="center">Chapter 4. Installation</th><td width="20%" align="right"> <a accesskey="n" href="ch04s04.html">Next</a></td></tr></table><hr /></div><div class="sect1"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a id="idp63813920"></a>Things to know before you start</h2></div></div></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a id="idp63814624"></a>Which database server?</h3></div></div></div><p>RefDB currently supports MySQL and PostgreSQL as external database servers as well as SQLite as an embedded database engine. This section tries to help you decide which one to pick.</p><p>The first issue is whether you want to run an external database server or not. External database servers scale better if many users share databases and they provide access control. The external database servers also use more fine-grained locking mechanisms which allow concurrent read and write accesses, whereas the SQLite engine will lock the entire database for write accesses. 
On the other hand, SQLite does not provide access control and thus doesn't require any sort of user administration.</p><p><strong>Rule #1. </strong>If you don't intend to share databases, or if running a database server scares you in any way, then you may be better off with SQLite.</p><p>Another issue is the way the database engines store their data. SQLite is unique in that it uses a single architecture-independent file per database, which makes transferring the data to a different box a breeze. The external database engines use more sophisticated ways to organize their data, but you need some basic administrative skills in order to replicate the data.</p><p><strong>Rule #2. </strong>If you cannot rely on remote access to your databases (something which RefDB is well suited for) but have to take your data physically with you while travelling, SQLite is a better choice.</p><p>Now some words about the external database servers. As with many other fundamental schisms in the Unix world (vi vs. Emacs, KDE vs. Gnome, to name a few), both database servers supported by RefDB have followers who are semi-religious about their choice. Both MySQL and PostgreSQL are robust and well-proven. This leads us to:</p><p><strong>Rule #3. </strong>If you already use one of the servers, then by all means use it for RefDB as well. Being familiar with the server and having it happily running usually outweighs any advantages that the other server might have.</p><p>But what if you do not yet run a suitable database server? You can browse the web and read for hours about the differences between MySQL and PostgreSQL, but for the purpose of managing RefDB reference databases it boils down to one essential difference: MySQL is faster.</p><p>This leads us to:</p><p><strong>Rule #4. 
</strong>If you cherish speed over anything else, use MySQL.</p><div class="note" style="margin-left: 0.5in; margin-right: 0.5in;"><h3 class="title">Note</h3><p>There are a few more differences that you should be aware of: PostgreSQL has transaction support by default. MySQL supports transactions only if you use InnoDB tables. If you want this additional peace of mind from MySQL, make sure InnoDB is the default table type. SQLite does not support Unix-style regular expressions. If you'd like to use these more versatile expressions instead of the simpler SQL regular expressions supported by SQLite, choose MySQL or PostgreSQL.</p></div></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a id="idp64624832"></a>Where do the components go?</h3></div></div></div><p>As RefDB is a three-tiered client-server application, you have considerable freedom to distribute the components among your computers. Although RefDB shines in a network environment, it is no problem at all to run all components on a single standalone workstation.</p><div class="note" style="margin-left: 0.5in; margin-right: 0.5in;"><h3 class="title">Note</h3><p>Please keep in mind that there's one tier less if you choose the SQLite embedded database engine. The databases will always be on the filesystem of the machine that runs refdbd (this doesn't exclude putting the files on an NFS share if you have a good reason to do so).</p></div><p>The basic idea of the client-server model has several implications:</p><div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"><p>Many workstations can access a single machine running the database server. Thus many people can access the same databases without the pain of duplicating the data and the database engine on every single machine.</p></li><li class="listitem"><p>A considerable part of the computing effort is done outside of the workstations. 
Therefore even rather lame workstations may be sufficient to access and manipulate the data. The database server should run on a decent machine, though (better not that dusty 486 that has doubled as a paperweight since 1990).</p></li><li class="listitem"><p>Updates of the software will mainly affect the database server and the application server. This considerably reduces your workload, as the workstations need to be updated less frequently.</p></li></ul></div><p>The most common scenarios for using RefDB will be on a department or institute network and on a standalone workstation. Let's see how these scenarios differ:</p><div class="sect3"><div class="titlepage"><div><div><h4 class="title"><a id="idp64632320"></a>Installation on a standalone workstation</h4></div></div></div><p>This is obviously the simplest case. The clients, the application server, the database server, and the databases reside on the same physical machine (see <a class="xref" href="ch04s03.html#figure-standalone" title="Figure 4.1. RefDB on a standalone workstation">Figure 4.1, “RefDB on a standalone workstation”</a>). The only requirement for the workstation is that a TCP/IP network is installed. This is necessary as the three layers of RefDB always communicate via TCP/IP sockets. The IP address 127.0.0.1 has to be specified in the configuration files of the clients and of the application server.</p><div class="figure"><a id="figure-standalone"></a><p class="title"><strong>Figure 4.1. RefDB on a standalone workstation</strong></p><div class="figure-contents"><div class="mediaobject"><img src="refdbmanualfig2.png" alt="RefDB on a standalone workstation" /></div></div></div><br class="figure-break" /></div><div class="sect3"><div class="titlepage"><div><div><h4 class="title"><a id="idp64639680"></a>Installation in a network</h4></div></div></div><p>In a network you can take advantage of the client-server model and distribute the workload between your computers. 
Although the three layers can well be distributed across three physical machines, it may be more useful to install the application server on the same machine as the database server and the databases (see <a class="xref" href="ch04s03.html#figure-network" title="Figure 4.2. RefDB on a network">Figure 4.2, “RefDB on a network”</a>). A dedicated or general-purpose server may be most suitable to hold these components, as a workstation may get sluggish if it has to answer a lot of database requests.</p><p>The clients as well as scripts and support files have to be installed on all workstations that will be used to access the databases. The client for administrative tasks, <span class="application">refdba</span>, can be restricted to the workstations of system administrators or otherwise experienced staff.</p><div class="figure"><a id="figure-network"></a><p class="title"><strong>Figure 4.2. RefDB on a network</strong></p><div class="figure-contents"><div class="mediaobject"><img src="refdbmanualfig3.png" alt="RefDB on a network" /></div></div></div><br class="figure-break" /></div></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a id="sect1-mystery-init-files"></a>The mystery of the configuration files</h3></div></div></div><p>As with most Unix-style software packages, the behaviour of the RefDB applications can be tweaked by configuration files. Wherever it makes sense, there is one global config file with useful admin-picked defaults, and another user config file for the individual user to play with. The purpose of the configuration files is to set some reasonable default values for the command-line switches of the RefDB programs. 
Once you have set these, you will never have to specify these values on the command line again, unless you want to temporarily override them.</p><div class="sect3"><div class="titlepage"><div><div><h4 class="title"><a id="idp64650624"></a>Types of configuration files</h4></div></div></div><p>All RefDB applications and scripts that use configuration files (these are the server <a class="link" href="ch12.html" title="Chapter 12. The application server">refdbd</a>, the clients <a class="link" href="ch14.html" title="Chapter 14. Tools for reference and notes management">refdbc</a>, <a class="link" href="ch15.html" title="Chapter 15. Tools for bibliographies">refdbib</a>, <a class="link" href="ch13.html" title="Chapter 13. Administration tools">refdba</a>, the script <a class="link" href="re25.html#refdbxml-configuration" title="Configuration">refdbxml</a>, as well as the conversion filters <a class="link" href="re12.html" title="bib2ris">bib2ris</a>, <a class="link" href="re13.html" title="db2ris">db2ris</a>, <a class="link" href="re16.html" title="med2ris">med2ris.pl</a>, <a class="link" href="re15.html" title="marc2ris">marc2ris</a>, and <a class="link" href="re14.html" title="en2ris">en2ris</a>) can use two configuration files each. One global configuration file is supplied by the system administrator and can be used to set values that are common to all users on that box, like the IP address of the application server. Another file can be used by every user to supply the values that were not set in the global file or to override settings made there. The users' copies can have a leading dot to hide the files (the RefDB programs will first try to read a hidden configuration file, and only if that cannot be found will they try to read a non-hidden file).</p><p>bib2ris, marc2ris, and med2ris use a second global configuration file if they are run as CGI applications. 
A local configuration file does not make sense in this case.</p><p>The default location for the global configuration files is <code class="filename">/usr/local/etc/refdb</code>. There are two ways to change this. If you compile RefDB from the sources you can specify a different directory with the <code class="option">--prefix</code> or <code class="option">--sysconfdir</code> options of <span class="command"><strong>./configure</strong></span>. E.g. if you specify <code class="option">--sysconfdir=/etc</code>, then the configuration files will be installed in <code class="filename">/etc/refdb</code> (the <code class="filename">refdb</code> part is automatically appended by the RefDB install routines). If you use precompiled binaries, use the <code class="option">-y</code> command line option to specify the directory. In this case you have to specify the full path, i.e. <code class="filename">/etc/refdb</code> to read the configuration files installed by the previous example.</p><p>The user copies of the client configuration files are expected to be in the users' home directories as specified by the environment variable <code class="envar">HOME</code>.</p></div><div class="sect3"><div class="titlepage"><div><div><h4 class="title"><a id="idp64668176"></a>Configuration file syntax</h4></div></div></div><p>All configuration files share a common syntax. There are just three essential things to know:</p><div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"><p>All information is stored as pairs of whitespace-separated items, one pair on each line. The first item on the line specifies the variable name, the second item specifies the variable value. Whitespace means one or more spaces or tabs in any combination.</p></li><li class="listitem"><p>Everything to the right of a hash sign (#) is a comment. 
The rest of the line is ignored.</p></li><li class="listitem"><p>The line endings are Unix-style (LF, 0x0A; not DOS-style CR LF, 0x0D 0x0A), regardless of the operating system.</p></li></ul></div></div><div class="sect3"><div class="titlepage"><div><div><h4 class="title"><a id="idp64673152"></a>A configuration example</h4></div></div></div><p>All of this may sound a bit confusing, so let us now look at a simple configuration example that illustrates the principles laid out above.</p><p>The following is a listing of <code class="filename">/usr/local/etc/refdb/refdbcrc</code>, our global refdbc configuration file in this example:</p><pre class="programlisting"># This is the global configuration file for refdbc
serverip	127.0.0.1
port	9734
pager	more
timeout	180
# end of refdbcrc
	</pre><p>This is the corresponding copy that one of the users of the system created as <code class="filename">/home/joe/.refdbcrc</code>:</p><pre class="programlisting"># This is the user configuration file for refdbc
pager	less
username	joesixpack
passwd  * 
timeout	30
# end of .refdbcrc
	</pre><p>As you can see, our hypothetical system administrator configured the IP address (<code class="varname">serverip</code>) and the <code class="varname">port</code> where refdbd listens for client requests. These values are most likely the same for all users on the system, so users need not worry about them. <code class="filename">more</code> is defined as the default <code class="varname">pager</code>, and the <code class="varname">timeout</code> is set to 180 seconds, i.e. 3 minutes.</p><p>Joe Sixpack, our reckless user, does not like <code class="filename">more</code> as a pager and prefers to use <code class="filename">less</code> instead. He also thinks that half a minute as a timeout should be enough. Both of these settings override the corresponding values in the global file. <code class="varname">serverip</code> and <code class="varname">port</code> are not redefined in the user's copy, so the values of the global file take effect. Joe also defined <code class="varname">username</code> (which happens to be different from his login name "joe") and <code class="varname">passwd</code> so the correct values will be used for the database access (the asterisk in the <code class="varname">passwd</code> field will cause refdbc to ask for the password interactively for security reasons).</p></div><div class="sect3"><div class="titlepage"><div><div><h4 class="title"><a id="idp64686528"></a>Configuration file variables</h4></div></div></div><p>For a listing of available configuration file variables please see the tables for <a class="link" href="re06.html#refdba-configuration" title="Configuration">refdba</a>, <a class="link" href="re11.html#refdbc-configuration" title="Configuration">refdbc</a>, <a class="link" href="re20.html#refdbib-configuration" title="Configuration">refdbib</a>, <a class="link" href="re02.html#refdbd-configuration" title="Configuration">refdbd</a>, <a class="link" href="re25.html#refdbxml-configuration" title="Configuration">refdbxml</a>, <a 
class="link" href="re12.html#bib2ris-configuration" title="Configuration">bib2ris</a>, <a class="link" href="re13.html#db2ris-configuration" title="Configuration">db2ris</a>, <a class="link" href="re16.html#med2ris-configuration" title="Configuration">med2ris</a>, <a class="link" href="re15.html#marc2ris-configuration" title="Configuration">marc2ris</a>, and <a class="link" href="re14.html#en2ris-configuration" title="Configuration">en2ris</a>.</p></div></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a id="idp64695312"></a>Environment variables</h3></div></div></div><p>refdb uses the following environment variables to locate the files and directories it needs to run properly.</p><div class="variablelist"><dl class="variablelist"><dt><span class="term">HOME</span></dt><dd><p>This variable should be set for all users anyway. It is used to locate the personal <a class="link" href="ch04s03.html#sect1-mystery-init-files" title="The mystery of the configuration files">configuration files</a> for the RefDB clients.</p></dd><dt><span class="term">SGML_CATALOG_FILES</span></dt><dd><p>If you process SGML files, this variable will be consulted to locate the catalog files required for resolving public identifiers to their local filename equivalents.</p><div class="note" style="margin-left: 0.5in; margin-right: 0.5in;"><h3 class="title">Note</h3><p>On some systems, the package system maintains a master catalog whose path is hard-coded into the SGML applications. In this case, the variable is not required.</p></div></dd><dt><span class="term">XML_CATALOG_FILES</span></dt><dd><p>If you process XML files, this variable may be consulted to locate XML catalogs. If this variable is not set, many tools look into the default location <code class="filename">/etc/xml/catalog</code> instead. 
Remember that some XSLT processors need access to additional Java classes to provide XML catalog support at all.</p></dd></dl></div></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a id="idp64705536"></a>Some notes on the filesystem</h3></div></div></div><p>The default installation procedure will install the RefDB files in locations compatible with the filesystem hierarchy standard. You will learn in the following sections how to change where the RefDB files will be installed if you want to adapt the installation to specific needs of your system. To get a better idea of what you have to take care of if you don't like the defaults, here is a list of the directories used by RefDB:</p><div class="variablelist"><dl class="variablelist"><dt><span class="term">
	    <code class="filename">/usr/local/bin</code>
	  </span></dt><dd><p>This directory will receive all binary files and shell scripts.</p></dd><dt><span class="term">
	    <code class="filename">/usr/local/etc/refdb</code>
	  </span></dt><dd><p>All global RefDB configuration files end up in this directory.</p></dd><dt><span class="term">
	    <code class="filename">/usr/local/share/refdb</code>
	  </span></dt><dd><p>This directory contains shareable, operating-system-independent files. The files are organized in several subdirectories:</p><div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"><p><code class="filename">css</code> contains a cascading stylesheet suitable for the HTML output of the <a class="link" href="re11.html#app-c-command-getref" title="getref"><span class="command"><strong>getref</strong></span></a> command.</p></li><li class="listitem"><p><code class="filename">declarations</code> contains the default SGML declarations.</p></li><li class="listitem"><p><code class="filename">dsssl</code> contains DSSSL stylesheets.</p></li><li class="listitem"><p><code class="filename">dtd</code> contains the document type definitions used by RefDB.</p></li><li class="listitem"><p><code class="filename">examples</code> contains a few example reference data files as well as SGML and XML test documents using RefDB citations.</p></li><li class="listitem"><p><code class="filename">sql</code> contains SQL scripts used to initialize databases.</p></li><li class="listitem"><p><code class="filename">sru</code> contains the XSLT and CSS stylesheets required to set up the SRU service.</p></li><li class="listitem"><p><code class="filename">styles</code> contains XML files that define bibliography styles.</p></li><li class="listitem"><p><code class="filename">xsl</code> contains XSLT stylesheets.</p></li></ul></div></dd><dt><span class="term">
	    <code class="filename">/usr/local/var/lib/refdb/db</code>
	  </span></dt><dd><p>This directory holds the database files of embedded database engines and a version file for use by package installation scripts.</p></dd></dl></div></div></div><div class="navfooter"><hr /><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="ch04s02.html">Prev</a> </td><td width="20%" align="center"><a accesskey="u" href="ch04.html">Up</a></td><td width="40%" align="right"> <a accesskey="n" href="ch04s04.html">Next</a></td></tr><tr><td width="40%" align="left" valign="top">Upgrading from an older version </td><td width="20%" align="center"><a accesskey="h" href="index.html">Home</a></td><td width="40%" align="right" valign="top"> Installation on Linux and other Unix variants</td></tr></table></div></body></html>