<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>SRU Operations</title><link rel="stylesheet" type="text/css" href="manual.css" /><meta name="generator" content="DocBook XSL Stylesheets V1.79.1" /><link rel="home" href="index.html" title="RefDB handbook" /><link rel="up" href="ch11.html" title="Chapter 11. RefDB SRU interface" /><link rel="prev" href="ch11.html" title="Chapter 11. RefDB SRU interface" /><link rel="next" href="pt04.html" title="Part IV. Reference manual" /></head><body><div class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="3" align="center">SRU Operations</th></tr><tr><td width="20%" align="left"><a accesskey="p" href="ch11.html">Prev</a> </td><th width="60%" align="center">Chapter 11. RefDB SRU interface</th><td width="20%" align="right"> <a accesskey="n" href="pt04.html">Next</a></td></tr></table><hr /></div><div class="sect1"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a id="sect-sru-operations"></a>SRU Operations</h2></div></div></div><div class="note" style="margin-left: 0.5in; margin-right: 0.5in;"><h3 class="title">Note</h3><p>This section assumes that you run the SRU service using the <a class="link" href="ch04s10.html#sect-srucgi" title="Setting up SRU support as a CGI program">CGI application</a>. If you use the standalone server instead, please adapt the URLs by replacing "http://mybox.com/cgi-bin/" with "http://localhost:8080/".</p></div><p>SRU defines three operations, all of which return XML documents:</p><div class="variablelist"><dl class="variablelist"><dt><span class="term">explain</span></dt><dd><p>describes the available facilities in terms of record schemas, available indexes and so on. Sort of a cheat sheet. 
The original specification is <a class="ulink" href="http://www.loc.gov/standards/sru/explain/" target="_top">here</a>.</p></dd><dt><span class="term">searchRetrieve</span></dt><dd><p>performs a database query and retrieves the matching datasets. The original specification is <a class="ulink" href="http://www.loc.gov/standards/sru/sru-spec.html" target="_top">here</a>.</p></dd><dt><span class="term">scan</span></dt><dd><p>retrieves a list of matching search terms for later use in a searchRetrieve operation. The original specification is <a class="ulink" href="http://www.loc.gov/standards/sru/scan/index.html" target="_top">here</a>.</p></dd></dl></div><p>Read the specifications linked above to learn the general principles. The following sections build on this knowledge and describe the RefDB SRU interface with a focus on its peculiarities and limitations. We'll assume that your web server is set up to run the <code class="filename">refdbsru</code> CGI script at the following URL: http://mybox.com/cgi-bin/refdbsru/.</p><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a id="sect-sru-explain"></a>The explain operation</h3></div></div></div><p>The explain operation is the simplest of the three and a good starting point for introducing the syntax of the SRU interface. RefDB fully supports the explain operation. Any of the following URLs typed into your browser will run it:</p><div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"><p>http://mybox.com/cgi-bin/refdbsru/</p></li><li class="listitem"><p>http://mybox.com/cgi-bin/refdbsru/?</p></li><li class="listitem"><p>http://mybox.com/cgi-bin/refdbsru/?operation=explain&amp;version=1.1</p></li></ul></div><p>The part of the URL that follows the question mark ("?") in the third example is the search part, which consists of "parameter=value" pairs joined with ampersands ("&amp;"). 
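</p><p>As a minimal sketch of how such a search part can be assembled programmatically, the following Python snippet joins the two parameters of the third example using the standard library function <code class="function">urllib.parse.urlencode</code>. The helper name <code class="function">build_sru_url</code> is hypothetical and not part of RefDB; the base URL is the one assumed throughout this section.</p>
<pre class="programlisting">
```python
from urllib.parse import urlencode

# Base URL of the refdbsru CGI script, as assumed throughout this section.
BASE_URL = "http://mybox.com/cgi-bin/refdbsru/"


def build_sru_url(params, base_url=BASE_URL):
    """Hypothetical helper: append parameter=value pairs after the '?'."""
    # urlencode() percent-encodes the values and inserts the separators
    # between the pairs, in the order in which the dictionary lists them.
    return base_url + "?" + urlencode(params)


explain_url = build_sru_url({"operation": "explain", "version": "1.1"})
print(explain_url)
```
</pre><p>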
Both parameters shown here are mandatory for all SRU operations, as we'll see shortly.</p><p>The request returns an XML document describing the capabilities of the RefDB SRU interface.</p></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a id="sect-sru-searchretrieve"></a>The searchRetrieve operation</h3></div></div></div><p>The searchRetrieve operation is the one that actually retrieves the reference data you are looking for. Your query is sent in the <em class="parameter"><code>query</code></em> parameter, which is mandatory for this operation. A few examples:</p><div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"><p>http://mybox.com/cgi-bin/refdbsru/?operation=searchRetrieve&amp;version=1.1&amp;recordSchema=mods&amp;query=bib.name%3d%22Miller,Henry J.%22</p></li><li class="listitem"><p>http://mybox.com/cgi-bin/refdbsru/?operation=searchRetrieve&amp;version=1.1&amp;recordSchema=risx&amp;query=dc.subject%3d%22circular dichroism%22+or+dc.subject%3d%22NMR%22</p></li></ul></div><p>The first example requests the bibliographic data in MODS format. The query proper reads 'bib.name="Miller,Henry J."' and translates to a search for all references where a person with that name is listed as an author, editor, or series editor. The second example requests the data in risx format and searches for all references with the keyword "circular dichroism" or "NMR".</p><p>Both examples use percent-encoding to make the URL string conform to the specifications. 
This is further discussed <a class="link" href="ch11s02.html#sect-sru-searchretrieve-query-encoding" title="Encoding">below</a>.</p><div class="sect3"><div class="titlepage"><div><div><h4 class="title"><a id="sect-sru-searchretrieve-query"></a>The query parameter</h4></div></div></div><p>The query parameter describes the criteria of your database query and is a string written in the Common Query Language (CQL).</p><div class="sect4"><div class="titlepage"><div><div><h5 class="title"><a id="sect-sru-searchretrieve-query-conformance"></a>Conformance</h5></div></div></div><p>The RefDB SRU support conforms to <a class="ulink" href="http://www.loc.gov/standards/sru/cql/index.html#conformance" target="_top">CQL Level 2</a>. The following general restrictions apply:</p><div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"><p>RefDB does not support persistent result sets. Therefore, the <em class="parameter"><code>resultSetTTL</code></em> request parameter is meaningless, and it is not possible to reference a result set in a subsequent query.</p></li><li class="listitem"><p>RefDB does not support XPath expressions to modify the results. Therefore, the <em class="parameter"><code>recordXPath</code></em> request parameter is not honored. You can of course apply any XPath expressions on the client side using an appropriate processor.</p></li><li class="listitem"><p>Sorting is currently not supported, and the <em class="parameter"><code>sortKeys</code></em> parameter is not applicable. Data will always be sorted by ID.</p></li><li class="listitem"><p>The <em class="parameter"><code>recordPacking</code></em> parameter is not supported. 
Records are always returned as XML.</p></li><li class="listitem"><p>RefDB does not support relation modifiers or boolean modifiers in CQL queries.</p></li><li class="listitem"><p><em class="parameter"><code>prox</code></em> is not supported as a boolean operator.</p></li><li class="listitem"><p>The relation <em class="parameter"><code>encloses</code></em> is not supported.</p></li><li class="listitem"><p>Support for regular expressions ("masking" in CQL) depends on the database backend. Most notably, anchoring is not supported by SQLite and SQLite3.</p></li></ul></div></div><div class="sect4"><div class="titlepage"><div><div><h5 class="title"><a id="sect-sru-searchretrieve-query-defaults"></a>Defaults</h5></div></div></div><p>If a query or a query part does not specify an index and a relation, RefDB looks for the term in the author, keyword, and title indexes:</p><p>http://mybox.com/cgi-bin/refdbsru/?operation=searchRetrieve&amp;version=1.1&amp;query=cat</p><p>This query finds references that contain the string "cat" in the title, a keyword, or an author name.</p></div><div class="sect4"><div class="titlepage"><div><div><h5 class="title"><a id="sect-sru-searchretrieve-query-contextset"></a>Context sets</h5></div></div></div><p>RefDB supports the context sets <a class="ulink" href="http://www.loc.gov/standards/sru/cql/dc-context-set.html" target="_top">Dublin Core</a> (dc) and the not yet officially released <a class="ulink" href="http://www.loc.gov/standards/sru/cql-bibliographic-searching.html" target="_top">CQL Bibliographic Searching</a> (bib). The following table shows how the indexes defined in these context sets map to the RefDB fields.</p><div class="note" style="margin-left: 0.5in; margin-right: 0.5in;"><h3 class="title">Note</h3><p>RefDB also implicitly supports the cql context set.</p></div><div class="table"><a id="idp66013376"></a><p class="title"><strong>Table 11.1. 
Context sets</strong></p><div class="table-contents"><table class="table" summary="Context sets" border="1"><colgroup><col /><col /><col /><col /><col /></colgroup><thead><tr><th>dc index</th><th>bib index</th><th>RefDB field</th><th>search/scan supported?</th><th>description</th></tr></thead><tbody><tr><td>title</td><td>title</td><td>TX</td><td>y/n</td><td>item titles</td></tr><tr><td>&#160;</td><td>seriesTitle</td><td>T3</td><td>y/n</td><td>series title</td></tr><tr><td>&#160;</td><td>titleAbbrev</td><td>JA</td><td>y/y</td><td>journal title, abbreviated</td></tr><tr><td>creator, contributor</td><td>name, namePersonal, nameCorporate</td><td>AX</td><td>y/y</td><td>authors and editors</td></tr><tr><td>subject, coverage</td><td>subject</td><td>KW</td><td>y/y</td><td>keywords</td></tr><tr><td>date</td><td>dateIssued</td><td>PY</td><td>y/n</td><td>publication date</td></tr><tr><td>&#160;</td><td>volume</td><td>VL</td><td>y/n</td><td>periodical volume</td></tr><tr><td>&#160;</td><td>issue</td><td>IS</td><td>y/n</td><td>periodical issue</td></tr><tr><td>&#160;</td><td>startPage</td><td>SP</td><td>y/n</td><td>start page</td></tr><tr><td>&#160;</td><td>endPage</td><td>EP</td><td>y/n</td><td>end page</td></tr><tr><td>publisher</td><td>&#160;</td><td>PB</td><td>y/n</td><td>publisher</td></tr></tbody></table></div></div><br class="table-break" /></div><div class="sect4"><div class="titlepage"><div><div><h5 class="title"><a id="sect-sru-searchretrieve-query-encoding"></a>Encoding</h5></div></div></div><p>As you may have noticed, it is necessary to <a class="ulink" href="http://rfc.net/rfc3986.html#s2.1." target="_top">percent-encode</a> a few special characters in the parameter <span class="emphasis"><em>values</em></span>. For example, the equal sign ("=") that assigns a value to a parameter does not have to be encoded. However, equal signs within the CQL query string (which is the value of the <em class="parameter"><code>query</code></em> parameter) must be percent-encoded. 
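</p><p>The following Python sketch illustrates this rule for the first searchRetrieve example, assuming the <code class="literal">mybox.com</code> base URL used throughout this section. It percent-encodes only the parameter values, using the standard library functions <code class="function">urllib.parse.urlencode</code> and <code class="function">urllib.parse.quote</code>; the variable names are illustrative. Python emits uppercase hexadecimal digits (for example <code class="literal">%3D</code>), which RFC 3986 treats as equivalent to the lowercase form used in the example URLs of this section.</p>
<pre class="programlisting">
```python
from urllib.parse import quote, urlencode

# Base URL of the refdbsru CGI script, as assumed throughout this section.
BASE_URL = "http://mybox.com/cgi-bin/refdbsru/"

# The CQL query exactly as you would write it, before any encoding.
cql_query = 'bib.name="Miller,Henry J."'

params = {
    "operation": "searchRetrieve",
    "version": "1.1",
    "recordSchema": "mods",
    "query": cql_query,
}

# quote_via=quote with safe="" percent-encodes every reserved character in
# the values (and spaces as %20), while urlencode() itself supplies the
# unencoded "=" signs and the separators between the parameter=value pairs.
search_part = urlencode(params, quote_via=quote, safe="")
url = BASE_URL + "?" + search_part

print(quote(cql_query, safe=""))
print(url)
```
</pre><p>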
If you use a dedicated client to run your queries, you should not have to worry about these conversions. If you use a web browser or a similar tool, you may find the following conversion table useful:</p><div class="table"><a id="idp65780512"></a><p class="title"><strong>Table 11.2. Percent-encoding special characters</strong></p><div class="table-contents"><table class="table" summary="Percent-encoding special characters" border="1"><colgroup><col /><col /><col /><col /></colgroup><thead><tr><th>replace</th><th>with</th><th>replace</th><th>with</th></tr></thead><tbody><tr><td>:</td><td>%3a</td><td>/</td><td>%2f</td></tr><tr><td>?</td><td>%3f</td><td>#</td><td>%23</td></tr><tr><td>[</td><td>%5b</td><td>]</td><td>%5d</td></tr><tr><td>@</td><td>%40</td><td>!</td><td>%21</td></tr><tr><td>$</td><td>%24</td><td>&amp;</td><td>%26</td></tr><tr><td>'</td><td>%27</td><td>(</td><td>%28</td></tr><tr><td>)</td><td>%29</td><td>*</td><td>%2a</td></tr><tr><td>+</td><td>%2b</td><td>,</td><td>%2c</td></tr><tr><td>;</td><td>%3b</td><td>=</td><td>%3d</td></tr><tr><td>%</td><td>%25</td><td>"</td><td>%22</td></tr></tbody></table></div></div><br class="table-break" /></div></div><div class="sect3"><div class="titlepage"><div><div><h4 class="title"><a id="sect-sru-searchretrieve-query-schemas"></a>Schemas</h4></div></div></div><p>RefDB can return the datasets using two different XML schemas, which you can request with the <em class="parameter"><code>recordSchema</code></em> parameter:</p><div class="variablelist"><dl class="variablelist"><dt><span class="term">MODS</span></dt><dd><p><a class="ulink" href="http://www.loc.gov/standards/mods/" target="_top">MODS</a> is a schema for bibliographic data in library applications. Use 'mods' as the parameter value. The returned datasets will use 'mods' as the namespace prefix. MODS is the default if you do not specify a schema.</p></dd><dt><span class="term">risx</span></dt><dd><p>This is RefDB's default XML input and output format. 
Use 'risx' as the parameter value to request risx. The datasets will use 'risx' as the namespace prefix.</p></dd></dl></div></div><div class="sect3"><div class="titlepage"><div><div><h4 class="title"><a id="sect-sru-select-database"></a>Databases</h4></div></div></div><p>SRU assumes that the base URL of the SRU service (the one you enter to get an <a class="link" href="ch11s02.html#sect-sru-explain" title="The explain operation">explain</a> response) corresponds to one database. Instead of requiring several copies of the CGI script to serve more than one database, <a class="link" href="re04.html" title="refdbsru">refdbsru</a> lets you specify the name of a database in the additional path information of the URL. Compare the following (pseudo-)URLs:</p><pre class="programlisting">http://mybox.com/cgi-bin/refdbsru/?&lt;query&gt;
http://mybox.com/cgi-bin/refdbsru/foo?&lt;query&gt;
</pre><p>The first URL will use the default database. The second URL will use the database "foo" instead. The database name goes between the slash that follows the CGI script name and the question mark that opens the query string.</p></div></div><div class="sect2"><div class="titlepage"><div><div><h3 class="title"><a id="sect-sru-scan"></a>The scan operation</h3></div></div></div><p>The purpose of the scan operation is to provide a list of matching query terms, along with the number of references each term would retrieve. This is similar to browsing through a stack of library cards with subjects or author names on them. The RefDB SRU service lets you scan the following database fields:</p><div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"><p>keywords (bib.subject)</p><p>http://mybox.com/cgi-bin/refdbsru/?operation=scan&amp;version=1.1&amp;scanClause=bib.subject%3d%22dichroism%22</p></li><li class="listitem"><p>author names (bib.name)</p><p>http://mybox.com/cgi-bin/refdbsru/?operation=scan&amp;version=1.1&amp;scanClause=bib.name%3d%22Henry J.%22</p></li><li class="listitem"><p>journal abbreviations (bib.titleAbbrev)</p></li></ul></div></div></div><div class="navfooter"><hr /><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="ch11.html">Prev</a>&#160;</td><td width="20%" align="center"><a accesskey="u" href="ch11.html">Up</a></td><td width="40%" align="right">&#160;<a accesskey="n" href="pt04.html">Next</a></td></tr><tr><td width="40%" align="left" valign="top">Chapter 11. RefDB SRU interface&#160;</td><td width="20%" align="center"><a accesskey="h" href="index.html">Home</a></td><td width="40%" align="right" valign="top">&#160;Part IV. Reference manual</td></tr></table></div></body></html>