<html><head><meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"><title>3. References and Zebra based Applications</title><meta name="generator" content="DocBook XSL Stylesheets V1.75.2"><link rel="home" href="index.html" title="Zebra - User's Guide and Reference"><link rel="up" href="introduction.html" title="Chapter 1. Introduction"><link rel="prev" href="features.html" title="2. Zebra Features Overview"><link rel="next" href="introduction-support.html" title="4. Support"></head><body><link rel="stylesheet" type="text/css" href="common/style1.css"><div class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="3" align="center">3. References and <span class="application">Zebra</span> based Applications</th></tr><tr><td width="20%" align="left"><a accesskey="p" href="features.html">Prev</a> </td><th width="60%" align="center">Chapter 1. Introduction</th><td width="20%" align="right"> <a accesskey="n" href="introduction-support.html">Next</a></td></tr></table><hr></div><div class="section" title="3. References and Zebra based Applications"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="introduction-apps"></a>3. References and <span class="application">Zebra</span> based Applications</h2></div></div></div><p>
   <span class="application">Zebra</span> has been deployed in numerous applications, in both the
   academic and commercial worlds, in application domains as diverse
   as bibliographic catalogues, geospatial information, structured
   vocabulary browsing, government information locators, civic
   information systems, environmental observations, museum information
   and web indexes.
  </p><p>
   Notable applications include the following:
  </p><div class="section" title="3.1. Koha free open-source ILS"><div class="titlepage"><div><div><h3 class="title"><a name="koha-ils"></a>3.1. Koha free open-source ILS</h3></div></div></div><p>
     <a class="ulink" href="http://www.koha.org/" target="_top">Koha</a> is a full-featured
     open-source ILS, initially developed  in 
     New Zealand by Katipo Communications Ltd, and first deployed in
     January of 2000 for Horowhenua Library Trust. It is currently
     maintained by a team of software providers and library technology
     staff from around the globe. 
    </p><p>
     <a class="ulink" href="http://liblime.com/" target="_top">LibLime</a>, 
     a company that markets and supports Koha, has added the
     <span class="application">Zebra</span> database server to the Koha 3.0
     release to drive its bibliographic database.
    </p><p>
     In early 2005, the Koha project development team began looking at
     ways to improve <acronym class="acronym">MARC</acronym> support and overcome scalability limitations
     in the Koha 2.x series. After extensive evaluations of the best
     of the Open Source textual database engines - including MySQL
     full-text searching, PostgreSQL, Lucene and Plucene - the team
     selected <span class="application">Zebra</span>. 
    </p><p>
     "<span class="application">Zebra</span> completely eliminates scalability limitations, because it
     can support tens of millions of records," explained Joshua
     Ferraro, LibLime's Technology President and Koha's Project
     Release Manager. "Our performance tests showed search results in
     under a second for databases with over 5 million records on a
     modest i386 900 MHz test server." 
    </p><p>
     "<span class="application">Zebra</span> also includes support for true boolean search expressions
     and relevance-ranked free-text queries, both of which the Koha
     2.x series lack. <span class="application">Zebra</span> also supports incremental and safe
     database updates, which allow on-the-fly record
     management. Finally, since <span class="application">Zebra</span> has at its heart the <acronym class="acronym">Z39.50</acronym>
     protocol, it greatly improves Koha's support for that critical
     library standard." 
    </p><p> 
     Although the bibliographic database will be moved to <span class="application">Zebra</span>, Koha
     3.0 will continue to use a relational SQL-based database design
     for the 'factual' database. "Relational database managers have
     their strengths, in spite of their inability to handle large
     numbers of bibliographic records efficiently," summed up Ferraro.
     "We're taking the best from both worlds in our redesigned Koha
     3.0."
     </p><p>
     See also LibLime's newsletter article
      <a class="ulink" href="http://www.liblime.com/newsletter/2006/01/features/koha-earns-its-stripes/" target="_top">
     Koha Earns its Stripes</a>.
     </p></div><div class="section" title="3.2. Kete Open Source Digital Library and Archiving software"><div class="titlepage"><div><div><h3 class="title"><a name="kete-dom"></a>3.2. Kete Open Source Digital Library and Archiving software</h3></div></div></div><p>
     <a class="ulink" href="http://kete.net.nz/" target="_top">Kete</a> is a digital object
     management repository, initially developed in New Zealand through
     a partnership between the Horowhenua Library Trust and
     Katipo Communications Ltd., funded as part of the Community
     Partnership Fund in 2006.
     Kete is purpose-built
     software that enables communities to build their own digital
     libraries, archives and repositories.  
    </p><p>
     It is based on Ruby on Rails and MySQL, and integrates the <span class="application">Zebra</span> server
     and the <span class="application">YAZ</span> toolkit for indexing and retrieval of its content.
     <span class="application">Zebra</span> runs as a separate process from the Kete
     application.
     See
     how Kete <a class="ulink" href="http://kete.net.nz/documentation/topics/show/139-managing-zebra" target="_top">manages
     Zebra.</a>
     </p><p>
     Why does Kete use <span class="application">Zebra</span>? Speed, scalability and easy
 integration with Koha. Read their
 <a class="ulink" href="http://kete.net.nz/blog/topics/show/44-who-what-why-when-answering-some-of-the-niggly-development-questions" target="_top">detailed
 reasoning here.</a>
    </p></div><div class="section" title="3.3. Emilda open source ILS"><div class="titlepage"><div><div><h3 class="title"><a name="emilda-ils"></a>3.3. Emilda open source ILS</h3></div></div></div><p>
     <a class="ulink" href="http://www.emilda.org/" target="_top">Emilda</a> 
     is a complete Integrated Library System, released under the 
     GNU General Public License. It has a
     full-featured Web-OPAC, allows comprehensive system management
     from virtually any computer with an Internet connection, has a
     template-based layout that lets anyone alter the visual
     appearance of Emilda, and uses an
     <acronym class="acronym">XML</acronym>-based language file for fast and easy portability to virtually any
     language.
     Currently, Emilda is used at three schools in Espoo, Finland.
    </p><p>
     In addition, 100% <acronym class="acronym">MARC</acronym> compatibility has been achieved using the
    <span class="application">Zebra</span> Server from Index Data as backend server. 
    </p></div><div class="section" title="3.4. ReIndex.Net web based ILS"><div class="titlepage"><div><div><h3 class="title"><a name="reindex-ils"></a>3.4. ReIndex.Net web based ILS</h3></div></div></div><p>
     <a class="ulink" href="http://www.reindex.net/index.php?lang=en" target="_top">Reindex.net</a>
     is a web-based library service offering all
     traditional functions at a very high level plus many new
     services. Reindex.net is a comprehensive and powerful web system
     based on standards such as <acronym class="acronym">XML</acronym> and <acronym class="acronym">Z39.50</acronym>.
     Reindex supports <acronym class="acronym">MARC21</acronym>, dan<acronym class="acronym">MARC</acronym> or Dublin Core with
     UTF-8 encoding.  
    </p><p>
     Reindex.net runs on Debian GNU/Linux with <span class="application">Zebra</span> and SimpleServer
     from Index 
     Data for bibliographic data. The relational database system
     Sybase 9 <acronym class="acronym">XML</acronym> is used for
     administrative data. 
     Internally, <acronym class="acronym">MARCXML</acronym> is used for bibliographic records. Updates
     use <acronym class="acronym">Z39.50</acronym> extended services. 
    </p></div><div class="section" title="3.5. DADS - the DTV Article Database Service"><div class="titlepage"><div><div><h3 class="title"><a name="dads-article-database"></a>3.5. DADS - the DTV Article Database
     Service</h3></div></div></div><p>
    DADS is a huge database of more than ten million records, totalling
    over ten gigabytes of data.  The records are metadata about academic
    journal articles, primarily scientific; about 10% of these
    metadata records link to the full text of the articles they
    describe, a body of about a terabyte of information (although the
    full text is not indexed).
   </p><p>
    It allows students and researchers at DTU (Danmarks Tekniske
    Universitet, the Technical University of Denmark) to find and order
    articles from multiple databases in a single query.  The database
    contains literature on all engineering subjects.  It's available
    on-line through a web gateway, though currently only to registered
    users.
   </p><p>
    More information can be found at
    <a class="ulink" href="http://www.dtv.dk/" target="_top">http://www.dtv.dk/</a> and
    <a class="ulink" href="http://dads.dtv.dk" target="_top">http://dads.dtv.dk</a>
   </p></div><div class="section" title="3.6. Infonet Eprints"><div class="titlepage"><div><div><h3 class="title"><a name="infonet-eprints"></a>3.6. Infonet Eprints</h3></div></div></div><p>
     The InfoNet Eprints service from the 
     <a class="ulink" href="http://www.dtv.dk/" target="_top">
      Technical Knowledge Center of Denmark</a>
     provides access to documents stored in
     eprint/preprint servers and institutional research archives around
     the world. The service is based on Open Archives Initiative metadata
     harvesting of selected scientific archives around the world. These
     open archives offer free and unrestricted access to their contents.
    </p><p>
    Infonet Eprints currently holds 1.4 million records from 16 archives.
    The online search facility is found at
    <a class="ulink" href="http://preprints.cvt.dk" target="_top">http://preprints.cvt.dk</a>.
   </p></div><div class="section" title="3.7. Alvis"><div class="titlepage"><div><div><h3 class="title"><a name="alvis-project"></a>3.7. Alvis</h3></div></div></div><p>
     The <a class="ulink" href="http://www.alvis.info/alvis/" target="_top">Alvis</a> EU
     project, run under the 6th Framework (IST-1-002068-STP),
     is building a semantic-based peer-to-peer search engine. A
     consortium of eleven partners from six different European
     Community countries plus Switzerland and China contributes
     expertise in a broad range of specialties including network
     topologies, routing algorithms, linguistic analysis and
     bioinformatics. 
    </p><p>
     The <span class="application">Zebra</span> information retrieval indexing machine is used inside
     the Alvis framework to
     manage huge collections of natural-language-processed and
     enhanced <acronym class="acronym">XML</acronym> data, coming from a topic-relevant web crawl.
     In this application, <span class="application">Zebra</span> ingests and manages 37 GB of <acronym class="acronym">XML</acronym> data
     in about 4 hours, resulting in search times of fractions of
     a second. 
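     </p><p>
      As a rough back-of-envelope check, the figures quoted above imply
      an average ingest rate that can be worked out directly. The
      following Python sketch assumes 1 GB = 1000 MB and that the four
      hours cover the whole load; the function name is purely
      illustrative and is not part of <span class="application">Zebra</span>.
     </p>

```python
# Figures quoted in the text: 37 GB of XML handled in about 4 hours.
DATA_GB = 37
HOURS = 4


def throughput_mb_per_s(gigabytes, hours):
    """Average ingest rate in MB/s, taking 1 GB = 1000 MB (an assumption)."""
    return gigabytes * 1000 / (hours * 3600)


rate = throughput_mb_per_s(DATA_GB, HOURS)
print(f"{rate:.1f} MB/s")  # prints "2.6 MB/s"
```

     <p>
      On these assumptions the average comes to roughly 2.6 MB of
      <acronym class="acronym">XML</acronym> per second; the real rate
      would vary with record structure and hardware.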
     </p></div><div class="section" title="3.8. ULS (Union List of Serials)"><div class="titlepage"><div><div><h3 class="title"><a name="uls"></a>3.8. ULS (Union List of Serials)</h3></div></div></div><p>
    The M25 Systems Team
    has created a union catalogue for the periodicals of the
    twenty-one constituent libraries of the University of London and
    the University of Westminster
    (<a class="ulink" href="http://www.m25lib.ac.uk/ULS/" target="_top">http://www.m25lib.ac.uk/ULS/</a>).
    They have achieved this using an
    unusual architecture, which they describe as a
    "non-distributed virtual union catalogue".
   </p><p>
    The member libraries send in data files representing their
    periodicals, including both brief bibliographic data and summary
    holdings.  Then 21 individual <acronym class="acronym">Z39.50</acronym> targets are created, each
    using <span class="application">Zebra</span>, and all hosted on a single hardware server.
    The live service provides a web gateway allowing <acronym class="acronym">Z39.50</acronym> searching
    of all of the targets or a selection of them.  <span class="application">Zebra</span>'s small
    footprint allows a relatively modest system to comfortably host
    the 21 servers.
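    </p><p>
     The gateway's behaviour (one query sent to all targets or to a
     chosen subset, with the hits merged) can be sketched as follows.
     This is a minimal illustration only, not the M25 implementation:
     the function names and sample data are invented, and each target
     is modelled as an in-memory list of titles instead of a real
     <acronym class="acronym">Z39.50</acronym> connection.
    </p>

```python
from concurrent.futures import ThreadPoolExecutor


def search_target(name, query, catalogue):
    """Hypothetical stand-in for one Z39.50 search against one target.

    A real gateway would send the query over Z39.50; here a target is
    just a list of titles so that the sketch runs on its own.
    """
    needle = query.lower()
    return [(name, title) for title in catalogue if needle in title.lower()]


def search_union(query, targets, selected=None):
    """Query all targets, or only the selected ones, and merge the hits."""
    names = sorted(targets) if selected is None else sorted(selected)
    hits = []
    with ThreadPoolExecutor(max_workers=max(1, len(names))) as pool:
        futures = [
            pool.submit(search_target, name, query, targets[name])
            for name in names
        ]
        # Collect results in target-name order so output is deterministic.
        for future in futures:
            hits.extend(future.result())
    return hits


# Invented sample data standing in for two of the member libraries.
targets = {
    "library-a": ["Journal of Documentation", "Nature"],
    "library-b": ["Nature", "Library Trends"],
}

print(search_union("nature", targets))
print(search_union("library", targets, selected=["library-b"]))
```

    <p>
     Because every target lives on the same machine, the fan-out is
     local, which is presumably what makes the "non-distributed"
     design practical.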
   </p><p>
    More information can be found at
    <a class="ulink" href="http://www.m25lib.ac.uk/ULS/" target="_top">http://www.m25lib.ac.uk/ULS/</a>
   </p></div><div class="section" title="3.9. NLI-Z39.50 - a Natural Language Interface for Libraries"><div class="titlepage"><div><div><h3 class="title"><a name="nli"></a>3.9. NLI-<acronym class="acronym">Z39.50</acronym> - a Natural Language Interface for Libraries</h3></div></div></div><p>
    Fernuniversität Hagen in Germany have developed a natural
    language interface for access to library databases.
    
    To evaluate this interface for recall and precision, they
    chose <span class="application">Zebra</span> as the retrieval engine underlying the evaluation.  The <span class="application">Zebra</span>
    server contains a copy of the GIRT database, consisting of more
    than 76,000 records in <acronym class="acronym">SGML</acronym> format (bibliographic records from
    social science), which are mapped to <acronym class="acronym">MARC</acronym> for presentation.
   </p><p>
    (GIRT is the German Indexing and Retrieval Testdatabase.  It is a
    standard German-language test database for intelligent indexing
    and retrieval systems.  See
    <a class="ulink" href="http://www.gesis.org/forschung/informationstechnologie/clef-delos.htm" target="_top">http://www.gesis.org/forschung/informationstechnologie/clef-delos.htm</a>)
   </p><p>
    Evaluation will take place as part of the TREC/CLEF 2003 campaign
    (<a class="ulink" href="http://clef.iei.pi.cnr.it" target="_top">http://clef.iei.pi.cnr.it</a>).
    
   </p><p>
    For more information, contact Johannes Leveling
    <code class="email">&lt;<a class="email" href="mailto:Johannes.Leveling@FernUni-Hagen.De">Johannes.Leveling@FernUni-Hagen.De</a>&gt;</code>
   </p></div><div class="section" title="3.10. Various web indexes"><div class="titlepage"><div><div><h3 class="title"><a name="various-web-indexes"></a>3.10. Various web indexes</h3></div></div></div><p>
    <span class="application">Zebra</span> has been used by a variety of institutions to construct
    indexes of large web sites, typically in the region of tens of
    millions of pages.  In this role, it functions somewhat similarly
    to the engine of Google or AltaVista, but for a selected intranet
    or a subset of the whole Web.
   </p><p>
    For example, Liverpool University's web-search facility (see on
    the home page at
    <a class="ulink" href="http://www.liv.ac.uk/" target="_top">http://www.liv.ac.uk/</a>
    and many sub-pages) works by relevance-searching a <span class="application">Zebra</span> database
    which is populated by the Harvest-NG web-crawling software.
   </p><p>
    For more information on Liverpool University's intranet search
    architecture, contact John Gilbertson
    <code class="email">&lt;<a class="email" href="mailto:jgilbert@liverpool.ac.uk">jgilbert@liverpool.ac.uk</a>&gt;</code>
   </p><p>
    Kang-Jin Lee
    has recently modified the Harvest web indexer to use <span class="application">Zebra</span> as
    its native repository engine.  His comments on the switch over
    from the old engine are revealing:
    </p><div class="blockquote"><blockquote class="blockquote"><p>
      The first results after some testing with <span class="application">Zebra</span> are very
      promising.  The tests were done with around 220,000 SOIF files,
      which occupies 1.6GB of disk space.
     </p><p>
      Building the index from scratch takes around one hour with <span class="application">Zebra</span>
      where [old-engine] needs around five hours.  While [old-engine]
      blocks search requests when updating its index, <span class="application">Zebra</span> can still
      answer search requests.
      [...]
      <span class="application">Zebra</span> supports incremental indexing which will speed up indexing
      even further.
     </p><p>
      While the search time of [old-engine] varies from some seconds
      to some minutes depending how expensive the query is, <span class="application">Zebra</span>
      usually takes around one to three seconds, even for expensive
      queries.
      [...]
      <span class="application">Zebra</span> can search more than 100 times faster than [old-engine]
      and can process multiple search requests simultaneously
     </p><p>
      I am very happy to see such nice software available under GPL.
     </p></blockquote></div><p>
   </p></div></div><div class="navfooter"><hr><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="features.html">Prev</a> </td><td width="20%" align="center"><a accesskey="u" href="introduction.html">Up</a></td><td width="40%" align="right"> <a accesskey="n" href="introduction-support.html">Next</a></td></tr><tr><td width="40%" align="left" valign="top">2. <span class="application">Zebra</span> Features Overview </td><td width="20%" align="center"><a accesskey="h" href="index.html">Home</a></td><td width="40%" align="right" valign="top"> 4. Support</td></tr></table></div></body></html>