<!DOCTYPE html>
<html >
<head>
<meta http-equiv="Content-Type" content="text/html; charset=US-ASCII">
<meta name="generator" content="hevea 2.28">
<style type="text/css">
.li-itemize{margin:1ex 0ex;}
.li-enumerate{margin:1ex 0ex;}
.dd-description{margin:0ex 0ex 1ex 4ex;}
.dt-description{margin:0ex;}
.toc{list-style:none;}
.footnotetext{margin:0ex; padding:0ex;}
div.footnotetext P{margin:0px; text-indent:1em;}
.thefootnotes{text-align:left;margin:0ex;}
.dt-thefootnotes{margin:0em;}
.dd-thefootnotes{margin:0em 0em 0em 2em;}
.footnoterule{margin:1em auto 1em 0px;width:50%;}
.caption{padding-left:2ex; padding-right:2ex; margin-left:auto; margin-right:auto}
.title{margin:2ex auto;text-align:center}
.titlemain{margin:1ex 2ex 2ex 1ex;}
.titlerest{margin:0ex 2ex;}
.center{text-align:center;margin-left:auto;margin-right:auto;}
.flushleft{text-align:left;margin-left:0ex;margin-right:auto;}
.flushright{text-align:right;margin-left:auto;margin-right:0ex;}
div table{margin-left:inherit;margin-right:inherit;margin-bottom:2px;margin-top:2px}
td table{margin:auto;}
table{border-collapse:collapse;}
td{padding:0;}
.cellpadding0 tr td{padding:0;}
.cellpadding1 tr td{padding:1px;}
pre{text-align:left;margin-left:0ex;margin-right:auto;}
blockquote{margin-left:4ex;margin-right:4ex;text-align:left;}
td p{margin:0px;}
.boxed{border:1px solid black}
.textboxed{border:1px solid black}
.vbar{border:none;width:2px;background-color:black;}
.hbar{border:none;height:2px;width:100%;background-color:black;}
.hfill{border:none;height:1px;width:200%;background-color:black;}
.vdisplay{border-collapse:separate;border-spacing:2px;width:auto; empty-cells:show; border:2px solid red;}
.vdcell{white-space:nowrap;padding:0px; border:2px solid green;}
.display{border-collapse:separate;border-spacing:2px;width:auto; border:none;}
.dcell{white-space:nowrap;padding:0px; border:none;}
.dcenter{margin:0ex auto;}
.vdcenter{border:solid #FF8000 2px; margin:0ex auto;}
.minipage{text-align:left; margin-left:0em; margin-right:auto;}
.marginpar{border:solid thin black; width:20%; text-align:left;}
.marginparleft{float:left; margin-left:0ex; margin-right:1ex;}
.marginparright{float:right; margin-left:1ex; margin-right:0ex;}
.theorem{text-align:left;margin:1ex auto 1ex 0ex;}
.part{margin:2ex auto;text-align:center}
</style>
<title>The UUDeview Decoding Library
</title>
</head>
<body >
<!--HEVEA command line is: /usr/bin/hevea -fix -o library.html library.ltx -->
<!--CUT STYLE article--><!--CUT DEF section 1 --><table class="title"><tr><td style="padding:1ex"><h1 class="titlemain">The UUDeview Decoding Library</h1><h3 class="titlerest">Frank Pilhofer</h3></td></tr>
</table><blockquote class="abstract"><span style="font-weight:bold">Abstract: </span>
The UUDeview library is a highly portable set of functions that
provide facilities for decoding <em>uuencoded</em>, <em>xxencoded</em>,
<em>Base64</em> and <em>BinHex</em>-Encoded files as well as for
encoding binary files into all of these representations except
BinHex. This document describes how the features of encoding
and decoding can be integrated into your own applications.<p>The information is intended for developers only, and is not required
reading material for end users. It is assumed that the reader is
familiar with the general issue of encoding and decoding and has some
experience with the “C” programming language.</p><p>This document describes version 0.5, patchlevel 20
of the library.
</p></blockquote>
<!--TOC section id="sec1" Introduction-->
<h2 id="sec1" class="section">1  Introduction</h2><!--SEC END -->
<!--TOC subsection id="sec2" Background-->
<h3 id="sec2" class="subsection">1.1  Background</h3><!--SEC END --><p>The Internet provides us with a fast and reliable means of user-to-user
message delivery, using private email or newsgroups. Both systems have
originally been designed to transport plain-text messages. Over the
years, some methods appeared allowing transport of arbitrary binary
data by “encoding” the data into plain-text messages. But after
these years, there are still certain problems handling the encoded
data, and many recipients have difficulties decoding the messages back
into their original form.</p><p>It should be the job of the mail delivery agent to handle sending and
receiving binary data transparently. However, the support in most
applications is limited, and several incompatibilities exist among
different programs.</p><p>There are three common formats for encoding binary data, called
<em>uuencoding</em>, <em>Base64</em> and <em>BinHex</em>. Issues are further
complicated by slight variations of the formats, the packaging, and
some broken implementations.</p><p>Further problems arise with multi-part postings, where the encoding
of a huge file has been split up into several individual messages to
ensure proper transfer over gateways with limited message sizes. Very
little software is able to properly sort and decode the parts. Even
nowadays, many users are at a loss to decode these kinds of messages.</p><p>This is where the UUDeview Decoding Library steps in.</p>
<!--TOC subsection id="sec3" The Library-->
<h3 id="sec3" class="subsection">1.2  The Library</h3><!--SEC END --><p>The UUDeview library makes an attempt at decoding nearly all
kinds of encoded files. It is supposed to decode multi-part files as
well as many files simultaneously. Part numbers are evaluated, thus
making it possible to re-arrange parts that aren’t in their correct
order.</p><p>No assumptions are made on the format of the input file. Usually the
input will be an email folder or newsgroup messages. If this is the
case, the information found in header lines is evaluated; but plain
encoded files with no surrounding information are also accepted. The
input may also consist of concatenated parts and files.</p><p>Decoding files is done in two passes. During the first pass, all input
files are scanned. Information is gathered about each chunk of encoded
data. Besides the obvious data about type, position and size of the
chunk, some environmental information from the envelope of a mail
message is also gathered if available.</p><p>If the scanner finds a properly MIME-formatted message, a proper MIME
parser steps into action. Because MIME messages include precise
information about the message’s contents, there is seldom doubt about
its parts.</p><p>For other, non-MIME messages, the “Subject” header line is closely
examined. Two pieces of information are extracted: the part number (usually
given in parentheses) and a unique identifier, which is used to group
series of postings. If the subject is, for example, “uudeview.tgz
(01/04)”, the scanner concludes that this message is the first in a
series of four, and the indicated filename is an ideal key to identify
each of the four parts.</p><p>If the subject is incomplete (no part number) or missing, the scanner
tries to make the best of the available information, but some of the
advanced features won’t work. For example, without any information
about the part number, it must be assumed that the available parts are
in correct order and can’t be automatically rearranged.</p><p>All the information is gathered in a linked list. An application can
then examine the nodes of the list and pick individual items for
decoding. The decoding functions will then visit the parts of a file
in correct order and extract the binary data.</p><p>Because of heavy testing of the routines against real-life data
and many problem reports from users, the functions have become very
robust, even against input files with little, missing or broken
information.</p><blockquote class="figure"><div class="center"><div class="center"><hr style="width:80%;height:2"></div>
[Figure: block diagram with the labels “Operating System”, “Application”, “Application Language Interface”, “Application OS Services Interface” and “UUDeview Decoding Library”]
<div class="caption"><table style="border-spacing:6px;border-collapse:separate;" class="cellpading0"><tr><td style="vertical-align:top;text-align:left;" >Figure 1: Integration of the Library</td></tr>
</table></div>
<a id="structure"></a>
<div class="center"><hr style="width:80%;height:2"></div></div></blockquote><p>Figure <a href="#structure">1</a> displays how the library can be integrated into
an application. The library does not assume any capabilities of the
operating system or application language, and can thus be used in
almost any environment. The few necessary interfaces must be provided
by the application, which usually knows a great deal more about
the target system.</p><p>The idea of the “language interface” is to allow integration of the
library services into other programming languages; if the application
is itself written in C, there’s no need for a separate interface, of
course. Such an interface currently exists for the Tcl scripting
language; other examples might be Visual Basic, Perl or Delphi.</p>
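<p>To illustrate the subject-line analysis described earlier in this section, here is a minimal sketch of how a part number and a grouping key could be pulled out of a subject such as “uudeview.tgz (01/04)”. The function <span style="font-family:monospace">parse_subject</span> is a hypothetical helper written for this example; it is not part of the library, and the library's own scanner is considerably more tolerant of variations.</p>

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not a library function): split a subject such as
 * "uudeview.tgz (01/04)" into a grouping key, a part number and a total.
 * Returns 1 on success, 0 if no "(part/total)" pattern is found.
 */
static int parse_subject(const char *subject, char *key, size_t keylen,
                         int *part, int *total)
{
    const char *paren = strrchr(subject, '(');
    size_t n;

    if (paren == NULL || keylen == 0)
        return 0;
    if (sscanf(paren, "(%d/%d)", part, total) != 2)
        return 0;

    /* The key is everything before the parenthesis, minus trailing blanks. */
    n = (size_t) (paren - subject);
    while (n > 0 && subject[n - 1] == ' ')
        n--;
    if (n >= keylen)
        n = keylen - 1;          /* truncate rather than overflow */
    memcpy(key, subject, n);
    key[n] = '\0';
    return 1;
}
```

<p>When the function returns 0, a caller would have to fall back to treating the parts as already being in the correct order, as explained above.</p>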
<!--TOC subsection id="sec4" Terminology-->
<h3 id="sec4" class="subsection">1.3  Terminology</h3><!--SEC END --><p>These are some buzzwords that will be used in the following text.
</p><ul class="itemize"><li class="li-itemize">
“Encoded data” is binary data encoded by one of the methods
“uuencoding”, “xxencoding”, “Base64” or “BinHex”.
</li><li class="li-itemize">“Message” refers to both complete email messages and Usenet news
postings, including the complete headers. The format of a message is
described in [<a href="#rfc0822">RFC0822</a>]. A “message body” is an email message
or news posting without headers.
</li><li class="li-itemize">A “mail folder” is a number of concatenated messages.
</li><li class="li-itemize">“MIME” refers to the standards set in [<a href="#rfc1521">RFC1521</a>].
</li><li class="li-itemize">A “multipart message” is an entity described by the MIME
standard. It is a single message divided into one or more individual
parts by a unique boundary.
</li><li class="li-itemize">A “partial message” is also described by the MIME standard. It is a
message with an associated identifier and a part number. Large
messages can be split into multiple partial messages on the sender’s
side. The recipient’s software groups the partial messages by their
identifier and composes them back into the original large message.
</li><li class="li-itemize">The term “partial message” only refers to <em>one part</em> of the
large message. The original, partialized message is referred to as
“multi-part message” (note the hyphen). To clarify, one part of a
multi-part message is a partial message.
</li></ul>
<!--TOC section id="sec5" Compiling the Library-->
<h2 id="sec5" class="section">2  Compiling the Library</h2><!--SEC END --><p>On Unix systems, configuration and compilation is trivial. The
script <span style="font-family:monospace">configure</span> automatically checks your
system and configures the library appropriately. A subsequent
“make” compiles the modules and builds the final library.</p><p>On other systems, you must manually create the configuration file and
the Makefile. The configuration file <span style="font-family:monospace">config.h</span> contains a set
of preprocessor definitions and macros that describe the available
features on your system.</p>
<!--TOC subsection id="sec6" Creating <span style="font-family:monospace">config.h</span> by hand-->
<h3 id="sec6" class="subsection">2.1  Creating <span style="font-family:monospace">config.h</span> by hand</h3><!--SEC END --><p>You can find all available definitions in <span style="font-family:monospace">config.h.in</span>. This
file undefines all possible definitions; you can create your own
configuration file starting from <span style="font-family:monospace">config.h.in</span> and editing the
necessary differences.</p><p>Most definitions are either present or absent; only a few need to have
a value. If not explicitly mentioned, you can activate a definition
by changing the default <span style="font-family:monospace">undef</span> into <span style="font-family:monospace">define</span>.
The following definitions are available:</p>
<!--TOC subsubsection id="sec7" System Specific-->
<h4 id="sec7" class="subsubsection">2.1.1  System Specific</h4><!--SEC END --><dl class="description"><dt class="dt-description">
<span style="font-weight:bold"><span style="font-family:monospace">SYSTEM_DOS</span></span></dt><dd class="dd-description">
Define for compilation on a <em>DOS</em> system. Currently unused.
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">SYSTEM_QUICKWIN</span></span></dt><dd class="dd-description">
Define for compilation within a <em>QuickWin</em><sup><a id="text1" href="#note1">1</a></sup>
program. Currently unused.
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">SYSTEM_WINDLL</span></span></dt><dd class="dd-description">
Causes all modules to include <span style="font-family:monospace">&lt;windows.h&gt;</span> before any other
include file. Makes <span style="font-family:monospace">uulib.c</span> export a
<span style="font-family:monospace">DllEntryPoint</span> function.
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">SYSTEM_OS2</span></span></dt><dd class="dd-description">
Causes all modules to include <span style="font-family:monospace">&lt;os2.h&gt;</span> before any other
include file.
</dd></dl>
<!--TOC subsubsection id="sec8" Compiler Specific-->
<h4 id="sec8" class="subsubsection">2.1.2  Compiler Specific</h4><!--SEC END --><dl class="description"><dt class="dt-description">
<span style="font-weight:bold"><span style="font-family:monospace">PROTOTYPES</span></span></dt><dd class="dd-description">
Define if your compiler supports function prototypes.
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">UUEXPORT</span></span></dt><dd class="dd-description">
This can be a declaration specifier that is applied to all functions exported
from the decoding library. Frequently needed when compiling into a shared library.
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">TOOLEXPORT</span></span></dt><dd class="dd-description">
Similar to <span style="font-family:monospace">UUEXPORT</span>, but for the helper and
replacement functions in <span style="font-family:monospace">fptools.c</span>.
</dd></dl>
<!--TOC subsubsection id="sec9" Header Files-->
<h4 id="sec9" class="subsubsection">2.1.3  Header Files</h4><!--SEC END --><p>There are a number of options that define whether header files are
available on your system. Don’t worry if some of them are not. If a
header file is present, define “<span style="font-family:monospace">HAVE_</span><em>name-of-header</em>”:
<span style="font-family:monospace">HAVE_ERRNO_H</span>,
<span style="font-family:monospace">HAVE_FCNTL_H</span>,
<span style="font-family:monospace">HAVE_IO_H</span>,
<span style="font-family:monospace">HAVE_MALLOC_H</span>,
<span style="font-family:monospace">HAVE_MEMORY_H</span>,
<span style="font-family:monospace">HAVE_UNISTD_H</span> and
<span style="font-family:monospace">HAVE_SYS_TIME_H</span>
(for <span style="font-family:monospace">&lt;sys/time.h&gt;</span>). Some other include files are needed
as well, but there are no macros for mandatory include files.</p><p>There’s also a number of header-specific definitions that do not fit
into the general present-or-not-present scheme.</p><dl class="description"><dt class="dt-description">
<span style="font-weight:bold"><span style="font-family:monospace">STDC_HEADERS</span></span></dt><dd class="dd-description">
Define if your header files conform to <em>ANSI C</em>. This requires
that <span style="font-family:monospace">stdarg.h</span> is present, that <span style="font-family:monospace">stdlib.h</span> is
available, defining both <span style="font-family:monospace">malloc()</span> and <span style="font-family:monospace">free()</span>, and
that <span style="font-family:monospace">string.h</span> defines the memory functions family
(<span style="font-family:monospace">memcpy()</span> etc).
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">HAVE_STDARG_H</span></span></dt><dd class="dd-description">
Implicitly set by <span style="font-family:monospace">STDC_HEADERS</span>. You only need to define
this one if <span style="font-family:monospace">STDC_HEADERS</span> is not defined but
<span style="font-family:monospace">&lt;stdarg.h&gt;</span> is available.
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">HAVE_VARARGS_H</span></span></dt><dd class="dd-description">
<em>varargs</em> can be used as an alternative to <em>stdarg</em>. Define
if the above two values are undefined and <span style="font-family:monospace">&lt;varargs.h&gt;</span> is
available.
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">TIME_WITH_SYS_TIME</span></span></dt><dd class="dd-description">
Define if <span style="font-family:monospace">HAVE_SYS_TIME_H</span> is defined and both <span style="font-family:monospace">&lt;sys/time.h&gt;</span>
and <span style="font-family:monospace">&lt;time.h&gt;</span> can be included without conflicting definitions.
</dd></dl>
<!--TOC subsubsection id="sec10" Functions-->
<h4 id="sec10" class="subsubsection">2.1.4  Functions</h4><!--SEC END --><dl class="description"><dt class="dt-description">
<span style="font-weight:bold"><span style="font-family:monospace">HAVE_STDIO</span></span></dt><dd class="dd-description">
Define if standard I/O (<span style="font-family:monospace">stdin</span>, <span style="font-family:monospace">stdout</span> and
<span style="font-family:monospace">stderr</span>) is available.
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">HAVE_GETTIMEOFDAY</span></span></dt><dd class="dd-description">
Define if your system provides the <span style="font-family:monospace">gettimeofday()</span> system
call, which is needed to provide microsecond resolution to the
busy callback. If this function is not available, <span style="font-family:monospace">time()</span> is
used.
</dd></dl>
<!--TOC subsubsection id="sec11" Replacement Functions-->
<h4 id="sec11" class="subsubsection">2.1.5  Replacement Functions</h4><!--SEC END --><p>The tools library <span style="font-family:monospace">fptools</span> defines many functions that aren’t
standard on all systems. Most of them do not differ in behavior from
their originals, but might be slightly slower. But since they are
usually only needed in non-speed-critical sections, the replacements
are used throughout the library. For a full listing of the available
replacement functions, see section <a href="#chap-rf">11</a>.</p><p>However, there are two functions, <span style="font-family:monospace">strerror</span> and
<span style="font-family:monospace">tempnam</span>, that aren’t fully implemented. The replacement
<span style="font-family:monospace">strerror</span> does not have a table of error messages and only
produces the error number as string, and the “fake”
<span style="font-family:monospace">tempnam</span> does not necessarily use a proper temp directory.</p><p>Because some functionality is missing, the replacement functions should
<em>only</em> be used if the original is not available.
</p><dl class="description"><dt class="dt-description">
<span style="font-weight:bold"><span style="font-family:monospace">strerror</span></span></dt><dd class="dd-description">
If your system does not provide a <span style="font-family:monospace">strerror</span> function of its
own, define to <span style="font-family:monospace">_FP_strerror</span>. This causes the replacement
function to be used throughout the library.
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">tempnam</span></span></dt><dd class="dd-description">
If your system does not provide a <span style="font-family:monospace">tempnam</span> function of its
own, define to <span style="font-family:monospace">_FP_tempnam</span>. This causes the replacement
function to be used throughout the library. Must not be defined if the
function is in fact available.
</dd></dl>
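<p>Putting the definitions of this section together, a hand-written <span style="font-family:monospace">config.h</span> might look roughly like the following sketch. It assumes a hypothetical Unix-like system with an ANSI C compiler and native <span style="font-family:monospace">strerror</span> and <span style="font-family:monospace">tempnam</span> functions; the selection of headers, and leaving the two export macros empty, are assumptions made for illustration and must be adapted to the actual target.</p>

```c
/* config.h: hand-written sketch for a hypothetical ANSI C, Unix-like system */

/* Compiler specific */
#define PROTOTYPES            /* compiler supports function prototypes */
#define UUEXPORT              /* left empty: assumed adequate for a static build */
#define TOOLEXPORT            /* same, for the helpers in fptools.c */

/* Header files */
#define STDC_HEADERS          /* implicitly covers HAVE_STDARG_H */
#define HAVE_ERRNO_H
#define HAVE_FCNTL_H
#define HAVE_UNISTD_H
#define HAVE_SYS_TIME_H
#define TIME_WITH_SYS_TIME    /* <sys/time.h> and <time.h> do not conflict */

/* Functions */
#define HAVE_STDIO
#define HAVE_GETTIMEOFDAY

/*
 * Replacement functions: leave these alone when the system provides the
 * originals. Only on a system that lacks them:
 *
 *   #define strerror _FP_strerror
 *   #define tempnam  _FP_tempnam
 */
```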
<!--TOC subsection id="sec12" Creating the <span style="font-family:monospace">Makefile</span> by hand-->
<h3 id="sec12" class="subsection">2.2  Creating the <span style="font-family:monospace">Makefile</span> by hand</h3><!--SEC END --><p>The <span style="font-family:monospace">Makefile</span> is automatically generated by the configuration
script from the template in <span style="font-family:monospace">Makefile.in</span>. This section
explains how the template must be edited into a proper Makefile.</p><p>Just copy <span style="font-family:monospace">Makefile.in</span> to <span style="font-family:monospace">Makefile</span> and edit the
place-holders for the following values.
</p><dl class="description"><dt class="dt-description">
<span style="font-weight:bold"><span style="font-family:monospace">CC</span></span></dt><dd class="dd-description">
Your system’s “C” compiler.
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">CFLAGS</span></span></dt><dd class="dd-description">
The compilation flags to be passed to the compiler. This must include
“-I.” so that the include files from the local directory are found,
and “-DHAVE_CONFIG_H” to declare that a configuration file
is present.
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">RANLIB</span></span></dt><dd class="dd-description">
Set to “ranlib” if such a program is available on your system, or to
“:” (colon) otherwise.
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">VERSION</span></span></dt><dd class="dd-description">
A string holding the release number of the library, currently
“0.5”.
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">PATCH</span></span></dt><dd class="dd-description">
A string holding the patchlevel, currently “20”.
</dd></dl><p>Some systems do not know Makefiles but offer the concept of a
“project”.<sup><a id="text2" href="#note2">2</a></sup> In
this case, create a new project targeting a library and add all
source files to the project. Then, make sure that the include path
includes the current directory. Add options to the compiler command
so that the symbol “HAVE_CONFIG_H” gets defined.
Additionally, the symbol “VERSION” must be defined as a
string holding the release number, currently “0.5”, and
“PATCH” must be defined as a string holding the patch level,
currently “20”.</p><p>On 16-bit systems, the package should be compiled using the “Large”
memory model, so that more than just 64k data space is available.</p>
<!--TOC subsection id="sec13" Compiling your Projects-->
<h3 id="sec13" class="subsection">2.3  Compiling your Projects</h3><!--SEC END --><p>Compiling the parts of your project that use the functions from the
decoding library is pretty straightforward:
</p><ul class="itemize"><li class="li-itemize">
All modules that call library functions must include the
<span style="font-family:monospace">&lt;uudeview.h&gt;</span> header file.
</li><li class="li-itemize">Optionally, if you want to use the replacement functions to make
your own application more portable, they may also include
<span style="font-family:monospace">&lt;fptools.h&gt;</span>.
</li><li class="li-itemize">If your compiler understands function prototypes, define
the symbol <span style="font-family:monospace">PROTOTYPES</span>. This causes the library functions to
be declared with a full parameter list.
</li><li class="li-itemize">Modify the include file search path so that the compiler finds
the include files (usually with the “-I” option).
</li><li class="li-itemize">Link with the <span style="font-family:monospace">libuu.a</span> library, usually using the
“-luu” option.
</li><li class="li-itemize">Make sure the library is found (usually with the “-L” option).
</li></ul>
<!--TOC section id="sec14" Callback Functions-->
<h2 id="sec14" class="section">3  Callback Functions</h2><!--SEC END -->
<!--TOC subsection id="sec15" Intro-->
<h3 id="sec15" class="subsection">3.1  Intro</h3><!--SEC END --><p>At some points, the decoding library offers to call your custom
procedures to do jobs you want to take care of yourself. Some examples
are the “Message Callback” to print a message or the “Busy
Callback”, which is frequently called during lengthy processing
of data to indicate the progress. You can hook up your functions by
calling some library function with a pointer to your function as a
parameter.</p><p>In some cases, you will want one of your functions to receive
certain data as a parameter. One way to achieve this would be
through global data; another possibility is provided through the
passing of an opaque data pointer.</p><p>All callback functions are declared to take an additional parameter of
type <span style="font-family:monospace">void*</span>. When hooking up one of your callbacks, you can
specify a value that will be passed whenever your function is
called. Since this pointer is never touched by the library, it can be
any kind of data, usually some composed structure. One use
for the Message Callback might be a <span style="font-family:monospace">FILE*</span> pointer to log the
messages to.</p><p>For portability reasons, you should declare your callbacks with the
first parameter actually being a <span style="font-family:monospace">void*</span> pointer and only cast
this pointer to its real type within the function body. This prevents
compiler warnings about the callback setup.</p>
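<p>The opaque-pointer convention can be sketched as follows. Everything in this example is hypothetical: <span style="font-family:monospace">set_callback</span> and <span style="font-family:monospace">fire_callback</span> merely stand in for a library that stores a function pointer together with the opaque value and later calls it, and <span style="font-family:monospace">struct log_state</span> is an invented piece of application data. The point is only that the callback is declared with a <span style="font-family:monospace">void*</span> first parameter and casts it inside the function body.</p>

```c
#include <stdio.h>

/* Hypothetical application data handed to the callback. */
struct log_state {
    FILE *out;     /* where the text goes */
    int   count;   /* how many texts were logged */
};

/* Callback type used by the stand-in below: opaque pointer first. */
typedef void (*text_callback)(void *opaque, char *text);

/* Stand-in for a library's hook-up function; not the UUDeview API. */
static void *saved_opaque;
static text_callback saved_callback;

static void set_callback(void *opaque, text_callback func)
{
    saved_opaque = opaque;      /* stored, but never inspected */
    saved_callback = func;
}

/* Stand-in for the library invoking the registered callback. */
static void fire_callback(char *text)
{
    if (saved_callback != NULL)
        saved_callback(saved_opaque, text);
}

/* The callback itself: declared with void*, cast to the real type inside. */
static void log_text(void *opaque, char *text)
{
    struct log_state *state = (struct log_state *) opaque;

    fprintf(state->out, "%s\n", text);
    state->count++;
}
```

<p>An application would fill in a <span style="font-family:monospace">struct log_state</span>, register it once with <span style="font-family:monospace">set_callback(&amp;state, log_text)</span>, and from then on every invocation receives the same pointer without any global variables being involved.</p>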
<!--TOC subsection id="sec16" Message Callback-->
<h3 id="sec16" class="subsection">3.2  Message Callback</h3><!--SEC END --><p>
<a id="Section-Msg-Callback"></a></p><p>For portability reasons, the library does not assume the availability
of a terminal, so it does not initially know where to print messages
to. The library generates some messages about its progress as well
as more serious warnings and errors. An application should provide a
message callback that displays them. The function might also choose to
ignore informative messages and only display the fatal ones.</p><p>A Message Callback takes three parameters. The first one is the opaque
data pointer of type <span style="font-family:monospace">void*</span>. The second one is a text message
of more or less arbitrary length without line breaks. The last
parameter is an indicator of the seriousness of this message. A string
representation of the warning level is also prefixed to the message.
</p><dl class="description"><dt class="dt-description">
<span style="font-weight:bold"><span style="font-family:monospace">UUMSG_MESSAGE</span></span></dt><dd class="dd-description">
This is just a plain informative message, nothing important. The
application can choose to simply ignore the message. If a log file
is available, it should be logged, but the message should never result
in a modal dialogue.
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">UUMSG_NOTE</span></span></dt><dd class="dd-description"> “Note:”
Still an informative message, meaning that the library made a decision
on its own that might interest the user. One example for a note is
that the setuid bit has been stripped from a file mode for security
reasons. Notes are nothing serious and may still be ignored.
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">UUMSG_WARNING</span></span></dt><dd class="dd-description"> “Warning:”
A warning indicates that a non-serious problem occurred which did not
stop the library from proceeding with the current action. One example
is a temporary file that could not be removed. Warnings should be
displayed, but an application may decide to continue even without user
intervention.
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">UUMSG_ERROR</span></span></dt><dd class="dd-description"> “ERROR:”
A problem occurred that caused termination of the current request, for
example if the library tried to access a non-existing file. After an
error has occurred, the application should closely examine the
resulting return code of the operation. Error messages are usually
printed in modal dialogues; another option is to save the error
message string somewhere and later print the error message after the
application has examined the operation’s return value.
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">UUMSG_FATAL</span></span></dt><dd class="dd-description"> “Fatal Error:”
This would indicate that a serious problem has occurred that prevents
the library from processing any more requests. Currently unused.
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">UUMSG_PANIC</span></span></dt><dd class="dd-description"> “Panic:”
Such a message would indicate a panic condition, meaning the
application should terminate without further clean-up handling.
Unused so far.<sup><a id="text3" href="#note3">3</a></sup>
</dd></dl>
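<p>A Message Callback following this description might look like the sketch below. It takes the three parameters listed above, ignores plain messages and notes, and writes everything else to a stream passed through the opaque pointer. The <span style="font-family:monospace">void</span> return type, the <span style="font-family:monospace">struct msg_sink</span> type and the numeric values of the constants are assumptions made so that the example compiles on its own; a real program takes the constants from the library's header file.</p>

```c
#include <stdio.h>

/* Placeholder values so the sketch compiles on its own; a real program
   takes these constants from <uudeview.h> instead. */
#ifndef UUMSG_MESSAGE
#define UUMSG_MESSAGE 0
#define UUMSG_NOTE    1
#define UUMSG_WARNING 2
#define UUMSG_ERROR   3
#define UUMSG_FATAL   4
#define UUMSG_PANIC   5
#endif

/* Hypothetical application data passed as the opaque pointer. */
struct msg_sink {
    FILE *log;      /* destination for displayed messages */
    int   shown;    /* messages written to the log */
    int   ignored;  /* informative messages that were dropped */
};

static void message_callback(void *opaque, char *message, int level)
{
    struct msg_sink *sink = (struct msg_sink *) opaque;

    switch (level) {
    case UUMSG_MESSAGE:
    case UUMSG_NOTE:
        /* Informative only: this sketch chooses to ignore them. */
        sink->ignored++;
        break;
    default:
        /* Warnings and worse: the text has no line break, so add one. */
        fprintf(sink->log, "%s\n", message);
        sink->shown++;
        break;
    }
}
```

<p>Using a <span style="font-family:monospace">switch</span> on the named constants avoids relying on their numeric order. An interactive application could replace the <span style="font-family:monospace">fprintf</span> call in the default branch with a dialogue for errors.</p>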
<!--TOC subsection id="sec17" Busy Callback-->
<h3 id="sec17" class="subsection">3.3  Busy Callback</h3><!--SEC END --><p>
<a id="Section-Busy-Callback"></a></p><p>Some library functions, like scanning of an input file or decoding an
output file, can take quite some time. An application will usually
want to inform the user of the progress. A custom “Busy Callback”
can be provided to take care of this job. This function will then be
called frequently while a large action is being executed within the
library. It is not called when the application itself has control.</p><p>Apart from the usual opaque data pointer, the Busy Callback receives a
structure of type <span style="font-family:monospace">uuprogress</span> with the following members:
</p><dl class="description"><dt class="dt-description">
<span style="font-weight:bold"><span style="font-family:monospace">action</span></span></dt><dd class="dd-description">
What the library is currently doing. One of the following integer
constants:
<dl class="description"><dt class="dt-description">
<span style="font-weight:bold"><span style="font-family:monospace">UUACT_IDLE</span></span></dt><dd class="dd-description">
The library is idle. This value shouldn’t be seen in the Busy
Callback, because the Busy Callback is never called in an idle state.
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">UUACT_SCANNING</span></span></dt><dd class="dd-description"> Scanning an input file.
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">UUACT_DECODING</span></span></dt><dd class="dd-description"> Decoding a file.
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">UUACT_COPYING</span></span></dt><dd class="dd-description"> Copying a file.
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">UUACT_ENCODING</span></span></dt><dd class="dd-description"> Encoding a file.
</dd></dl>
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">curfile</span></span></dt><dd class="dd-description">
The name of the file we’re working on. May include the full
path. Guaranteed to be 256 characters or shorter.
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">partno</span></span></dt><dd class="dd-description">
When decoding a file, this is the current part number we’re working
on. May be zero.
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">numparts</span></span></dt><dd class="dd-description">
The maximum part number of this file. Guaranteed to be positive
(non-zero).
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">percent</span></span></dt><dd class="dd-description">
The percentage of the current <em>part</em> already processed. For parts
numbered from 1, the total percentage can be calculated as
(100*(<span style="font-style:italic">partno</span>−1)+<span style="font-style:italic">percent</span>)/<span style="font-style:italic">numparts</span>.
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">fsize</span></span></dt><dd class="dd-description">
The size of the current file. The percent information is only valid if
this field is <em>positive</em>. Whenever the size of a file cannot be
properly determined, this field is set to -1; in this case, the
percent field may hold garbage.
</dd></dl><p>In some cases, it is possible that the percent counter jumps
backwards. This happens seldom enough not to worry about it, but the
callback should take care not to crash in this case.<sup><a id="text4" href="#note4">4</a></sup></p><p>The Busy Callback is declared to return an integer value. If a
<em>non-zero</em> value is returned, the current operation from
which the callback was called is canceled, which then aborts with
a return value of <span style="font-family:monospace">UURET_CANCEL</span> (see later).</p>
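<p>A minimal sketch of how a Busy Callback might turn these fields into an
overall percentage. The structure <span style="font-family:monospace">progress_info</span> is a
hypothetical stand-in that mirrors only the members documented above (real
code would use <span style="font-family:monospace">uuprogress</span> from the library header), and
the function names and the <span style="font-family:monospace">cancel_requested</span> flag are
illustrative. The sketch assumes parts are numbered from 1, treats a part
number of zero as the first part, and clamps out-of-range values instead of
trusting them.</p>

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the library's uuprogress structure; it mirrors
 * only the members documented above. Real code would include the library
 * header instead of declaring its own type. */
struct progress_info {
    int  action;
    char curfile[257];
    int  partno;
    int  numparts;
    int  percent;
    long fsize;
};

/* Returns the overall percentage (0..100), or -1 if it cannot be computed
 * because the file size is unknown and percent may hold garbage. */
static int overall_percent(const struct progress_info *p)
{
    int part, pct, total;

    if (p == NULL || p->fsize <= 0 || p->numparts <= 0)
        return -1;

    /* Clamp everything so that odd values never lead to a crash. */
    part = p->partno < 1 ? 1 : p->partno;
    if (part > p->numparts)
        part = p->numparts;
    pct = p->percent;
    if (pct < 0)
        pct = 0;
    if (pct > 100)
        pct = 100;

    /* Completed parts plus the processed share of the current part. */
    total = (100 * (part - 1) + pct) / p->numparts;
    assert(total >= 0 && total <= 100);
    return total;
}

static int cancel_requested = 0;   /* assumed to be set elsewhere, e.g. by a UI */

/* Sketch of the callback itself: report progress, then tell the library
 * whether to go on (zero) or to cancel (non-zero). */
static int busy_callback(void *opaque, struct progress_info *p)
{
    int total = overall_percent(p);

    (void)opaque;
    if (total >= 0)
        fprintf(stderr, "\r%s: %3d%%", p->curfile, total);
    return cancel_requested;
}
```

<p>Because the value is recomputed on every call, a percent counter that
jumps backwards only makes the displayed number drop for a moment.</p>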
<!--TOC subsection id="sec18" File Callback-->
<h3 id="sec18" class="subsection">3.4  File Callback</h3><!--SEC END --><p>
<a id="Section-File-Callback"></a></p><p>Input files are usually needed twice, first for scanning and then for
decoding. If the input files are downloaded from a remote server,
perhaps by <em>NNTP</em>, they would have to be stored on the local disk
and await further handling. However, the user may choose not to decode
some files after all.</p><p>If disk space is important, it is possible to install a “File
Callback”. When scanning a file, it is assigned an “Id”. After
scanning has completed, the application can delete the input file. If
it should be required later on for decoding, the File Callback is
called to map the Id back to a filename, possibly retrieving
another copy and disposing of it afterwards.</p><p>The File Callback receives four parameters. The first is the opaque
data pointer, and the second is the Id that was assigned to the file
while scanning. The third is a <span style="font-family:monospace">char*</span> pointer to an area
that receives a filename, and the fourth is an integer flag. If the
flag is non-zero, the function is supposed to retrieve the file in
question, store it on local disk, and write the resulting filename
into the area to which the third parameter points. A flag of zero
indicates that the decoder is done handling the file, so that the
function can decide whether or not to remove the file.</p><p>The function must return <span style="font-family:monospace">UURET_OK</span> upon success, or any other
appropriate error code upon failure.</p><p>Since it can usually be assumed that disk space is plentiful,
and storing a file is “cheaper” than retrieving it twice, this
mechanism has not been used so far.</p>
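<p>The following is only a sketch of what such a callback could look like.
Several things are assumptions rather than facts from this manual: the Id is
taken to be a string that is safe to use as a filename component, the
filename area is taken to hold 256 bytes, the opaque pointer is used to carry
a spool directory name, and <span style="font-family:monospace">RET_OK</span> and
<span style="font-family:monospace">RET_IOERR</span> are placeholders for the library’s return codes.
The helper <span style="font-family:monospace">fetch_again</span> is hypothetical and merely writes a
placeholder file where a real application would download the article again.</p>

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Stand-ins for the library's UURET_OK and UURET_IOERR; the values are
 * placeholders, real code would use the constants from the library header. */
enum { RET_OK = 0, RET_IOERR = 1 };

#define FNAME_MAX 256   /* assumed size of the filename area */

/* Builds "<spooldir>/<id>.tmp" in buf; returns 0 on success, -1 if too long. */
static int spool_path(char *buf, const char *spooldir, const char *id)
{
    int n = snprintf(buf, FNAME_MAX, "%s/%s.tmp", spooldir, id);
    return (n < 0 || n >= FNAME_MAX) ? -1 : 0;
}

/* Hypothetical retrieval step: a real application would download the
 * article again; this sketch only writes a placeholder file. */
static int fetch_again(const char *id, const char *path)
{
    FILE *fp = fopen(path, "w");

    if (fp == NULL)
        return -1;
    fprintf(fp, "placeholder for %s\n", id);
    return fclose(fp) == 0 ? 0 : -1;
}

static int file_callback(void *opaque, char *id, char *fname, int retrieve)
{
    const char *spooldir = opaque;
    char path[FNAME_MAX];

    assert(spooldir != NULL);      /* the application registers the directory */
    if (id == NULL || spool_path(path, spooldir, id) != 0)
        return RET_IOERR;

    if (retrieve) {
        /* Non-zero flag: fetch the file and report where it was stored. */
        if (fname == NULL || fetch_again(id, path) != 0)
            return RET_IOERR;
        strcpy(fname, path);       /* fits: both areas hold FNAME_MAX bytes */
        return RET_OK;
    }

    /* Zero flag: the decoder is done, so the local copy can be removed. */
    return remove(path) == 0 ? RET_OK : RET_IOERR;
}
```

<p>The sketch recomputes the path from the Id on disposal instead of relying
on the contents of the filename area, since the text above does not say what
that area holds when the flag is zero.</p>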
<!--TOC subsection id="sec19" Filename Filter-->
<h3 id="sec19" class="subsection">3.5  Filename Filter</h3><!--SEC END --><p>
<a id="Section-FName-Filter"></a></p><p>For portability reasons, the library does not make any assumptions about
the legality of certain filenames. It will pick up a “garbage” file
name from the encoded file and happily use it if not told
otherwise. For example, on DOS systems many filenames must be
truncated in order to be valid.</p><p>If a “Filename Filter” is installed, the library will pass each
potential filename to the filter and then use the filename that the
filter function returns. The filter also has to remove all directory
information from the filename – the library itself does not know
about directories at all.</p><p>The filter function receives the potential filename as string and must
return a pointer to a string with the corrected filename. It may
either return a pointer to some position in the original string or a
pointer to some static area, but it should not modify the source
string.</p><p>Two examples of filename filters can be found in the UUDeview
distribution in <span style="font-family:monospace">uufnflt.c</span>. The DOS filter function discards
directory information, uses only the first 8 characters of the base
filename and the first three characters after the last ’.’ (since a
filename might have two extensions). Also, space characters are
replaced by underscores. The Unix filter just returns a pointer to the
filename part of the name (without directory information).</p><p>The “garbage” filename mentioned above was just for the sake of
argument. It is generally safe to assume that the input filename is
not too weird; after all, it is a filename valid on <em>some</em>
system. Still, the user should always be granted the possibility of
renaming a file before decoding it, to allow decoding of files with
insane filenames.</p>
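<p>The two filters described above can be sketched as follows. This is not the
code from <span style="font-family:monospace">uufnflt.c</span> but an independent illustration of the
same rules; the function names are made up, and the parameter list (an opaque
pointer followed by the filename) is an assumption. The Unix variant returns
a pointer into the original string, while the DOS variant builds its result
in a static area, so neither modifies the source string.</p>

```c
#include <assert.h>
#include <string.h>

/* Unix-style filter: skip everything up to the last '/' and return a
 * pointer to the remaining part of the original string. */
static char *unix_filter(void *opaque, char *fname)
{
    char *slash;

    (void)opaque;
    if (fname == NULL)
        return NULL;
    slash = strrchr(fname, '/');
    return slash ? slash + 1 : fname;
}

/* DOS-style filter: drop directory information, keep at most 8 characters
 * of the base name and at most 3 characters after the last '.', and
 * replace spaces by underscores. The result lives in a static area. */
static char *dos_filter(void *opaque, char *fname)
{
    static char result[8 + 1 + 3 + 1];
    const char *base, *dot, *p;
    size_t len = 0, i;

    (void)opaque;
    if (fname == NULL)
        return NULL;

    /* The base name starts after the last '/', '\' or drive colon. */
    base = fname;
    for (p = fname; *p != '\0'; p++)
        if (*p == '/' || *p == '\\' || *p == ':')
            base = p + 1;

    /* Only the last dot counts, since a name may have two extensions. */
    dot = strrchr(base, '.');
    for (p = base; *p != '\0' && p != dot && len < 8; p++)
        result[len++] = *p;
    if (dot != NULL && dot[1] != '\0') {
        result[len++] = '.';
        for (i = 1; i <= 3 && dot[i] != '\0'; i++)
            result[len++] = dot[i];
    }
    assert(len < sizeof result);
    result[len] = '\0';

    for (i = 0; i < len; i++)
        if (result[i] == ' ')
            result[i] = '_';
    return result;
}
```

<p>Since the DOS variant returns a static area, its result has to be copied
before the filter is called again.</p>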
<!--TOC section id="sec20" The File List-->
<h2 id="sec20" class="section">4  The File List</h2><!--SEC END --><p>
<a id="file-list"></a></p><p>While scanning the input files, a linked list is built. Each node is
of type <span style="font-family:monospace">uulist</span> and describes one file, possibly composed of
several parts. This section describes the members of the structure
that may be of interest to an application.</p><dl class="description"><dt class="dt-description">
<span style="font-weight:bold"><span style="font-family:monospace">state</span></span></dt><dd class="dd-description">
Describes the state of this file. Either the value
<span style="font-family:monospace">UUFILE_READ</span><sup><a id="text5" href="#note5">5</a></sup> or a
bitfield of the following values:
<dl class="description"><dt class="dt-description">
<span style="font-weight:bold"><span style="font-family:monospace">UUFILE_MISPART</span></span></dt><dd class="dd-description">
The file is missing at least one part. This bit is set if the part
numbers are non-sequential. Usually results in incorrect decoding.
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">UUFILE_NOBEGIN</span></span></dt><dd class="dd-description">
No “begin” line was detected. Since <em>Base64</em>
files do not have begin lines, this bit is never set on them.
For <em>BinHex</em> files, the initial colon is used.
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">UUFILE_NOEND</span></span></dt><dd class="dd-description">
No “end” line was detected. Since <em>Base64</em>
files do not have end lines, this bit is never set on them. A missing
end on <em>uuencoded</em> or <em>xxencoded</em> files usually means that
the file is incomplete. For <em>BinHex</em>, the trailing colon is
used as end marker.
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">UUFILE_NODATA</span></span></dt><dd class="dd-description">
No encoded data was found within these parts.
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">UUFILE_OK</span></span></dt><dd class="dd-description">
This file appears to be okay, and decoding is likely to be successful.
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">UUFILE_ERROR</span></span></dt><dd class="dd-description">
A decode operation was attempted, but failed, usually because of an
I/O error.
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">UUFILE_DECODED</span></span></dt><dd class="dd-description">
This file has already been successfully decoded.
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">UUFILE_TMPFILE</span></span></dt><dd class="dd-description">
The file has been decoded into a temporary file, which can be found
using the <span style="font-family:monospace">binfile</span> member (see below). This flag gets removed
if the temporary file is deleted.
</dd></dl>
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">mode</span></span></dt><dd class="dd-description">
For <em>uuencoded</em> and <em>xxencoded</em> files, this is the file mode
found on the “begin” line; <em>Base64</em> and <em>BinHex</em> files
receive a default of 0644. A decode operation will try to restore this
mode.
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">uudet</span></span></dt><dd class="dd-description">
The type of encoding this file uses. May be 0 if
<span style="font-family:monospace">UUFILE_NODATA</span> is set, or one of the following
values:
<dl class="description"><dt class="dt-description">
<span style="font-weight:bold"><span style="font-family:monospace">UU_ENCODED</span></span></dt><dd class="dd-description"> for <em>uuencoded</em> data,
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">B64ENCODED</span></span></dt><dd class="dd-description"> for <em>Base64</em> encoded data,
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">XX_ENCODED</span></span></dt><dd class="dd-description"> for <em>xxencoded</em> data,
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">BH_ENCODED</span></span></dt><dd class="dd-description"> for <em>BinHex</em> data,
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">PT_ENCODED</span></span></dt><dd class="dd-description"> for plain-text “data”, or
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">QT_ENCODED</span></span></dt><dd class="dd-description"> for MIME <em>quoted-printable</em> encoded
text.
</dd></dl>
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">size</span></span></dt><dd class="dd-description">
The approximate size of the resulting file. It is an estimated value
and can be a few percent off the final value, hence the suggestion to
display the size in kilobytes only.
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">filename</span></span></dt><dd class="dd-description">
The filename. For <em>uuencoded</em> and <em>xxencoded</em> files, it is
extracted from the “begin” line. The name of <em>BinHex</em> files
is encoded in the first data bytes. <em>Base64</em> files have the
filename given in the “Content-Type” header. This field may be
<span style="font-family:monospace">NULL</span> if <span style="font-family:monospace">state!=UUFILE_OK</span>.
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">subfname</span></span></dt><dd class="dd-description">
A unique identifier for this group of parts, usually derived from the
“Subject” header of each part. It is possible that two
nodes with the same identifier exist in the file list: If a group of
files is considered “complete”, a new node is opened up for more
parts with the same Id.
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">mimeid</span></span></dt><dd class="dd-description">
Stores the “id” field from the “Content-Type” information if
available. Actually, this Id is the first choice for grouping of
files, but not surprisingly, non-MIME mails or articles do not have
this information.
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">mimetype</span></span></dt><dd class="dd-description">
Stores this part’s “Content-Type” if available.
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">binfile</span></span></dt><dd class="dd-description">
After decoding, this is the name of the temporary file the data was
decoded to and stored in. This value is non-NULL if the flag
<span style="font-family:monospace">UUFILE_TMPFILE</span> is set in the state member above.
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">haveparts</span></span></dt><dd class="dd-description">
The part numbers found for this group of files as a zero-terminated
ordered integer array. Some extra care must be taken, because a file
may have a zeroth part as its first part. Thus if
<span style="font-family:monospace">haveparts[0]</span> is zero, it indicates a zeroth part, and the
list of parts continues. A file may have at most one zeroth part, so
if both <span style="font-family:monospace">haveparts[0]</span> and <span style="font-family:monospace">haveparts[1]</span> are zero, the
zeroth part is the only part of this file.<p>No more than 256 parts are listed here.
</p></dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">misparts</span></span></dt><dd class="dd-description">
Similar to <span style="font-family:monospace">haveparts</span>; a zero-terminated ordered integer array
of missing parts, or simply <span style="font-family:monospace">NULL</span> if no parts are
missing. Since we don’t mind if a file doesn’t have a zeroth part,
this array does not have the above problems.
</dd></dl>
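<p>The special treatment of a zeroth part in <span style="font-family:monospace">haveparts</span> is
easy to get wrong, so here is a small sketch of how the two arrays might be
walked. The function names are illustrative, and the arrays are passed in
directly rather than taken from a <span style="font-family:monospace">uulist</span> node. The sketch
assumes that <span style="font-family:monospace">haveparts</span> always describes at least one part,
as implied by the description above.</p>

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LISTED_PARTS 256   /* upper bound stated in the description */

/* Counts the parts in a haveparts-style array. A zero in the very first
 * position is a genuine zeroth part; any later zero ends the list. */
static size_t count_parts(const int *haveparts)
{
    size_t n = 0;

    if (haveparts == NULL)
        return 0;
    if (haveparts[0] == 0) {       /* zeroth part present, list continues */
        n = 1;
        haveparts++;
    }
    while (*haveparts != 0 && n < MAX_LISTED_PARTS) {
        n++;
        haveparts++;
    }
    assert(n <= MAX_LISTED_PARTS);
    return n;
}

/* Counts the entries of a misparts-style array, which is either NULL or a
 * plain zero-terminated list without the zeroth-part special case. */
static size_t count_missing(const int *misparts)
{
    size_t n = 0;

    if (misparts == NULL)
        return 0;
    while (*misparts != 0) {
        n++;
        misparts++;
    }
    return n;
}
```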
<!--TOC section id="sec21" Return Values-->
<h2 id="sec21" class="section">5  Return Values</h2><!--SEC END --><p>Most of the library functions return a value indicating success or the
type of error that occurred. The following values can be returned:</p><dl class="description"><dt class="dt-description">
<span style="font-weight:bold"><span style="font-family:monospace">UURET_OK</span></span></dt><dd class="dd-description">
The action completed successfully.
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">UURET_IOERR</span></span></dt><dd class="dd-description">
An I/O error occurred. There may be many reasons from “File not
found” to “Disk full”. This return code indicates that the
application should consult <span style="font-family:monospace">errno</span> for more information.
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">UURET_NOMEM</span></span></dt><dd class="dd-description">
A <span style="font-family:monospace">malloc()</span> operation returned <span style="font-family:monospace">NULL</span>, indicating that
memory resources are exhausted. Never seen this one in a VM system.
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">UURET_ILLVAL</span></span></dt><dd class="dd-description">
You tried to call some operation with invalid parameters.
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">UURET_NODATA</span></span></dt><dd class="dd-description">
An attempt was made to decode a file, but no encoded data was found
within its parts. Also returned if decoding a <em>uuencoded</em> or
<em>xxencoded</em> file with missing “begin” line.
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">UURET_NOEND</span></span></dt><dd class="dd-description">
A decoding operation was attempted, but the decoded data didn’t have a
proper “end” line. A similar condition can also be detected for
<em>BinHex</em> files (where the colon is used as end marker).
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">UURET_UNSUP</span></span></dt><dd class="dd-description">
You tried to encode using an unsupported communications channel, for
example piping to a command on a system without pipes.
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">UURET_EXISTS</span></span></dt><dd class="dd-description">
The target file already exists (upon decoding), and overwriting
existing files has not been allowed.
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">UURET_CONT</span></span></dt><dd class="dd-description">
This is a special return code, indicating that the current operation
must be continued. This return value is used only by two encoding
functions, so see the documentation there.
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">UURET_CANCEL</span></span></dt><dd class="dd-description">
The current operation was canceled, meaning that the Busy Callback
returned a non-zero value usually because of user request. The library
does not produce this return value on its own, so if your Busy
Callback always returns zero, there’s no need to handle this
“Error”.
</dd></dl>
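<p>An application will typically want to separate real failures from the two
special codes and produce a readable message. The sketch below shows one way
to do that. The <span style="font-family:monospace">RET_</span> constants are stand-ins with
arbitrary values so that the example is self-contained; real code would use
the <span style="font-family:monospace">UURET_</span> constants from the library header. The function
names and message texts are illustrative.</p>

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Stand-ins for the UURET_ constants listed above. The numeric values are
 * arbitrary and need not match the library's. */
enum ret_code {
    RET_OK, RET_IOERR, RET_NOMEM, RET_ILLVAL, RET_NODATA,
    RET_NOEND, RET_UNSUP, RET_EXISTS, RET_CONT, RET_CANCEL
};

/* A code is a real failure unless it reports success, asks the caller to
 * continue, or results from a cancel request by the Busy Callback. */
static int is_failure(int rc)
{
    return rc != RET_OK && rc != RET_CONT && rc != RET_CANCEL;
}

/* Maps a return code to a short message. For I/O errors the saved errno
 * value supplies the detail. */
static const char *explain(int rc, int saved_errno)
{
    assert(saved_errno >= 0);      /* errno values are never negative */

    switch (rc) {
    case RET_OK:     return "success";
    case RET_IOERR:  return strerror(saved_errno);
    case RET_NOMEM:  return "out of memory";
    case RET_ILLVAL: return "invalid parameter";
    case RET_NODATA: return "no encoded data found";
    case RET_NOEND:  return "no end marker found";
    case RET_UNSUP:  return "unsupported communications channel";
    case RET_EXISTS: return "target file already exists";
    case RET_CONT:   return "operation must be continued";
    case RET_CANCEL: return "canceled by the Busy Callback";
    default:         return "unknown return code";
    }
}
```

<p>A <span style="font-family:monospace">switch</span> is used instead of an array lookup so that the
sketch does not depend on the constants being small consecutive numbers.</p>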
<!--TOC section id="sec22" Options-->
<h2 id="sec22" class="section">6  Options</h2><!--SEC END --><p>
<a id="Section-Options"></a>
An application program can set and query a number of options. Some of
them are read-only, but others can modify the behavior quite
drastically. Some of them are intended to be set by the end user via
an options menu.</p><dl class="description"><dt class="dt-description">
<span style="font-weight:bold"><span style="font-family:monospace">UUOPT_VERSION</span></span></dt><dd class="dd-description"> <span style="font-size:small">(string, read-only)</span> <br>
Retrieves the full version number of the library, composed as
<em>MAJOR</em>.<em>MINOR</em>pl<em>PATCH</em>
(the major and minor version
numbers and the patchlevel are integers).</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">UUOPT_FAST</span></span></dt><dd class="dd-description"> <span style="font-size:small">(integer, default=0)</span> <br>
If set to 1, the library will assume that each input file consists of
exactly one email message or newsgroup posting. After finding encoded
data within a file, the scanner will not continue to look for more
data below. This strategy can save a lot of time, but has the drawback
that files also cannot be checked for completeness – since the
scanner does not look for “end” lines, we don’t notice them missing.<p>This flag does not have any effect on MIME multipart messages, which
are always scanned to the end (alas, the Epilogue will be skipped).
Actually, with this flag set, the scanner becomes more MIME-compliant.</p></dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">UUOPT_DUMBNESS</span></span></dt><dd class="dd-description"> <span style="font-size:small">(integer, default=0)</span> <br>
As already mentioned, the library evaluates
information found in the part’s “Subject” header line if
available. The heuristics here are versatile, but cannot be guaranteed
to be completely failure-proof. If false information is derived, the
parts will be ordered and grouped incorrectly, resulting in incorrect decoding.<p>If the “dumbness” is set to 1, the code to derive a part number is
disabled; it will then be assumed that all parts within a group appear
in correct order: the first one is assigned number 1 etc. However,
part numbers found in MIME-headers are still used (I haven’t yet found
a file where these were wrong).</p><p>A dumbness of 2 also switches off the code to select a unique
identifier from the subject line. This does still work with
single-part files<sup><a id="text6" href="#note6">6</a></sup> and <em>might</em> work with multi-part files, as long as
they’re in correct order and not mixed. The filename is found on
the first part and then passed on to the following parts.</p><p>This option only takes effect for files scanned afterwards.</p></dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">UUOPT_BRACKPOL</span></span></dt><dd class="dd-description"> <span style="font-size:small">(integer, default=0)</span> <br>
Series of multi-part postings on the Usenet usually have subject lines
like “You must see this! [1/3] (2/4)”. How to parse this
information? Is this the second part of four in a series of three
postings, or is it the first of three parts and the second in a series
of four postings? The library cannot know, and simply gives numbers in
() parentheses precedence over numbers in [] brackets. If this
assumption fails, the parts will be grouped and ordered completely
wrong.<p>Setting the “bracket policy” to 1 changes this precedence.
If now both parentheses and brackets are present, the
numbers within brackets will be evaluated first.</p><p>This option only takes effect for files scanned afterwards.</p></dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">UUOPT_VERBOSE</span></span></dt><dd class="dd-description"> <span style="font-size:small">(integer, default=1)</span> <br>
If set to 0, the Message Callback will not be bothered with messages
of level
<span style="font-family:monospace">UUMSG_MESSAGE</span> or
<span style="font-family:monospace">UUMSG_NOTE</span>.
The default is to generate these messages.</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">UUOPT_DESPERATE</span></span></dt><dd class="dd-description"> <span style="font-size:small">(integer, default=0)</span> <br>
By default, the library refuses to decode incomplete files and
generates errors. But if switched into “desperate mode” these kinds
of errors are ignored, and all <em>available</em> data is decoded.
The usefulness of the resulting corrupt file depends on the type of
the file.</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">UUOPT_IGNREPLY</span></span></dt><dd class="dd-description"> <span style="font-size:small">(integer, default=0)</span> <br>
If set to 1, the library will ignore email messages and news postings
which were sent as “Reply”, since they are less likely to feature
useful data. There’s no real reason to turn on this option any more
(earlier versions got easily confused by replies).</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">UUOPT_OVERWRITE</span></span></dt><dd class="dd-description"> <span style="font-size:small">(integer, default=1)</span> <br>
When the decoder finds that the target file already exists, it is
simply overwritten silently by default. If this option is set to 0,
the decoder fails instead, generating a
<span style="font-family:monospace">UURET_EXISTS</span> error.</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">UUOPT_SAVEPATH</span></span></dt><dd class="dd-description"> <span style="font-size:small">(string, default=(empty))</span> <br>
Without setting this option, files are decoded to the current
directory. This “save path” is handled as prefix to each
filename. Because the library does not know about directory layouts,
the resulting filename is simply the concatenation of the save path
and the target file, meaning that the path must include a final
directory separator (slash, backslash, or whatever).</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">UUOPT_IGNMODE</span></span></dt><dd class="dd-description"> <span style="font-size:small">(integer, default=0)</span> <br>
Usually, the decoder tries to restore the file mode found on the
“begin” line of <em>uuencoded</em> and <em>xxencoded</em> files. This is
turned off if this option is set to 1.</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">UUOPT_DEBUG</span></span></dt><dd class="dd-description"> <span style="font-size:small">(integer, default=0)</span> <br>
If set to 1, all messages will be prefixed with the exact source code
location (filename and line number) where they were created. Might be
useful if this is not clear from context.</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">UUOPT_ERRNO</span></span></dt><dd class="dd-description"> <span style="font-size:small">(integer, read-only)</span> <br>
This “option” can be queried after an operation failed with
<span style="font-family:monospace">UURET_IOERR</span> and returns the
<span style="font-family:monospace">errno</span> value that originally caused the problem. The “real”
value of this variable might already be obscured by secondary
problems.</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">UUOPT_PROGRESS</span></span></dt><dd class="dd-description"> <span style="font-size:small">(uuprogress, read-only)</span> <br>
Returns the progress structure. This would only make sense in
multi-threaded environments where the decoder runs in one thread and
is controlled from another. Although some care is taken while updating
the structure’s values, there might still be synchronization problems.</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">UUOPT_USETEXT</span></span></dt><dd class="dd-description"> <span style="font-size:small">(integer, default=0)</span> <br>
If this flag is true, plain text files will be presented for
“decoding”. This includes non-decodeable messages as well as
plain-text parts from MIME multipart messages. Since they usually
don’t have associated filenames, a unique name will be created from a
sequential four-digit number.</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">UUOPT_PREAMB</span></span></dt><dd class="dd-description"> <span style="font-size:small">(integer, default=0)</span> <br>
Whether to use the plain-text preamble and epilogue from MIME
multipart messages. The standard specifies that they’re supposed to
be ignored, so there’s no real reason to set this option.</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">UUOPT_TINYB64</span></span></dt><dd class="dd-description"> <span style="font-size:small">(integer, default=0)</span> <br>
Support for tiny Base64 data.
If set to 0, the scanner does not recognize stand-alone Base64
encoded data with fewer than 3 lines. The problem is that in some
cases plain text might be misinterpreted as Base64 data, since,
for example, any four-character alphanumerical string like “Argh”
appearing on a line of its own is valid Base64 data. Since encoded
files are usually longer, and there is considerable confusion about
erroneous Base64 detection, this option is off by default. There’s
probably no need to present this option separately to the user. It’s
reasonable to associate it with the “desperate mode” described
above.<p>Note that this option only affects <em>stand-alone</em> data. Input
from MIME messages with the encoding type correctly specified in
the “Content-Transfer-Encoding” header is always evaluated.</p><p>There is also no problem with encoding types other than Base64,
since they have an explicit notion of the beginning and end of a
file, and no danger of misinterpretation exists.</p></dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">UUOPT_ENCEXT</span></span></dt><dd class="dd-description"> <span style="font-size:small">(string, default=(empty))</span> <br>
When encoding into a file on the local disk, the target files
usually receive an extension composed of the three-digit part
number. This may be considered inappropriate for single-part files.
If this option is set, its value is attached to the base file name as
extension for the target file. A dot ‘.’ is inserted automatically.
When using uuencoding, a sensible value might be “uue”.<p>This option does not alter the behaviour on multi-part files, where
the individual parts always receive the three-digit part number as
extension.</p></dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">UUOPT_REMOVE</span></span></dt><dd class="dd-description"> <span style="font-size:small">(integer, default=0)</span> <br>
If true, input files are deleted if data was successfully decoded from
them. Be careful with this option, as the library does not care if the
file contains any other useful information besides the decoded
data. It also does not and cannot check the integrity of the
decoded file. Therefore, if in doubt about the incoming data, you should
do a confidence check first and then delete the relevant input files
afterwards. But then, this option was requested by many users.</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">UUOPT_MOREMIME</span></span></dt><dd class="dd-description"> <span style="font-size:small">(integer, default=0)</span> <br>
Makes the library behave in a more MIME-compliant way. Normally, some liberties
are taken with seemingly MIME files in order to find encoded data
within them, therefore also finding files within broken MIME
messages. If this option is set to 1, the library is more strict in
its handling of MIME files, and will for example not allow Base64
data outside of properly tagged subparts, and will not accept
“random” encoded data.<p>You can also set the value of this option to 2 to enforce strict MIME
adherence. If the option is 1, the library will still look into plain
text attachments and try to find encoded data within them. This causes
for example uuencoded files that were then sent in a MIME envelope to
be recognized. With an option value of 2, the library won’t even do
that, trusting all MIME header information.
</p></dd></dl>
<!--TOC section id="sec23" General Functions-->
<h2 id="sec23" class="section">7  General Functions</h2><!--SEC END --><p>After describing all the framework in the previous chapters, it is
time to mention some function calls. Still, the functions presented
here don’t actually <em>do</em> anything; they just query and modify the
behavior of the core functions.</p><dl class="description"><dt class="dt-description">
<span style="font-weight:bold"><span style="font-family:monospace">int UUInitialize (void)</span></span></dt><dd class="dd-description"> <br>
This function initializes the library and must be called before any
other decoding or encoding function. During initialization, several
arrays are allocated. If memory is exhausted,
<span style="font-family:monospace">UURET_NOMEM</span> is returned, otherwise the initialization
will return successfully with <span style="font-family:monospace">UURET_OK</span>.
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">int UUCleanUp (void)</span></span></dt><dd class="dd-description"> <br>
Cleans up all resources that have been allocated during a program run:
memory structures, temporary files and everything. No library function
may be called afterwards, with the exception of <span style="font-family:monospace">UUInitialize</span>
to start another run.
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">int UUGetOption (int opt, int *ival, char *cval, int len)</span></span></dt><dd class="dd-description"> <br>
Retrieves the configuration option <span style="font-family:monospace">opt</span> (see section <a href="#Section-Options">6</a>).
If the option is an integer, it is stored in <span style="font-family:monospace">ival</span> (only if
<span style="font-family:monospace">ival!=NULL</span>) and also returned as the return value. String options
are copied to <span style="font-family:monospace">cval</span>. Including the final null byte, at most
<span style="font-family:monospace">len</span> characters are written to <span style="font-family:monospace">cval</span>. If the progress
information is queried with
<span style="font-family:monospace">UUOPT_PROGRESS</span>, <span style="font-family:monospace">cval</span> must
point to a <span style="font-family:monospace">uuprogress</span> structure and <span style="font-family:monospace">len</span> must equal
<span style="font-family:monospace">sizeof(uuprogress)</span>.<p>For integer options, <span style="font-family:monospace">cval</span> may be NULL and <span style="font-family:monospace">len</span> 0 and
vice versa: for string options, <span style="font-family:monospace">ival</span> is not evaluated.
</p></dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">int UUSetOption (int opt, int ival, char *cval)</span></span></dt><dd class="dd-description"> <br>
Sets one of the configuration options. Integer options are set via
<span style="font-family:monospace">ival</span> (<span style="font-family:monospace">cval</span> may be <span style="font-family:monospace">NULL</span>), and string options
are copied from the null-terminated string <span style="font-family:monospace">cval</span>
(<span style="font-family:monospace">ival</span> may be 0). Returns
<span style="font-family:monospace">UURET_ILLVAL</span> if you try to set a
read-only value, or <span style="font-family:monospace">UURET_OK</span> otherwise.
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">char *UUstrerror (int errcode)</span></span></dt><dd class="dd-description"> <br>
Maps the return values <span style="font-family:monospace">UURET_*</span> into error messages:
<dl class="description"><dt class="dt-description">
<span style="font-weight:bold"><span style="font-family:monospace">UURET_OK</span></span></dt><dd class="dd-description"> “OK”
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">UURET_IOERR</span></span></dt><dd class="dd-description"> “File I/O Error”
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">UURET_NOMEM</span></span></dt><dd class="dd-description"> “Not Enough Memory”
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">UURET_ILLVAL</span></span></dt><dd class="dd-description"> “Illegal Value”
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">UURET_NODATA</span></span></dt><dd class="dd-description"> “No Data found”
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">UURET_NOEND</span></span></dt><dd class="dd-description"> “Unexpected End of File”
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">UURET_UNSUP</span></span></dt><dd class="dd-description"> “Unsupported function”
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">UURET_EXISTS</span></span></dt><dd class="dd-description"> “File exists”
</dd></dl>
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">int UUSetMsgCallback (void *opaque, void (*func) ())</span></span></dt><dd class="dd-description"> <br>
Sets the Message Callback function to <span style="font-family:monospace">func</span> (see section
<a href="#Section-Msg-Callback">3.2</a>). <span style="font-family:monospace">opaque</span> is the opaque data
pointer that is passed untouched to the callback whenever it is
called. To prevent compiler warnings, a prototype of the callback
should appear before this call. Always returns
<span style="font-family:monospace">UURET_OK</span>. If <span style="font-family:monospace">func==NULL</span>, the callback is
disabled.
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">int UUSetBusyCallback (void *, void (*func) (), long msecs)</span></span></dt><dd class="dd-description"> <br>
Sets the Busy Callback function to <span style="font-family:monospace">func</span> (see section
<a href="#Section-Busy-Callback">3.3</a>). <span style="font-family:monospace">msecs</span> gives a timespan in
milliseconds; the library will try to call the callback after this
timespan has passed. On some systems, the time can only be queried
with second resolution – in that case, timing will be quite
inaccurate. The semantics for the other two parameters are the same as
in the previous function. If <span style="font-family:monospace">func==NULL</span>, the busy callback is
disabled.
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">int UUSetFileCallback (void *opaque, int (*func) ())</span></span></dt><dd class="dd-description"> <br>
Sets the File Callback function to <span style="font-family:monospace">func</span> (see section
<a href="#Section-File-Callback">3.4</a>). Semantics identical to the previous
two functions. There is no need to install a file callback if this
feature isn’t used.
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">int UUSetFNameFilter (void *opaque, char * (*func) ())</span></span></dt><dd class="dd-description"> <br>
Sets the Filename Filter function to <span style="font-family:monospace">func</span> (see section
<a href="#Section-FName-Filter">3.5</a>). Semantics identical to the previous
three functions. If no filename filter is installed, any filename is
accepted. This may result in failures to write a file because of an
invalid filename.
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">char * UUFNameFilter (char *fname)</span></span></dt><dd class="dd-description"> <br>
Calls the current filename filter on <span style="font-family:monospace">fname</span>. This function is
provided so that certain parts of applications do not need to know
which filter is currently installed. This is handy for applications
that are supposed to run on more than one system. If no filename
filter is installed, the string itself is returned. Since a filename
filter may return a pointer to static memory or a pointer into the
parameter, the result from this function must not be written to.
</dd></dl>
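<p>As an illustration of the note on <span style="font-family:monospace">UUFNameFilter</span> above, here is a minimal sketch of a filename filter that returns a pointer into its parameter. The name <span style="font-family:monospace">strip_dir_filter</span> and the two-argument signature (opaque pointer, filename) are assumptions made for this example; section <a href="#Section-FName-Filter">3.5</a> describes the actual callback interface.</p>

```c
/*
 * Hypothetical filename filter: keeps only the part of fname that
 * follows the last '/' or '\\'. The (void *, char *) signature is an
 * assumption of this sketch. The returned pointer aliases fname, so
 * the caller must treat the result as read-only.
 */
static char *strip_dir_filter(void *opaque, char *fname)
{
    char *base = fname;
    char *p;

    (void)opaque;               /* this filter needs no private state */
    if (fname == 0)
        return fname;           /* nothing to filter */

    for (p = fname; *p != '\0'; p++) {
        if (*p == '/' || *p == '\\')
            base = p + 1;       /* start of the last path component */
    }
    return base;
}
```

<p>Because the result points into the input string, an application that needs a modifiable name should copy it first. A function of this kind could be passed as <span style="font-family:monospace">func</span> to <span style="font-family:monospace">UUSetFNameFilter</span>, provided its signature matches what the library expects.</p>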
<!--TOC section id="sec24" Decoding Functions-->
<h2 id="sec24" class="section">8  Decoding Functions</h2><!--SEC END --><p>Now for the more useful functions. The functions within this section
are everything you need to scan and decode files.</p><dl class="description"><dt class="dt-description">
<span style="font-weight:bold"><span style="font-family:monospace">int UULoadFile (char *fname, char *id, int delflag)</span></span></dt><dd class="dd-description"> <br>
Scans a file for encoded data and inserts the result into the file
list. Each input file must be scanned only once; it may contain many
parts as well as multiple encoded files, thus it is possible that many
decodable files are found after scanning one input file. On the other
hand it is also possible that <em>no</em> decodable data is
found. There is no limit to the number of files.<sup><a id="text7" href="#note7">7</a></sup><p>If <span style="font-family:monospace">id</span> is non-NULL, its value is used instead of the filename,
and the file callback is used to map the id back into a filename
whenever this input file is needed again. If <span style="font-family:monospace">id</span> <em>is</em>
<span style="font-family:monospace">NULL</span>, then the input file must not be deleted or modified
until <span style="font-family:monospace">UUCleanUp</span> has been called.</p><p>If <span style="font-family:monospace">delflag</span> is non-zero, the input file will automatically be
removed within <span style="font-family:monospace">UUCleanUp</span>. This is useful when the decoder’s
input files are themselves temporary files – this way, the application can forget
about them right after they’re “loaded”. The value of
<span style="font-family:monospace">delflag</span> is ignored, however, if <span style="font-family:monospace">id</span> is non-NULL;
combining both options does not make sense.</p><p>The behavior of this function is influenced by some of the options,
most notably <span style="font-family:monospace">UUOPT_FAST</span>. The two most
probable return values are <span style="font-family:monospace">UURET_OK</span>, indicating
successful completion, or <span style="font-family:monospace">UURET_IOERR</span> in case of some
error while reading the file. The other return values are less likely
to appear.</p><p>Note that files are scheduled for destruction even if an error
<em>did</em> happen during scanning (with the exception of a file that
could not be opened). But error handling is slightly problematic here
anyway, since it might be possible that useful data was found before
the error occurred.</p></dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">int UULoadFileWithPartNo (char *fname, char *id, int delflag, int partno)</span></span></dt><dd class="dd-description"> <br>
Same as above, but assigns a part number to the data in the file. This
function can be used if the caller is certain of the part number and
there is thus no need to depend on UUDeview’s heuristics. However, it
must not be used if the referenced file may contain more than one
piece of encoded data.</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">uulist * UUGetFileListItem (int num)</span></span></dt><dd class="dd-description"> <br>
Returns a pointer to the <span style="font-family:monospace">num</span>th item of the file list. The
elements of this structure are described in section <a href="#file-list">4</a>.
The list is zero-based. If <span style="font-family:monospace">num</span> is out-of-range, <span style="font-family:monospace">NULL</span>
is returned. Usage of this function is pretty straightforward: loop
with an increasing value until <span style="font-family:monospace">NULL</span> is returned. The
structure must not be modified by the application itself. Also, none
of the structure’s values should be “cached” elsewhere, as they are
not constant: they may change after each loaded file.</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">int UURenameFile (uulist *item, char *newname)</span></span></dt><dd class="dd-description"> <br>
Renames one item of the file list. The old name is discarded and
replaced by <span style="font-family:monospace">newname</span>. The new name is copied and may thus
point to volatile memory. The name should be a local filename without
any directory information, which would be stripped by the filename
filter anyway.</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">int UUDecodeToTemp (uulist *item)</span></span></dt><dd class="dd-description"> <br>
Decodes the given item of the file list and places the decoded output
into a temporary file. This is intended to allow “previews” of an
encoded file without copying it to its final location (which would
probably overwrite other files). The name of the temporary file can be
retrieved afterwards by re-retrieving the node of the file list and
looking at its <span style="font-family:monospace">binfile</span> member.<p><span style="font-family:monospace">UURET_OK</span> is returned upon successful completion. Most
other error codes can occur, too. <span style="font-family:monospace">UURET_NODATA</span> is
returned if you try to decode parts without encoded data or with a
missing beginning (<em>uuencoded</em> and <em>xxencoded</em> files only)
– of course, this condition would also have been obvious from the
<span style="font-family:monospace">state</span> value of the file list structure.</p><p>The setting of <span style="font-family:monospace">UUOPT_DESPERATE</span> changes the behavior if
an unexpected end of file was found (usually meaning that one or more
parts are missing). Normally, the partially-written target file is
removed and the value <span style="font-family:monospace">UURET_NOEND</span> is returned. In
desperate mode, the same error code is returned, but the target file
is not removed.</p><p>The target file is removed in all other error conditions.</p></dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">int UURemoveTemp (uulist *item)</span></span></dt><dd class="dd-description"> <br>
After a file has been decoded into a temporary file and is needed no
longer, this function can be called to free the disk space immediately
instead of having to wait until <span style="font-family:monospace">UUCleanUp</span>. If a decode
operation is called for later on, the file will simply be recreated.</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">int UUDecodeFile (uulist *item, char *target)</span></span></dt><dd class="dd-description"> <br>
This is the function you have been waiting for. The file is decoded
and copied to its final location. Calling <span style="font-family:monospace">UUDecodeToTemp</span>
beforehand is not required. If <span style="font-family:monospace">target</span> is non-NULL, then it is
immediately used as filename for the target file (without prepending
the save path and without passing it through the filename
filter). Otherwise, if <span style="font-family:monospace">target==NULL</span>, the final filename is
composed by concatenating the save path and the result of the filename
filter used upon the filename found in the encoded file.<p>If the target file already exists, the value of the
<span style="font-family:monospace">UUOPT_OVERWRITE</span> option is checked. If it is false
(zero), then the error <span style="font-family:monospace">UURET_EXISTS</span> is generated and
decoding fails. If the option is true, the target file is silently
overwritten.<sup><a id="text8" href="#note8">8</a></sup></p><p>The file is first decoded into a temporary file, then the temporary
file is copied to the final location. This is done to prevent
overwriting target files with data that turns out too late to be
invalid.</p></dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">int UUInfoFile (uulist *item, void *opaque, int (*func) ())</span></span></dt><dd class="dd-description"> <br>
This function can be used to query information about the encoded
file. This is either the zeroth part of a file if available, or the
beginning of the first part up to the encoded data otherwise. Once
again, a callback function is used to do the job. <span style="font-family:monospace">func</span> must
be a function with two parameters. The first one is an opaque data
pointer (the value of <span style="font-family:monospace">opaque</span>), the other is one line of info
about the file (at maximum, 512 bytes). The callback is called for
each line of info.<p>The callback can return either zero, meaning that it can accept more
data, or non-zero, which immediately stops retrieval of more
information.</p><p>Usually, the opaque pointer holds some information about a text
window, so that the callback knows where to print the next line. In
a terminal-oriented application, the user can be queried every 25th
line and the callback can return non-zero if the user doesn’t wish to
continue.</p></dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">int UUSmerge (int pass)</span></span></dt><dd class="dd-description"> <br>
Attempts a “Smart Merge” of parts that seem to belong to different
files but which <em>could</em> belong to the same. Occasionally, you
will find a posting with parts 1 to 3 and 5 to 8 of “picture.gif”
and part 4 of “picure.gif” (note the spelling). To a human, it is
obvious that these parts belong together; to a machine, it is
not. This function attempts to detect these conditions and merge the
appropriate parts together. This function must be called repeatedly
with increasing values for “pass”: With <span style="font-family:monospace">pass==0</span>, only
immediate fits are merged, increasing values allow greater
“distances” between part numbers.<p>This function is a bunch of heuristics, and I don’t really trust
them. In some cases, the “smart” merge may do more harm than
good. This function should only be called as a last resort on explicit
user request. The first call should be made with <span style="font-family:monospace">pass==0</span>,
then with <span style="font-family:monospace">pass==1</span> and finally with <span style="font-family:monospace">pass==99</span>.
</p></dd></dl>
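<p>The callback contract of <span style="font-family:monospace">UUInfoFile</span> described above can be sketched as follows. This is only an illustration: the <span style="font-family:monospace">pager</span> structure and the name <span style="font-family:monospace">info_line</span> are hypothetical, the type of the second parameter is assumed to be a character pointer, and instead of querying the user the callback simply stops once a fixed number of lines has been delivered.</p>

```c
/* Hypothetical state handed to the callback through the opaque pointer. */
struct pager {
    int shown;      /* number of info lines delivered so far */
    int page_len;   /* stop once this many lines have been delivered */
};

/*
 * Sketch of an info callback: called once per line of information.
 * Returns zero to accept more lines and non-zero to stop retrieval.
 */
static int info_line(void *opaque, char *line)
{
    struct pager *pg = (struct pager *)opaque;

    (void)line;     /* a real callback would display the line here */
    pg->shown++;

    /* Non-zero as soon as one full page has been delivered. */
    return pg->shown >= pg->page_len;
}
```

<p>An application would pass a pointer to a <span style="font-family:monospace">pager</span> structure as <span style="font-family:monospace">opaque</span> and <span style="font-family:monospace">info_line</span> as <span style="font-family:monospace">func</span>. With <span style="font-family:monospace">page_len</span> set to 25, retrieval would end after the 25th line; an interactive program would ask the user at that point instead of stopping unconditionally.</p>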
<!--TOC section id="sec25" Encoding Functions-->
<h2 id="sec25" class="section">9  Encoding Functions</h2><!--SEC END --><p>There are a couple of functions to encode data into a file. You will
usually need no more than one of them, depending on the job you want
to do. The functions also differ in the headers they generate. Some
functions do generate full MIME-compliant headers. This may sound like
the best choice, but it’s not always the wisest choice. Please observe
the following guidelines.</p><ul class="itemize"><li class="li-itemize">
Do not produce MIME-compliant messages if you cannot guarantee their
proper handling. For example, if you create a MIME-compliant message
on disk, and the user <em>includes</em> this file in a text message, the
headers produced for the encoded data do not become part of the final
message’s header but are just included in the message body. The
resulting message will <em>not</em> be MIME-compliant!
</li><li class="li-itemize">Take it from the author that slightly-different-than-MIME messages
give the recipient much worse headaches than messages that do not try
to be MIME in the first place.
</li><li class="li-itemize">Because of that, headers should <em>only</em> be generated if the
application itself handles the final mailing or posting of the
message. Do not rely on user actions.
</li><li class="li-itemize">Do not encode to <em>Base64</em> outside of MIME messages. Because some
information like the filename is only available in the MIME-message
framework, <em>Base64</em> doesn’t make much sense without it.
</li><li class="li-itemize">However, if you can guarantee proper MIME handling, <em>Base64</em>
should be favored above the other types of encoding. Most
MIME-compliant applications do not know the other encoding types.
</li></ul><p>All of the functions have a bunch of parameters for greater
flexibility. Don’t be confused by their number; usually you’ll need to
fill in only a few of them. There are a number of common parameters, which
can be explained separately:</p><dl class="description"><dt class="dt-description">
<span style="font-weight:bold"><span style="font-family:monospace">FILE *outfile</span></span></dt><dd class="dd-description"> <br>
The output stream, where the encoded data is written to.
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">FILE *infile, char *infname</span></span></dt><dd class="dd-description"> <br>
Where the input data shall be read from. Only one of the two values needs to
be specified; the other can be NULL.
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">char *outfname</span></span></dt><dd class="dd-description"> <br>
The name by which the recipient will receive the file. It is used on
the “begin” line for <em>uuencoded</em> and <em>xxencoded</em> data, and
in the headers of MIME-formatted messages. If this parameter is NULL,
it defaults to <span style="font-family:monospace">infname</span>. It must be specified if data is read
from a stream and <span style="font-family:monospace">infname==NULL</span>.
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">int filemode</span></span></dt><dd class="dd-description"> <br>
For <em>uuencoded</em> and <em>xxencoded</em> data, the file permissions
are encoded into the “begin” line. This mode can be specified
here. If the value is 0, it will be determined by performing a
<span style="font-family:monospace">stat()</span> call on the input file. If this call should fail, a
value of 0644 is used as default.
</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">int encoding</span></span></dt><dd class="dd-description"> <br>
The encoding to use. One of the three constants <span style="font-family:monospace">UU_ENCODED</span>,
<span style="font-family:monospace">XX_ENCODED</span> or <span style="font-family:monospace">B64ENCODED</span>.
</dd></dl><p>Now for the functions …</p><dl class="description"><dt class="dt-description">
<table style="border-spacing:6px;border-collapse:separate;" class="cellpading0"><tr><td style="text-align:left;white-space:nowrap" ><span style="font-weight:bold"><span style="font-family:monospace">int UUEncodeMulti</span></span></td><td style="text-align:left;white-space:nowrap" ><span style="font-weight:bold"><span style="font-family:monospace">(FILE *outfile, FILE *infile, </span></span></td></tr>
<tr><td style="text-align:left;white-space:nowrap" ><span style="font-weight:bold"><span style="font-family:monospace"> </span></span></td><td style="text-align:left;white-space:nowrap" ><span style="font-weight:bold"><span style="font-family:monospace"> char *infname, int encoding, </span></span></td></tr>
<tr><td style="text-align:left;white-space:nowrap" ><span style="font-weight:bold"><span style="font-family:monospace"> </span></span></td><td style="text-align:left;white-space:nowrap" ><span style="font-weight:bold"><span style="font-family:monospace"> char *outfname, char *mimetype, </span></span></td></tr>
<tr><td style="text-align:left;white-space:nowrap" ><span style="font-weight:bold"><span style="font-family:monospace"> </span></span></td><td style="text-align:left;white-space:nowrap" ><span style="font-weight:bold"><span style="font-family:monospace"> int filemode) </span></span></td></tr>
</table></dt><dd class="dd-description"> <br>
Encodes data into a subpart of a MIME “multipart” message.
Appropriate “Content-Type” headers are produced, followed by
the encoded data. The application must provide the envelope and
boundary lines. If <span style="font-family:monospace">mimetype!=NULL</span>, it is used as value
for the “Content-Type” field, otherwise, the extension from
<span style="font-family:monospace">outfname</span> or <span style="font-family:monospace">infname</span> (if <span style="font-family:monospace">outfname==NULL</span>)
is used to look up the relevant type name.</dd><dt class="dt-description"><table style="border-spacing:6px;border-collapse:separate;" class="cellpading0"><tr><td style="text-align:left;white-space:nowrap" ><span style="font-weight:bold"><span style="font-family:monospace">int UUEncodePartial</span></span></td><td style="text-align:left;white-space:nowrap" ><span style="font-weight:bold"><span style="font-family:monospace">(FILE *outfile, FILE *infile, </span></span></td></tr>
<tr><td style="text-align:left;white-space:nowrap" ><span style="font-weight:bold"><span style="font-family:monospace"> </span></span></td><td style="text-align:left;white-space:nowrap" ><span style="font-weight:bold"><span style="font-family:monospace"> char *infname, int encoding, </span></span></td></tr>
<tr><td style="text-align:left;white-space:nowrap" ><span style="font-weight:bold"><span style="font-family:monospace"> </span></span></td><td style="text-align:left;white-space:nowrap" ><span style="font-weight:bold"><span style="font-family:monospace"> char *outfname, char *mimetype, </span></span></td></tr>
<tr><td style="text-align:left;white-space:nowrap" ><span style="font-weight:bold"><span style="font-family:monospace"> </span></span></td><td style="text-align:left;white-space:nowrap" ><span style="font-weight:bold"><span style="font-family:monospace"> int filemode, int partno, </span></span></td></tr>
<tr><td style="text-align:left;white-space:nowrap" ><span style="font-weight:bold"><span style="font-family:monospace"> </span></span></td><td style="text-align:left;white-space:nowrap" ><span style="font-weight:bold"><span style="font-family:monospace"> long linperfile) </span></span></td></tr>
</table></dt><dd class="dd-description"> <br>
Encodes data as the body of a MIME “message/partial” message. This
type allows message fragmentation. This function must be called
repeatedly until it runs out of input data. The application must
provide a valid envelope with a “message/partial” content type and
proper information about the part numbers.<p>Each call produces <span style="font-family:monospace">linperfile</span> lines of encoded output. For
<em>uuencoded</em> and <em>xxencoded</em> files, each output line encodes
45 bytes of input data, each <em>Base64</em> line encodes 57 bytes.
If <span style="font-family:monospace">linperfile==0</span>, this function is equivalent to
<span style="font-family:monospace">UUEncodeMulti</span>.</p><p>Different handling is necessary when reading from an input stream
(if <span style="font-family:monospace">infile!=NULL</span>) compared to reading from a file
(if <span style="font-family:monospace">infname!=NULL</span>). In the first case, the function must be
called until <span style="font-family:monospace">feof()</span> becomes true on the input file, or an
error occurs. In the second case, the file will be opened
internally. Instead of <span style="font-family:monospace">UURET_OK</span>, a value of
<span style="font-family:monospace">UURET_CONT</span> is returned for all but the last part.</p></dd><dt class="dt-description"><table style="border-spacing:6px;border-collapse:separate;" class="cellpading0"><tr><td style="text-align:left;white-space:nowrap" ><span style="font-weight:bold"><span style="font-family:monospace">int UUEncodeToStream</span></span></td><td style="text-align:left;white-space:nowrap" ><span style="font-weight:bold"><span style="font-family:monospace">(FILE *outfile, FILE *infile, </span></span></td></tr>
<tr><td style="text-align:left;white-space:nowrap" ><span style="font-weight:bold"><span style="font-family:monospace"> </span></span></td><td style="text-align:left;white-space:nowrap" ><span style="font-weight:bold"><span style="font-family:monospace"> char *infname, int encoding, </span></span></td></tr>
<tr><td style="text-align:left;white-space:nowrap" ><span style="font-weight:bold"><span style="font-family:monospace"> </span></span></td><td style="text-align:left;white-space:nowrap" ><span style="font-weight:bold"><span style="font-family:monospace"> char *outfname, int filemode) </span></span></td></tr>
</table></dt><dd class="dd-description"> <br>
Encodes the input data and sends the plain output without any
headers to the output stream. Be aware that for <em>Base64</em>, the
output does not include any information about the filename.</dd><dt class="dt-description"><table style="border-spacing:6px;border-collapse:separate;" class="cellpading0"><tr><td style="text-align:left;white-space:nowrap" ><span style="font-weight:bold"><span style="font-family:monospace">int UUEncodeToFile</span></span></td><td style="text-align:left;white-space:nowrap" ><span style="font-weight:bold"><span style="font-family:monospace">(FILE *infile, char *infname, </span></span></td></tr>
<tr><td style="text-align:left;white-space:nowrap" ><span style="font-weight:bold"><span style="font-family:monospace"> </span></span></td><td style="text-align:left;white-space:nowrap" ><span style="font-weight:bold"><span style="font-family:monospace"> int encoding, char *outfname, </span></span></td></tr>
<tr><td style="text-align:left;white-space:nowrap" ><span style="font-weight:bold"><span style="font-family:monospace"> </span></span></td><td style="text-align:left;white-space:nowrap" ><span style="font-weight:bold"><span style="font-family:monospace"> char *diskname, long linperfile) </span></span></td></tr>
</table></dt><dd class="dd-description"> <br>
Encodes the input data and writes the output into one or more output
files on the local disk. No headers are generated. If
<span style="font-family:monospace">diskname==NULL</span>, the names of the encoded files are generated
by concatenating the save path (see the <span style="font-family:monospace">UUOPT_SAVEPATH</span>
option) and the base name of <span style="font-family:monospace">outfname</span> or <span style="font-family:monospace">infname</span>
(if <span style="font-family:monospace">outfname==NULL</span>).<p>If <span style="font-family:monospace">diskname!=NULL</span> and does not contain directory information,
the target filename is the concatenation of the save path and
<span style="font-family:monospace">diskname</span>. If <span style="font-family:monospace">diskname</span> is an absolute path name, it
is used as is.</p><p>The extension is stripped from the resulting target filename. For
single-part output files, the extension set with the
<span style="font-family:monospace">UUOPT_ENCEXT</span> option is used. Otherwise, the three-digit
part number is used as the extension. If the destination file already
exists, the value of the <span style="font-family:monospace">UUOPT_OVERWRITE</span> option is checked; if
overwriting is not allowed, encoding fails with
<span style="font-family:monospace">UURET_EXISTS</span>.</p></dd><dt class="dt-description"><table style="border-spacing:6px;border-collapse:separate;" class="cellpading0"><tr><td style="text-align:left;white-space:nowrap" ><span style="font-weight:bold"><span style="font-family:monospace">int UUE_PrepSingle</span></span></td><td style="text-align:left;white-space:nowrap" ><span style="font-weight:bold"><span style="font-family:monospace">(FILE *outfile, FILE *infile, </span></span></td></tr>
<tr><td style="text-align:left;white-space:nowrap" ><span style="font-weight:bold"><span style="font-family:monospace"> </span></span></td><td style="text-align:left;white-space:nowrap" ><span style="font-weight:bold"><span style="font-family:monospace"> char *infname, int encoding, </span></span></td></tr>
<tr><td style="text-align:left;white-space:nowrap" ><span style="font-weight:bold"><span style="font-family:monospace"> </span></span></td><td style="text-align:left;white-space:nowrap" ><span style="font-weight:bold"><span style="font-family:monospace"> char *outfname, int filemode, </span></span></td></tr>
<tr><td style="text-align:left;white-space:nowrap" ><span style="font-weight:bold"><span style="font-family:monospace"> </span></span></td><td style="text-align:left;white-space:nowrap" ><span style="font-weight:bold"><span style="font-family:monospace"> char *destination, char *from, </span></span></td></tr>
<tr><td style="text-align:left;white-space:nowrap" ><span style="font-weight:bold"><span style="font-family:monospace"> </span></span></td><td style="text-align:left;white-space:nowrap" ><span style="font-weight:bold"><span style="font-family:monospace"> char *subject, int isemail) </span></span></td></tr>
</table></dt><dd class="dd-description"> <br>
Produces a complete MIME-formatted message including all necessary
headers. The output from this function is usually fed directly into a
mail delivery agent which honors headers (like “sendmail” or
“inews”).<p>If <span style="font-family:monospace">from!=NULL</span>, it is sent as the sender’s email address
in the “From” header field. Some MDA programs are able to provide
the sender’s address themselves, so this value may be NULL in certain
cases.</p><p>If <span style="font-family:monospace">subject!=NULL</span>, the text is included in the “Subject”
header field. The subject is extended with information about the file
name and part number (in this case, always “(001/001)”).</p><p><span style="font-family:monospace">destination</span> must not be NULL. Depending on the <span style="font-family:monospace">isemail</span> flag,
its contents are sent either in the “To” or “Newsgroups” header
field.</p></dd><dt class="dt-description"><table style="border-spacing:6px;border-collapse:separate;" class="cellpading0"><tr><td style="text-align:left;white-space:nowrap" ><span style="font-weight:bold"><span style="font-family:monospace">int UUE_PrepPartial</span></span></td><td style="text-align:left;white-space:nowrap" ><span style="font-weight:bold"><span style="font-family:monospace">(FILE *outfile, FILE *infile, </span></span></td></tr>
<tr><td style="text-align:left;white-space:nowrap" ><span style="font-weight:bold"><span style="font-family:monospace"> </span></span></td><td style="text-align:left;white-space:nowrap" ><span style="font-weight:bold"><span style="font-family:monospace"> char *infname, int encoding, </span></span></td></tr>
<tr><td style="text-align:left;white-space:nowrap" ><span style="font-weight:bold"><span style="font-family:monospace"> </span></span></td><td style="text-align:left;white-space:nowrap" ><span style="font-weight:bold"><span style="font-family:monospace"> char *outfname, int filemode, </span></span></td></tr>
<tr><td style="text-align:left;white-space:nowrap" ><span style="font-weight:bold"><span style="font-family:monospace"> </span></span></td><td style="text-align:left;white-space:nowrap" ><span style="font-weight:bold"><span style="font-family:monospace"> int partno, long linperfile, </span></span></td></tr>
<tr><td style="text-align:left;white-space:nowrap" ><span style="font-weight:bold"><span style="font-family:monospace"> </span></span></td><td style="text-align:left;white-space:nowrap" ><span style="font-weight:bold"><span style="font-family:monospace"> long filesize, </span></span></td></tr>
<tr><td style="text-align:left;white-space:nowrap" ><span style="font-weight:bold"><span style="font-family:monospace"> </span></span></td><td style="text-align:left;white-space:nowrap" ><span style="font-weight:bold"><span style="font-family:monospace"> char *destination, char *from, </span></span></td></tr>
<tr><td style="text-align:left;white-space:nowrap" ><span style="font-weight:bold"><span style="font-family:monospace"> </span></span></td><td style="text-align:left;white-space:nowrap" ><span style="font-weight:bold"><span style="font-family:monospace"> char *subject, int isemail) </span></span></td></tr>
</table></dt><dd class="dd-description"> <br>
Similar to <span style="font-family:monospace">UUE_PrepSingle</span>, but produces a complete
MIME-formatted “message/partial” message including all necessary
headers. The function must be called repeatedly until it runs
out of input data. For more explanations, see the description of the
function <span style="font-family:monospace">UUEncodePartial</span> above.<p>The only additional parameter is <span style="font-family:monospace">filesize</span>. This
value can usually be 0, as the size of the input file can normally be
determined by performing a <span style="font-family:monospace">stat()</span> call. However, this might
not be possible if <span style="font-family:monospace">infile</span> refers to a pipe. In that case, the
value of <span style="font-family:monospace">filesize</span> is used.</p><p>If the size of the input data cannot be determined, and
<span style="font-family:monospace">filesize</span> is 0, the function refuses encoding into multiple
files and produces only a single stream of output.</p><p>If data is read from a file instead of from a stream
(<span style="font-family:monospace">infile==NULL</span>), the function opens the file internally and
returns <span style="font-family:monospace">UURET_CONT</span> instead of <span style="font-family:monospace">UURET_OK</span> on
successful completion for all but the last part.
</p></dd></dl>
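<p>The line counts given above (45 input bytes per uuencoded or xxencoded line, 57 per Base64 line) allow a rough estimate of how many parts a split encoding will produce. The following is a minimal sketch of that arithmetic and is not code from the library. The names <span style="font-family:monospace">lines_needed</span> and <span style="font-family:monospace">parts_needed</span> are hypothetical, and the sketch assumes that every part carries exactly <span style="font-family:monospace">linperfile</span> data lines, with header and trailer lines not counted.</p>

```c
#include <assert.h>

/* Bytes of input carried by one encoded line, as stated in the text:
 * 45 for uuencoding and xxencoding, 57 for Base64. */
enum { UU_BYTES_PER_LINE = 45, B64_BYTES_PER_LINE = 57 };

/* Number of encoded data lines needed for `filesize` bytes (rounded up).
 * Hypothetical helper, not part of the library. */
long lines_needed(long filesize, int bytes_per_line)
{
    assert(filesize >= 0 && bytes_per_line > 0);
    return (filesize + bytes_per_line - 1) / bytes_per_line;
}

/* Estimated number of parts when each part holds `linperfile` lines.
 * linperfile == 0 is treated as "no splitting", i.e. a single part. */
long parts_needed(long filesize, int bytes_per_line, long linperfile)
{
    long lines = lines_needed(filesize, bytes_per_line);

    if (linperfile <= 0 || lines == 0)
        return 1;
    return (lines + linperfile - 1) / linperfile;
}
```

<p>Under these assumptions, a 100000-byte file uuencoded with <span style="font-family:monospace">linperfile</span> set to 1000 needs 2223 data lines and therefore three parts; the same file in Base64 needs 1755 lines and two parts.</p>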
<!--TOC section id="sec26" The Trivial Decoder-->
<h2 id="sec26" class="section">10  The Trivial Decoder</h2><!--SEC END --><p>In this section, we implement and discuss the “Trivial Decoder”,
which illustrates the use of the decoding functions. We start with the
absolute minimum and then add more features and actually end up with a
limited, but useful tool. For a full-scale frontend, look at the
implementation of the “UUDeview” program. The sample code can be
found among the documentation files as <span style="font-family:monospace">td-v1.c</span>,
<span style="font-family:monospace">td-v2.c</span> and <span style="font-family:monospace">td-v3.c</span>.</p>
<!--TOC subsection id="sec27" Version 1-->
<h3 id="sec27" class="subsection">10.1  Version 1</h3><!--SEC END --><blockquote class="figure"><div class="center"><div class="center"><hr style="width:80%;height:2"></div>
<span style="font-size:small">
</span><hr style="height:2"><span style="font-size:small">
</span><pre class="verbatim"><span style="font-size:small">#include &lt;stdio.h&gt;
#include &lt;stdlib.h&gt;
#include &lt;config.h&gt;
#include &lt;uudeview.h&gt;

int main (int argc, char *argv[])
{
  UUInitialize ();
  UULoadFile (argv[1], NULL, 0);
  UUDecodeFile (UUGetFileListItem (0), NULL);
  UUCleanUp ();
  return 0;
}
</span></pre><hr style="height:2"><span style="font-size:small">
</span>
<div class="caption"><table style="border-spacing:6px;border-collapse:separate;" class="cellpading0"><tr><td style="vertical-align:top;text-align:left;" >Figure 2: The “Trivial Decoder”, Version 1</td></tr>
</table></div>
<a id="td-v1"></a>
<div class="center"><hr style="width:80%;height:2"></div></div></blockquote><p>The minimal decoding program is displayed in Figure <a href="#td-v1">2</a>. Only
four code lines are needed for the implementation. <span style="font-family:monospace">&lt;stdlib.h&gt;</span>
defines <span style="font-family:monospace">NULL</span>, <span style="font-family:monospace">&lt;uudeview.h&gt;</span> declares the decoding
library functions, and <span style="font-family:monospace">&lt;config.h&gt;</span>, the library’s
configuration file, is needed for some configuration
details<sup><a id="text9" href="#note9">9</a></sup>.</p><p>After initialization, the file given as first command line parameter
is scanned. No symbolic name is assigned to the file, so that we don’t
need a file callback. After the scanning, the encoded file is decoded
and stored in the current directory by its native name.</p><p>Of course, there is much to complain about:
</p><ul class="itemize"><li class="li-itemize">
No error checking is done. For example, does the input file exist?
</li><li class="li-itemize">Only a single file can be scanned for encoded data.
</li><li class="li-itemize">If more than one encoded file is found, only the first one is
decoded, the others are ignored.
</li><li class="li-itemize">No checking is done if there actually <em>is</em> encoded data in
the file and whether this data is valid.
</li></ul>
<!--TOC subsection id="sec28" Version 2-->
<h3 id="sec28" class="subsection">10.2  Version 2</h3><!--SEC END --><blockquote class="figure"><div class="center"><div class="center"><hr style="width:80%;height:2"></div>
<span style="font-size:small">
</span><hr style="height:2"><span style="font-size:small">
</span><pre class="verbatim"><span style="font-size:small">#include &lt;stdio.h&gt;
#include &lt;string.h&gt;
#include &lt;errno.h&gt;
#include &lt;stdlib.h&gt;
#include &lt;config.h&gt;
#include &lt;uudeview.h&gt;

int main (int argc, char *argv[])
{
  uulist *item;
  int i, res;

  UUInitialize ();

  /* scan every file named on the command line */
  for (i=1; i&lt;argc; i++)
    if ((res = UULoadFile (argv[i], NULL, 0)) != UURET_OK)
      fprintf (stderr, "could not load %s: %s\n",
               argv[i], (res==UURET_IOERR) ?
               strerror (UUGetOption (UUOPT_ERRNO, NULL, NULL, 0)) :
               UUstrerror(res));

  /* decode every entry of the file list that is marked UUFILE_OK */
  for (i=0; (item=UUGetFileListItem(i)) != NULL; i++) {
    if ((item-&gt;state &amp; UUFILE_OK) == 0)
      continue;
    if ((res = UUDecodeFile (item, NULL)) != UURET_OK) {
      fprintf (stderr, "error decoding %s: %s\n",
               (item-&gt;filename==NULL)?"oops":item-&gt;filename,
               (res==UURET_IOERR) ?
               strerror (UUGetOption (UUOPT_ERRNO, NULL, NULL, 0)) :
               UUstrerror(res));
    }
    else {
      printf ("successfully decoded '%s'\n", item-&gt;filename);
    }
  }
  UUCleanUp ();
  return 0;
}
</span></pre><hr style="height:2"><span style="font-size:small">
</span>
<div class="caption"><table style="border-spacing:6px;border-collapse:separate;" class="cellpading0"><tr><td style="vertical-align:top;text-align:left;" >Figure 3: The “Trivial Decoder”, Version 2</td></tr>
</table></div>
<a id="td-v2"></a>
<div class="center"><hr style="width:80%;height:2"></div></div></blockquote><p>The second version, shown in Figure <a href="#td-v2">3</a>, addresses all of
the above problems. The code size more than tripled, but that’s
largely because of the error messages.</p><p>All files given on the command
line are scanned<sup><a id="text10" href="#note10">10</a></sup>, and all encoded files are decoded. Of course, it is now
also possible for an encoded file to span its parts over more than one
input file. Appropriate error messages are printed upon failure of any
step, and a success message is printed for successfully decoded files.</p><p>Apart from the lack of user interaction (selective decoding of
files, choice of a target directory, etc.), there are only three more
items to complain about:
</p><ul class="itemize"><li class="li-itemize">
Errors and other messages produced within the library aren’t
displayed because there’s no message callback.
</li><li class="li-itemize">No filename filter is installed, so decoding of files with
invalid filenames will fail; this especially includes filenames
with directory information.
</li><li class="li-itemize">No information is printed for invalid encoded files, or files
with missing parts (they’re simply skipped).
</li></ul>
<!--TOC subsection id="sec29" Version 3-->
<h3 id="sec29" class="subsection">10.3  Version 3</h3><!--SEC END --><blockquote class="figure"><div class="center"><div class="center"><hr style="width:80%;height:2"></div>
<span style="font-size:small">
</span><hr style="height:2"><span style="font-size:small">
</span><span style="font-size:small"><em>… right after the #includes</em></span><span style="font-size:small"> <br>
</span><pre class="verbatim"><span style="font-size:small">#include &lt;fptools.h&gt;

void MsgCallBack (void *opaque, char *msg, int level)
{
  fprintf (stderr, "%s\n", msg);
}

char * FNameFilter (void *opaque, char *fname)
{
  static char dname[13];    /* 8 + '.' + 3 + '\0' */
  char *p1, *p2;
  int i;

  /* step past the last Unix-style, then DOS-style separator */
  if ((p1 = _FP_strrchr (fname, '/')) == NULL)
    p1 = fname;
  else
    p1++;
  if ((p2 = _FP_strrchr (p1, '\\')) == NULL)
    p2 = p1;
  else
    p2++;

  /* first 8 characters of the base name */
  for (i=0, p1=dname; *p2 &amp;&amp; *p2!='.' &amp;&amp; i&lt;8; i++)
    *p1++ = (*p2==' ')?(p2++,'_'):*p2++;
  while (*p2 &amp;&amp; *p2 != '.') p2++;

  /* first 3 characters of the extension */
  if ((*p1++ = *p2++) == '.')
    for (i=0; *p2 &amp;&amp; *p2!='.' &amp;&amp; i&lt;3; i++)
      *p1++ = (*p2==' ')?(p2++,'_'):*p2++;
  *p1 = '\0';
  return dname;
}
</span></pre><span style="font-size:small"><em>… within </em></span><span style="font-size:small"><em><span style="font-family:monospace">main()</span></em></span><span style="font-size:small"><em> after </em></span><span style="font-size:small"><em><span style="font-family:monospace">UUInitialize</span></em></span><span style="font-size:small"> <br>
</span><pre class="verbatim"><span style="font-size:small">  UUSetMsgCallback (NULL, MsgCallBack);
  UUSetFNameFilter (NULL, FNameFilter);
</span></pre><span style="font-size:small"><em>… replacing the main loop’s </em></span><span style="font-size:small">else</span><span style="font-size:small"> <br>
</span><pre class="verbatim"><span style="font-size:small">    else {
      printf ("successfully decoded '%s' as '%s'\n",
              item-&gt;filename,
              UUFNameFilter (item-&gt;filename));
    }
</span></pre><hr style="height:2"><span style="font-size:small">
</span>
<div class="caption"><table style="border-spacing:6px;border-collapse:separate;" class="cellpading0"><tr><td style="vertical-align:top;text-align:left;" >Figure 4: Changes for Version 3</td></tr>
</table></div>
<a id="td-v3-diff"></a>
<div class="center"><hr style="width:80%;height:2"></div></div></blockquote><p>This last version adds a simple filename filter (targeting a DOS
system with 8.3 filenames) and a simple
message callback, which just dumps messages to the console. Figure
<a href="#td-v3-diff">4</a> lists the changes with respect to version 2 (for the
full listing, refer to the source file on disk).</p><p>The message callback, a one-liner, couldn’t be simpler. The filename
filter will probably not win an award for good programming style, but
it does its job of stripping Unix-style or DOS-style directory names
and using only the first 8 characters of the base filename and the
first three characters of the extension. If the filename contains
space characters, they’re replaced by underscores. Note that
<span style="font-family:monospace">dname</span>, the storage for the resulting filename, is declared
static, as it must be accessible after the filter function has
returned.</p><p>For portability, the filename filter uses a replacement function from
the <span style="font-family:monospace">fptools</span> library instead of relying on a native implementation
of the <span style="font-family:monospace">strrchr</span> function.</p><p>Both callbacks are installed right after initializing the
library. Since the filename of the decoded file may now
differ from the filename in the file list structure, we recreate
the resulting filename by calling the filename filter ourselves for
display, so that the user knows where to look for the file.</p>
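<p>To experiment with the 8.3 truncation rule outside the library, the same logic can be written against the standard C library alone. The following is a minimal sketch and not the code from <span style="font-family:monospace">td-v3.c</span>: the name <span style="font-family:monospace">dos83_name</span> is hypothetical, it uses the native <span style="font-family:monospace">strrchr</span>, and it writes into a caller-supplied buffer of at least 13 bytes instead of a static one, so its signature differs from that of a filename filter callback.</p>

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-alone variant of the filter: writes an 8.3 name
 * derived from `fname` into `out`, which must hold at least 13 bytes
 * (8 + '.' + 3 + terminating '\0'). Returns `out`. */
char *dos83_name(const char *fname, char *out)
{
    const char *base, *p;
    char *q = out;
    int i;

    assert(fname != NULL && out != NULL);

    /* Step past the last Unix-style, then the last DOS-style separator. */
    base = fname;
    if ((p = strrchr(base, '/')) != NULL)
        base = p + 1;
    if ((p = strrchr(base, '\\')) != NULL)
        base = p + 1;

    /* Up to 8 characters of the base name; spaces become underscores. */
    for (i = 0, p = base; *p && *p != '.' && i < 8; i++, p++)
        *q++ = (*p == ' ') ? '_' : *p;

    /* Drop whatever is left of the base name. */
    while (*p && *p != '.')
        p++;

    /* Up to 3 characters of the first extension, if there is one. */
    if (*p == '.') {
        *q++ = *p++;
        for (i = 0; *p && *p != '.' && i < 3; i++, p++)
            *q++ = (*p == ' ') ? '_' : *p;
    }
    *q = '\0';
    return out;
}
```

<p>With this sketch, <span style="font-family:monospace">/tmp/my long file name.jpeg</span> becomes <span style="font-family:monospace">my_long_.jpe</span>. A real filter callback still has to return storage that outlives the call, which is why the version in Figure 4 uses a static buffer.</p>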
<!--TOC section id="sec30" Replacement functions-->
<h2 id="sec30" class="section">11  Replacement functions</h2><!--SEC END --><p>
<a id="chap-rf"></a></p><p>This section is a short reference for the replacement functions from
the <span style="font-family:monospace">fptools</span> library. Some of them may be useful in the
application code as well. Most of these functions are pretty standard
on modern systems, but there are also a few from the author’s
imagination. Each of the functions is tagged with information on why this
replacement exists:
</p><ul class="itemize"><li class="li-itemize">
“nonstandard” (ns): this function is available on some systems, but
not on others. Functions with this tag could be safely replaced with a
native implementation.
</li><li class="li-itemize">“feature” (f): the replacement adds some functionality with
respect to the “original”.
</li><li class="li-itemize">“author” (a): just a function the author considered useful.
</li></ul><dl class="description"><dt class="dt-description">
<span style="font-weight:bold"><span style="font-family:monospace">void _FP_free (void *)</span></span></dt><dd class="dd-description"> <span style="font-size:small">(f)</span> <br>
ANSI C guarantees that <span style="font-family:monospace">free()</span> can be safely called with a
<span style="font-family:monospace">NULL</span> argument, but some old systems dump core. This
replacement just ignores a <span style="font-family:monospace">NULL</span> pointer and passes anything
else to the original <span style="font-family:monospace">free()</span>.</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">char *_FP_strdup (char *ptr)</span></span></dt><dd class="dd-description"> <span style="font-size:small">(ns)</span> <br>
Allocates new storage for the string <span style="font-family:monospace">ptr</span> and copies the
string including the final nullbyte to the new location (thus
“duplicating” the string). Returns <span style="font-family:monospace">NULL</span> if the
<span style="font-family:monospace">malloc()</span> call fails.</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">char *_FP_strncpy (char *dest, char *src, int count)</span></span></dt><dd class="dd-description"> <span style="font-size:small">(f)</span> <br>
Copies text from the <span style="font-family:monospace">src</span> area to the <span style="font-family:monospace">dest</span> area,
until either a nullbyte has been copied or <span style="font-family:monospace">count</span> bytes have
been copied. Differs from the original in that if <span style="font-family:monospace">src</span> is
longer than <span style="font-family:monospace">count</span> bytes, then only <span style="font-family:monospace">count</span>-1 bytes are
copied, and the destination area is properly terminated with a
nullbyte.</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">void *_FP_memdup (void *ptr, int count)</span></span></dt><dd class="dd-description"> <span style="font-size:small">(a)</span> <br>
Allocates a new area of <span style="font-family:monospace">count</span> bytes, which are then copied
from the <span style="font-family:monospace">ptr</span> area.</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">int _FP_stricmp (char *str1, char *str2)</span></span></dt><dd class="dd-description"> <span style="font-size:small">(ns)</span> <br>
Case-insensitive equivalent of <span style="font-family:monospace">strcmp</span>.</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">int _FP_strnicmp (char *str1, char *str2, int count)</span></span></dt><dd class="dd-description"> <span style="font-size:small">(ns)</span> <br>
Case-insensitive equivalent of <span style="font-family:monospace">strncmp</span>.</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">char *_FP_strrchr (char *string, int chr)</span></span></dt><dd class="dd-description"> <span style="font-size:small">(ns)</span> <br>
Similar to <span style="font-family:monospace">strchr</span>, but returns a pointer to the last
occurrence of the character <span style="font-family:monospace">chr</span> in <span style="font-family:monospace">string</span>.</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">char *_FP_strstr (char *str1, char *str2)</span></span></dt><dd class="dd-description"> <span style="font-size:small">(ns)</span> <br>
Returns a pointer to the first occurrence of <span style="font-family:monospace">str2</span> in
<span style="font-family:monospace">str1</span> or <span style="font-family:monospace">NULL</span> if the second string does not appear
within the first.</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">char *_FP_strrstr (char *str1, char *str2)</span></span></dt><dd class="dd-description"> <span style="font-size:small">(ns)</span> <br>
Similar to <span style="font-family:monospace">strstr</span>, but returns a pointer to the last
occurrence of <span style="font-family:monospace">str2</span> in <span style="font-family:monospace">str1</span>.</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">char *_FP_stristr (char *str1, char *str2)</span></span></dt><dd class="dd-description"> <span style="font-size:small">(a)</span> <br>
Case-insensitive equivalent of <span style="font-family:monospace">strstr</span>.</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">char *_FP_strirstr (char *str1, char *str2)</span></span></dt><dd class="dd-description"> <span style="font-size:small">(a)</span> <br>
Case-insensitive equivalent of <span style="font-family:monospace">strrstr</span>.</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">char *_FP_stoupper (char *string)</span></span></dt><dd class="dd-description"> <span style="font-size:small">(a)</span> <br>
Converts all alphabetic characters in <span style="font-family:monospace">string</span> to uppercase.</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">char *_FP_stolower (char *string)</span></span></dt><dd class="dd-description"> <span style="font-size:small">(a)</span> <br>
Converts all alphabetic characters in <span style="font-family:monospace">string</span> to lowercase.</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">int _FP_strmatch (char *str, char *pat)</span></span></dt><dd class="dd-description"> <span style="font-size:small">(a)</span> <br>
Performs glob-style pattern matching. <span style="font-family:monospace">pat</span> is a string
containing regular characters and the two wildcards ’?’
(question mark) and ’*’. The question mark matches any single
character, the ’*’ matches any zero or more characters. If
<span style="font-family:monospace">str</span> is matched by <span style="font-family:monospace">pat</span>, the function returns 1,
otherwise 0.</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">char *_FP_fgets (char *buf, int max, FILE *file)</span></span></dt><dd class="dd-description"> <span style="font-size:small">(f)</span> <br>
Extends the standard <span style="font-family:monospace">fgets()</span>; this replacement is able to
handle line terminators from various systems. DOS text files have
their lines terminated by CRLF, Unix files by LF only and Mac files by
CR only. This function reads a line and replaces whatever line
terminator is present with a single LF.</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">char *_FP_strpbrk (char *str, char *accept)</span></span></dt><dd class="dd-description"> <span style="font-size:small">(ns)</span> <br>
Locates the first occurrence in the string <span style="font-family:monospace">str</span> of any of
the characters in <span style="font-family:monospace">accept</span>.</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">char *_FP_strtok (char *str, char *del)</span></span></dt><dd class="dd-description"> <span style="font-size:small">(ns)</span> <br>
Considers the string <span style="font-family:monospace">str</span> to be a sequence of tokens separated
by one or more of the delimiter characters given in <span style="font-family:monospace">del</span>. Upon
first call with <span style="font-family:monospace">str!=NULL</span>, returns the first token. Later
calls with <span style="font-family:monospace">str==NULL</span> return the following tokens. Returns
<span style="font-family:monospace">NULL</span> if no more tokens are found.</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">char *_FP_cutdir (char *str)</span></span></dt><dd class="dd-description"> <span style="font-size:small">(a)</span> <br>
Returns the filename part of <span style="font-family:monospace">str</span>, meaning everything after
the last slash or backslash in the string. Now replaced with the
concept of the filename filter.</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">char *_FP_strerror (int errcode)</span></span></dt><dd class="dd-description"> <span style="font-size:small">(ns)</span> <br>
A rather dumb replacement of the original one, which transforms error
codes from <span style="font-family:monospace">errno</span> into a human-readable error message. This
function should <em>only</em> be used if no native implementation
exists; it just returns a string with the numerical error number.</dd><dt class="dt-description"><span style="font-weight:bold"><span style="font-family:monospace">char *_FP_tempnam (char *dir, char *pfx)</span></span></dt><dd class="dd-description"> <span style="font-size:small">(ns)</span> <br>
The original is supposed to return a unique filename. The temporary
file should be stored in <span style="font-family:monospace">dir</span> and have a prefix of
<span style="font-family:monospace">pfx</span>. This replacement, too, should only be used if no native
implementation exists. It just returns a temporary filename created by
the standard <span style="font-family:monospace">tmpnam()</span>, which does not necessarily reside in a
proper <span style="font-family:monospace">TEMP</span> directory. The value returned by this function is
an allocated memory area which must later be freed by calling
<span style="font-family:monospace">free</span>.
</dd></dl>
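<p>Two of the behaviours described above are easy to get subtly wrong, so a small illustration may help: the always-terminating bounded copy described for <span style="font-family:monospace">_FP_strncpy</span> and the glob-style matching described for <span style="font-family:monospace">_FP_strmatch</span>. The following is a minimal sketch written from those descriptions only. It is not the <span style="font-family:monospace">fptools</span> source, and the names <span style="font-family:monospace">copy_bounded</span> and <span style="font-family:monospace">glob_match</span> are hypothetical.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Bounded copy that always terminates `dest`: at most count-1 bytes of
 * `src` are copied, followed by a null byte. Assumes count > 0. */
char *copy_bounded(char *dest, const char *src, int count)
{
    char *d = dest;

    assert(dest != NULL && src != NULL && count > 0);
    while (--count > 0 && *src)
        *d++ = *src++;
    *d = '\0';
    return dest;
}

/* Glob-style match: '?' matches exactly one character, '*' matches
 * zero or more characters. Returns 1 on a match, 0 otherwise. */
int glob_match(const char *str, const char *pat)
{
    assert(str != NULL && pat != NULL);
    while (*pat) {
        if (*pat == '*') {
            while (*pat == '*')          /* collapse runs of '*' */
                pat++;
            if (*pat == '\0')            /* trailing '*' matches the rest */
                return 1;
            for (; *str; str++)          /* try every possible suffix */
                if (glob_match(str, pat))
                    return 1;
            return 0;
        }
        if (*str == '\0')
            return 0;
        if (*pat != '?' && *pat != *str)
            return 0;
        pat++;
        str++;
    }
    return *str == '\0';
}
```

<p>The recursive matcher favours brevity over speed; patterns with many <span style="font-family:monospace">*</span> wildcards can take long on long strings, which is usually acceptable for filenames.</p>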
<!--TOC section id="sec31" Known Problems-->
<h2 id="sec31" class="section">12  Known Problems</h2><!--SEC END --><p>This section mentions a few known problems with the library, which the
author considers to be “features” rather than “bugs”, meaning that
they probably won’t be “fixed” in the near future.</p><ul class="itemize"><li class="li-itemize">
Encoding to <em>BinHex</em> is not yet supported.
</li><li class="li-itemize">The checksums found in <em>BinHex</em> files are ignored.
</li><li class="li-itemize">If both data and resource forks in a <em>BinHex</em> file are
non-empty, the larger one is decoded. Non-Mac systems can only use one
of them anyway (usually the “data” fork, the “resource” fork
usually contains M68k or PPC machine code).
</li></ul><!--TOC section id="sec32" References-->
<h2 id="sec32" class="section">References</h2><!--SEC END --><dl class="thebibliography"><dt class="dt-thebibliography">
<a id="rfc0822">[RFC0822]</a></dt><dd class="dd-thebibliography"> Crocker, D., “Standard for the Format of
ARPA Internet Text Messages”, RFC 822, Network Working Group, August
1982.
</dd><dt class="dt-thebibliography"><a id="rfc1521">[RFC1521]</a></dt><dd class="dd-thebibliography"> Borenstein, N., “MIME (Multipurpose
Internet Mail Extensions) Part One”, RFC 1521, Network Working Group,
September 1993.
</dd><dt class="dt-thebibliography"><a id="rfc1741">[RFC1741]</a></dt><dd class="dd-thebibliography"> Fältström, P., Crocker, D. and Fair, E.,
“MIME Content Type for BinHex Encoded Files”, RFC 1741, Network
Working Group, December 1994.
</dd><dt class="dt-thebibliography"><a id="rfc1806">[RFC1806]</a></dt><dd class="dd-thebibliography"> Troost, R., Dorner, S., “The
Content-Disposition Header”, RFC 1806, Network Working Group, June
1995.
</dd></dl><p>RFC documents (“Request for Comments”) can be downloaded from many
ftp sites around the world.</p>
<!--TOC section id="sec33" Encoding Formats-->
<h2 id="sec33" class="section">A  Encoding Formats</h2><!--SEC END --><p>The following sections describe the four most widely used formats
for encoding binary data into plain text, <em>uuencoding</em>,
<em>xxencoding</em>, <em>Base64</em> and <em>BinHex</em>. Another section
briefly mentions <em>Quoted-Printable</em> encoding.</p><p>Other formats exist, like <em>btoa</em> and <em>ship</em>, but they are
not described here. <em>btoa</em> is much less efficient than the
others. <em>ship</em> is slightly more efficient and will probably be
supported in the future.</p><p>Uuencoding, xxencoding and Base 64 work in essentially the same way. They are
all “three in four” encodings, which means that they take three
octets<sup><a id="text11" href="#note11">11</a></sup> from the input file and encode them into four
characters.</p><blockquote class="table"><div class="center"><div class="center"><hr style="width:80%;height:2"></div>
<table border=1 style="border-spacing:0;" class="cellpadding1"><tr><td style="text-align:right;border:solid 1px;white-space:nowrap" >Input Octet</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >1</td><td style="text-align:center;border:solid 1px;white-space:nowrap" > </td><td style="text-align:center;border:solid 1px;white-space:nowrap" > </td><td style="text-align:center;border:solid 1px;white-space:nowrap" > </td><td style="text-align:center;border:solid 1px;white-space:nowrap" > </td><td style="text-align:center;border:solid 1px;white-space:nowrap" > </td><td style="text-align:center;border:solid 1px;white-space:nowrap" > </td><td style="text-align:center;border:solid 1px;white-space:nowrap" > </td></tr>
<tr><td style="text-align:right;border:solid 1px;white-space:nowrap" >Input Bit</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >7</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >6</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >5</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >4</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >3</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >2</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >1</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >0 </td></tr>
<tr><td style="text-align:right;border:solid 1px;white-space:nowrap" >Output Data #1</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >5</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >4</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >3</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >2</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >1</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >0</td><td style="text-align:center;border:solid 1px;white-space:nowrap" > </td><td style="text-align:center;border:solid 1px;white-space:nowrap" > </td></tr>
<tr><td style="text-align:right;border:solid 1px;white-space:nowrap" >Output Data #2</td><td style="text-align:center;border:solid 1px;white-space:nowrap" > </td><td style="text-align:center;border:solid 1px;white-space:nowrap" > </td><td style="text-align:center;border:solid 1px;white-space:nowrap" > </td><td style="text-align:center;border:solid 1px;white-space:nowrap" > </td><td style="text-align:center;border:solid 1px;white-space:nowrap" > </td><td style="text-align:center;border:solid 1px;white-space:nowrap" > </td><td style="text-align:center;border:solid 1px;white-space:nowrap" >5</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >4 </td></tr>
<tr><td style="text-align:right;border:solid 1px;white-space:nowrap" > </td></tr>
<tr><td style="text-align:right;border:solid 1px;white-space:nowrap" >Input Octet</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >2</td><td style="text-align:center;border:solid 1px;white-space:nowrap" > </td><td style="text-align:center;border:solid 1px;white-space:nowrap" > </td><td style="text-align:center;border:solid 1px;white-space:nowrap" > </td><td style="text-align:center;border:solid 1px;white-space:nowrap" > </td><td style="text-align:center;border:solid 1px;white-space:nowrap" > </td><td style="text-align:center;border:solid 1px;white-space:nowrap" > </td><td style="text-align:center;border:solid 1px;white-space:nowrap" > </td></tr>
<tr><td style="text-align:right;border:solid 1px;white-space:nowrap" >Input Bit</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >7</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >6</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >5</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >4</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >3</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >2</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >1</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >0 </td></tr>
<tr><td style="text-align:right;border:solid 1px;white-space:nowrap" >Output Data #2</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >3</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >2</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >1</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >0</td><td style="text-align:center;border:solid 1px;white-space:nowrap" > </td><td style="text-align:center;border:solid 1px;white-space:nowrap" > </td><td style="text-align:center;border:solid 1px;white-space:nowrap" > </td><td style="text-align:center;border:solid 1px;white-space:nowrap" > </td></tr>
<tr><td style="text-align:right;border:solid 1px;white-space:nowrap" >Output Data #3</td><td style="text-align:center;border:solid 1px;white-space:nowrap" > </td><td style="text-align:center;border:solid 1px;white-space:nowrap" > </td><td style="text-align:center;border:solid 1px;white-space:nowrap" > </td><td style="text-align:center;border:solid 1px;white-space:nowrap" > </td><td style="text-align:center;border:solid 1px;white-space:nowrap" >5</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >4</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >3</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >2 </td></tr>
<tr><td style="text-align:right;border:solid 1px;white-space:nowrap" > </td></tr>
<tr><td style="text-align:right;border:solid 1px;white-space:nowrap" >Input Octet</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >3</td><td style="text-align:center;border:solid 1px;white-space:nowrap" > </td><td style="text-align:center;border:solid 1px;white-space:nowrap" > </td><td style="text-align:center;border:solid 1px;white-space:nowrap" > </td><td style="text-align:center;border:solid 1px;white-space:nowrap" > </td><td style="text-align:center;border:solid 1px;white-space:nowrap" > </td><td style="text-align:center;border:solid 1px;white-space:nowrap" > </td><td style="text-align:center;border:solid 1px;white-space:nowrap" > </td></tr>
<tr><td style="text-align:right;border:solid 1px;white-space:nowrap" >Input Bit</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >7</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >6</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >5</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >4</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >3</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >2</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >1</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >0 </td></tr>
<tr><td style="text-align:right;border:solid 1px;white-space:nowrap" >Output Data #3</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >1</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >0</td><td style="text-align:center;border:solid 1px;white-space:nowrap" > </td><td style="text-align:center;border:solid 1px;white-space:nowrap" > </td><td style="text-align:center;border:solid 1px;white-space:nowrap" > </td><td style="text-align:center;border:solid 1px;white-space:nowrap" > </td><td style="text-align:center;border:solid 1px;white-space:nowrap" > </td><td style="text-align:center;border:solid 1px;white-space:nowrap" > </td></tr>
<tr><td style="text-align:right;border:solid 1px;white-space:nowrap" >Output Data #4</td><td style="text-align:center;border:solid 1px;white-space:nowrap" > </td><td style="text-align:center;border:solid 1px;white-space:nowrap" > </td><td style="text-align:center;border:solid 1px;white-space:nowrap" >5</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >4</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >3</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >2</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >1</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >0 </td></tr>
</table>
<div class="caption"><table style="border-spacing:6px;border-collapse:separate;" class="cellpading0"><tr><td style="vertical-align:top;text-align:left;" >Table 1: Bit mapping for Three-in-Four encoding</td></tr>
</table></div>
<a id="tab-3-in-4"></a>
<div class="center"><hr style="width:80%;height:2"></div></div></blockquote><p>Three bytes are 24 bits, and they are divided into 4 sections of 6
bits each. Table <a href="#tab-3-in-4">1</a> describes in detail how the input
bits are copied into the output data bits. 6 bits can have values from
0 to 63; each of the “three in four” encodings now uses a character
table with 64 entries, where each possible value is mapped to a
specific character.</p><p>The advantage of three in four encodings is their simplicity, as
encoding and decoding can be done by mere bit shifting and two simple
tables (one for encoding, mapping values to characters, and one for
decoding, with the reverse mapping). The disadvantage is that the
encoded data is 33% larger than the input (not counting line breaks
and other information added to the encoded data).</p><p>The aforementioned <em>ship</em> encoding is more efficient; it is a
so-called <em>Base 85</em> encoding. Base 85 encodings take four input
bytes (32 bits) and encode them into five characters. Each of these
characters encodes a value from 0 to 84; five characters can therefore
encode a value from 0 to 85<sup>5</sup> - 1 = 4437053124, covering the complete 32
bit range. Base 85 encodings need more “complicated” math and a
larger character table, but result in encoded files that are only 25% bigger.</p><p>In order to illustrate the encodings and present some actual data, we
will present the following text encoded in each of the formats:</p><blockquote class="quote">
<span style="font-size:small">
</span><pre class="verbatim"><span style="font-size:small">This is a test file for illustrating the various
encoding methods. Let's make this text longer than
57 bytes to wrap lines with Base64 data, too.
Greetings, Frank Pilhofer
</span></pre>
</blockquote>
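<p>As a rough illustration of Table 1, the following Python sketch splits three input octets into four 6-bit values and, for comparison, expresses four octets as five base-85 digits. The function names <span style="font-family:monospace">split_triple</span> and <span style="font-family:monospace">base85_digits</span> are hypothetical and not part of the library; the digit order used for the base-85 case is an assumption and does not describe the actual <em>ship</em> format.</p>

```python
def split_triple(o1: int, o2: int, o3: int) -> tuple:
    """Split three octets into four 6-bit values, following Table 1."""
    d1 = o1 >> 2                          # bits 7..2 of octet 1
    d2 = ((o1 & 0x03) << 4) | (o2 >> 4)   # bits 1..0 of octet 1, bits 7..4 of octet 2
    d3 = ((o2 & 0x0F) << 2) | (o3 >> 6)   # bits 3..0 of octet 2, bits 7..6 of octet 3
    d4 = o3 & 0x3F                        # bits 5..0 of octet 3
    return d1, d2, d3, d4


def base85_digits(four_octets: bytes) -> list:
    """Express a 32-bit big-endian group as five digits in the range 0..84."""
    if len(four_octets) != 4:
        raise ValueError("expected exactly four octets")
    value = int.from_bytes(four_octets, "big")
    digits = []
    for _ in range(5):
        value, digit = divmod(value, 85)
        digits.append(digit)
    return digits[::-1]  # most significant digit first (an arbitrary choice)
```

<p>For the first three octets of the sample text, “Thi”, <span style="font-family:monospace">split_triple(*b"Thi")</span> yields the values 21, 6, 33 and 41.</p>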
<!--TOC subsection id="sec34" Uuencoding-->
<h3 id="sec34" class="subsection">A.1  Uuencoding</h3><!--SEC END --><p>A document actually describing uuencoding as a standard does not seem
to exist. This is probably the reason why there are so many broken
encoders and decoders around that each take their liberties with the
definition.</p><p>The following text describes the fairly strict rules for uuencoding
that are used in the UUEnview encoding engine. The UUDeview decoding
engine is much more relaxed, according to the general rule that you
should be strict in all that you generate, but liberal in the data
that you receive.</p><p>Uuencoded data always starts with a <span style="font-family:monospace">begin</span> line and continues
until the <span style="font-family:monospace">end</span> line. Encoded data starts on the line following
the begin. Immediately before the <span style="font-family:monospace">end</span> line, there must be a
single <em>empty</em> line (see below).</p><blockquote class="quote">
<span style="font-size:small">
</span><span style="font-size:small"><span style="font-family:monospace">begin</span></span><span style="font-size:small"> </span><span style="font-size:small"><em>mode</em></span><span style="font-size:small"> </span><span style="font-size:small"><em>filename</em></span><span style="font-size:small"> <br>
… </span><span style="font-size:small"><em>encoded data</em></span><span style="font-size:small"> … <br>
</span><span style="font-size:small"><em>“empty” line</em></span><span style="font-size:small"> <br>
</span><span style="font-size:small"><span style="font-family:monospace">end</span></span><span style="font-size:small">
</span>
</blockquote>
<!--TOC subsubsection id="sec35" The <span style="font-family:monospace">begin</span> Line-->
<h4 id="sec35" class="subsubsection">A.1.1  The <span style="font-family:monospace">begin</span> Line</h4><!--SEC END --><p>The <span style="font-family:monospace">begin</span> line starts with the word <span style="font-family:monospace">begin</span> in the
first column. It is followed, all on the same line, by the
<em>mode</em> and the <em>filename</em>.</p><p><em>mode</em> is a three- or four-digit octal number, describing the
access permissions of the target file. This mode value is the same as
used with the Unix <span style="font-family:monospace">chmod</span> command and by the <span style="font-family:monospace">open</span>
system call. Each of the three digits is a bitwise OR of the values 4
(read permission), 2 (write permission) and 1 (execute
permission). The first digit gives the user’s permissions, the second
one the permissions for the group the user is in, and the third digit
describes everyone else’s permissions. On DOS or other systems with
only a limited concept of file permissions, only the first digit
should be evaluated. If the “2” bit is not set, the resulting file
should be read-only; the “1” bit should be set for COM and EXE
files. Common values are <span style="font-family:monospace">644</span> or <span style="font-family:monospace">755</span>.</p><p><em>filename</em> is the name of the file. The name <em>should</em> be
without any directory information.</p>
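<p>The rules for the <span style="font-family:monospace">begin</span> line can be sketched in Python as follows. This is only an illustration under stated assumptions, not the parser used by the library: the helper names <span style="font-family:monospace">parse_begin_line</span> and <span style="font-family:monospace">describe_digit</span> are hypothetical, and the sketch assumes that a single space separates the three fields.</p>

```python
def parse_begin_line(line: str):
    """Return (mode, filename) for a uuencode begin line, or None otherwise."""
    parts = line.rstrip("\r\n").split(" ", 2)
    if len(parts) != 3 or parts[0] != "begin":
        return None
    mode_text, filename = parts[1], parts[2]
    # The mode must be a three- or four-digit octal number.
    if len(mode_text) not in (3, 4) or any(c not in "01234567" for c in mode_text):
        return None
    if not filename:
        return None
    return int(mode_text, 8), filename


def describe_digit(digit: int) -> str:
    """Render one permission digit (an OR of 4, 2 and 1) as an 'rwx' string."""
    return "".join(flag if digit & bit else "-"
                   for bit, flag in ((4, "r"), (2, "w"), (1, "x")))
```

<p>Applied to the line <span style="font-family:monospace">begin 600 test.txt</span>, the sketch returns the octal mode 600 and the name <span style="font-family:monospace">test.txt</span>; the user digit 6 is rendered as <span style="font-family:monospace">rw-</span>.</p>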
<!--TOC subsubsection id="sec36" Encoded Data-->
<h4 id="sec36" class="subsubsection">A.1.2  Encoded Data</h4><!--SEC END --><p>The basic version of uuencoding simply uses the ASCII characters 32-95
for encoding the 64 values of a three in four encoding. An
exception<sup><a id="text12" href="#note12">12</a></sup> is the value 0, which would normally map into the space
character (ASCII 32). To prevent problems with mailers that strip
space characters at the beginning or end of the line, character 96
“ ` ” is used instead. The encoding table is shown in table
<a href="#tab-uu">2</a>.</p><blockquote class="table"><div class="center"><div class="center"><hr style="width:80%;height:2"></div>
<table border=1 style="border-spacing:0;" class="cellpadding1"><tr><td style="text-align:right;border:solid 1px;white-space:nowrap" >Data Value</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >+0</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >+1</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >+2</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >+3</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >+4</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >+5</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >+6</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >+7 </td></tr>
<tr><td style="text-align:right;border:solid 1px;white-space:nowrap" > 0</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >`</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >!</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >"</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >#</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >$</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >%</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >&</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >' </td></tr>
<tr><td style="text-align:right;border:solid 1px;white-space:nowrap" > 8</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >(</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >)</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >*</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >+</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >,</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >-</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >.</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >/ </td></tr>
<tr><td style="text-align:right;border:solid 1px;white-space:nowrap" > 16</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >0</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >1</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >2</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >3</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >4</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >5</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >6</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >7 </td></tr>
<tr><td style="text-align:right;border:solid 1px;white-space:nowrap" > 24</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >8</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >9</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >:</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >;</td><td style="text-align:center;border:solid 1px;white-space:nowrap" ><span style="font-family:monospace"><</span></td><td style="text-align:center;border:solid 1px;white-space:nowrap" >=</td><td style="text-align:center;border:solid 1px;white-space:nowrap" ><span style="font-family:monospace">></span></td><td style="text-align:center;border:solid 1px;white-space:nowrap" >?</td></tr>
<tr><td style="text-align:right;border:solid 1px;white-space:nowrap" > 32</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >@</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >A</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >B</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >C</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >D</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >E</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >F</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >G </td></tr>
<tr><td style="text-align:right;border:solid 1px;white-space:nowrap" > 40</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >H</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >I</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >J</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >K</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >L</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >M</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >N</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >O </td></tr>
<tr><td style="text-align:right;border:solid 1px;white-space:nowrap" > 48</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >P</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >Q</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >R</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >S</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >T</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >U</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >V</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >W </td></tr>
<tr><td style="text-align:right;border:solid 1px;white-space:nowrap" > 56</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >X</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >Y</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >Z</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >[</td><td style="text-align:center;border:solid 1px;white-space:nowrap" ><span style="font-family:monospace">\</span></td><td style="text-align:center;border:solid 1px;white-space:nowrap" >]</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >^</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >_ </td></tr>
</table>
<div class="caption"><table style="border-spacing:6px;border-collapse:separate;" class="cellpading0"><tr><td style="vertical-align:top;text-align:left;" >Table 2: Encoding Table for Uuencoding</td></tr>
</table></div>
<a id="tab-uu"></a>
<div class="center"><hr style="width:80%;height:2"></div></div></blockquote><p>Each line of uuencoded data is prefixed, in the first column, with the
encoded number of octets encoded on this line. The most common prefix
that you’ll see is ‘M’. By looking up ‘M’ in table <a href="#tab-uu">2</a>, we
see that it represents the number 45. Therefore, this prefix means
that the line contains 45 octets (which are encoded into 60 (45/3*4)
plain-text characters).</p><p>In uuencoding, each line has the same length; normally, the length
(excluding the end-of-line character) is 61. Only the last line of
encoded data may be shorter.</p><p>If the input data is not a multiple of three octets long, the last
triple is filled up with (one or two) nulls. The decoder can determine
the number of octets that are to go into the output file from the
prefix.</p>
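<p>The line format described above can be sketched in Python as follows. This is a minimal illustration, not the code of the UUEnview engine; the names <span style="font-family:monospace">uu_char</span> and <span style="font-family:monospace">uu_encode_line</span> are hypothetical, and the limit of 45 octets per line is the common convention mentioned above rather than a hard rule of the format.</p>

```python
def uu_char(value: int) -> str:
    """Map a 6-bit value to its character from Table 2 (0 becomes '`')."""
    return "`" if value == 0 else chr(32 + value)


def uu_encode_line(chunk: bytes) -> str:
    """Encode up to 45 octets as one uuencoded line, without the line ending."""
    if len(chunk) > 45:
        raise ValueError("this sketch puts at most 45 octets on a line")
    padded = chunk + b"\0" * (-len(chunk) % 3)   # fill the last triple with nulls
    out = [uu_char(len(chunk))]                  # length prefix in the first column
    for i in range(0, len(padded), 3):
        o1, o2, o3 = padded[i:i + 3]
        out.append(uu_char(o1 >> 2))
        out.append(uu_char(((o1 & 0x03) << 4) | (o2 >> 4)))
        out.append(uu_char(((o2 & 0x0F) << 2) | (o3 >> 6)))
        out.append(uu_char(o3 & 0x3F))
    return "".join(out)
```

<p>A full line of 45 octets produces the prefix “M” followed by 60 characters, 61 in total. A single octet is padded with two nulls, so the line still carries one complete group of four characters after its prefix.</p>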
<!--TOC subsubsection id="sec37" The Empty Line-->
<h4 id="sec37" class="subsubsection">A.1.3  The Empty Line</h4><!--SEC END --><p>After the last line of data, there must be an <em>empty</em> line, which
must be a valid encoded line containing no encoded data. This is
achieved by having a line with the single character “ ` ” on it
(which is the prefix that encodes the value of 0 octets).</p>
<!--TOC subsubsection id="sec38" The <span style="font-family:monospace">end</span> Line-->
<h4 id="sec38" class="subsubsection">A.1.4  The <span style="font-family:monospace">end</span> Line</h4><!--SEC END --><p>The encoded file is then ended with a line consisting of the word
<span style="font-family:monospace">end</span>.</p>
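<p>Putting the pieces together, the overall structure of a uuencoded file can be sketched in Python as shown below. The function name <span style="font-family:monospace">uuencode_text</span> and its parameters are hypothetical. The sketch relies on <span style="font-family:monospace">binascii.b2a_uu</span> from the Python standard library, whose <span style="font-family:monospace">backtick</span> option is assumed to be available (Python 3.7 or later), and it lets the caller choose the line ending instead of prescribing one.</p>

```python
import binascii


def uuencode_text(data: bytes, filename: str, mode: int = 0o644,
                  eol: str = "\n") -> str:
    """Wrap data in a begin line, data lines, the "empty" line and an end line."""
    lines = [f"begin {mode:03o} {filename}"]
    for i in range(0, len(data), 45):
        # b2a_uu encodes at most 45 octets and appends a newline of its own.
        encoded = binascii.b2a_uu(data[i:i + 45], backtick=True)
        lines.append(encoded.decode("ascii").rstrip("\n"))
    lines.append("`")    # the "empty" line: a prefix that encodes 0 octets
    lines.append("end")
    return eol.join(lines) + eol
```

<p>Calling <span style="font-family:monospace">uuencode_text(b"Cat", "cat.txt")</span> produces four lines: the <span style="font-family:monospace">begin</span> line, one data line, the “empty” line and the <span style="font-family:monospace">end</span> line.</p>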
<!--TOC subsubsection id="sec39" Splitting Files-->
<h4 id="sec39" class="subsubsection">A.1.5  Splitting Files</h4><!--SEC END --><p>Uuencoding does not describe a mechanism for splitting a file into two
or more messages for separate mailing or posting. Usually, the encoded
file is simply split into parts of more or less equal line
count<sup><a id="text13" href="#note13">13</a></sup>. Before the age of smart
decoders, the recipient had to manually concatenate the parts and
remove the headers in between, because the headers of mail messages
<em>might</em> just be valid uuencoded data lines, thus potentially
corrupting the data.</p>
<!--TOC subsubsection id="sec40" Variants of Uuencoding-->
<h4 id="sec40" class="subsubsection">A.1.6  Variants of Uuencoding</h4><!--SEC END --><p>There are many variations of the above rules which must be
taken into account in a decoder program. Here are the most
frequent:</p><ul class="itemize"><li class="li-itemize">
Many old encoders do not pay attention to the special rule of
encoding the 0 value, and encode it into a space character instead of
the “ ` ” character. This is not an “error,” but rather a
potential problem when mailing or posting the file.
</li><li class="li-itemize">Some encoders add a 62nd character to each encoded line:
sometimes a character looping from “a” to “z” over and over
again. This technique could be used to detect missing lines, but
confuses some decoders.
</li><li class="li-itemize">If the length of the input file is not a multiple of three, some
encoders omit the “unnecessary” characters at the end of the last
data line.
</li><li class="li-itemize">Sometimes, the “empty” data line at the end is omitted, and at
other times, the line is just completely empty (without the
“ ` ”).
</li></ul><p>There is also some confusion about how to properly terminate a line. Most
encoders simply use the convention of the local system (DOS encoders
using CRLF, Unix encoders using LF, Mac encoders using CR), but with
respect to the MIME standard, the encoding library uses CRLF on all
systems. This causes a slight problem with some Unix decoders, which
look for “end” followed directly by LF (as four characters in
total). Such programs report “end not found”, but nevertheless
decode the file correctly.</p>
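<p>A decoder that tolerates the variants listed above might look like the following Python sketch. It is not the UUDeview decoding engine, and the name <span style="font-family:monospace">uu_decode_line</span> is hypothetical. The sketch accepts both the space and the “ ` ” character for the value 0, ignores any extra characters after the expected data, fills in characters that were omitted or stripped at the end of the line, and accepts LF, CR or CRLF line endings.</p>

```python
def uu_decode_line(line: str) -> bytes:
    """Decode one uuencoded data line, tolerating common encoder variants."""
    line = line.rstrip("\r\n")            # accept LF, CR and CRLF endings
    if not line:
        return b""                        # a completely empty "empty" line

    def value(ch: str) -> int:
        # Both ' ' (ASCII 32) and '`' (ASCII 96) map to the value 0.
        return (ord(ch) - 32) & 0x3F

    count = value(line[0])                # number of octets on this line
    groups = (count + 2) // 3
    # Drop extra trailing characters and restore omitted ones as zeros.
    body = line[1:1 + 4 * groups].ljust(4 * groups, "`")
    out = bytearray()
    for i in range(0, len(body), 4):
        a, b, c, d = (value(ch) for ch in body[i:i + 4])
        out.append(((a << 2) | (b >> 4)) & 0xFF)
        out.append(((b << 4) | (c >> 2)) & 0xFF)
        out.append(((c << 6) | d) & 0xFF)
    return bytes(out[:count])             # the prefix decides how much is kept
```

<p>The sketch does not validate its input: characters outside the encoding table are silently folded into the range 0 to 63, which a real decoder would probably want to detect.</p>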
<!--TOC subsubsection id="sec41" Example-->
<h4 id="sec41" class="subsubsection">A.1.7  Example</h4><!--SEC END --><p>This is what our sample text looks like as uuencoded data:</p><pre class="verbatim"><span style="font-size:small">begin 600 test.txt
M5&AI<R!I<R!A('1E<W0@9FEL92!F;W(@:6QL=7-T<F%T:6YG('1H92!V87)I
M;W5S"F5N8V]D:6YG(&UE=&AO9',N($QE="=S(&UA:V4@=&AI<R!T97AT(&QO
M;F=E<B!T:&%N"C4W(&)Y=&5S('1O('=R87`@;&EN97,@=VET:"!"87-E-C0@
E9&%T82P@=&]O+@I'<F5E=&EN9W,L($9R86YK(%!I;&AO9F5R"@``
`
end
</span></pre>
<!--TOC subsection id="sec42" Xxencoding-->
<h3 id="sec42" class="subsection">A.2  Xxencoding</h3><!--SEC END --><p>The xxencoding method was conceived shortly after the initial use of
uuencoding. The first implementations of uuencoding did not realize
the potential problem of using the space character for encoding
data. Before this mistake was worked around with the special case,
another author used a different charset for encoding, composed of
characters available on any system.</p><blockquote class="table"><div class="center"><div class="center"><hr style="width:80%;height:2"></div>
<table border=1 style="border-spacing:0;" class="cellpadding1"><tr><td style="text-align:right;border:solid 1px;white-space:nowrap" >Data Value</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >+0</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >+1</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >+2</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >+3</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >+4</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >+5</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >+6</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >+7 </td></tr>
<tr><td style="text-align:right;border:solid 1px;white-space:nowrap" > 0</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >+</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >-</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >0</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >1</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >2</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >3</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >4</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >5 </td></tr>
<tr><td style="text-align:right;border:solid 1px;white-space:nowrap" > 8</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >6</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >7</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >8</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >9</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >A</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >B</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >C</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >D </td></tr>
<tr><td style="text-align:right;border:solid 1px;white-space:nowrap" > 16</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >E</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >F</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >G</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >H</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >I</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >J</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >K</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >L </td></tr>
<tr><td style="text-align:right;border:solid 1px;white-space:nowrap" > 24</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >M</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >N</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >O</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >P</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >Q</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >R</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >S</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >T </td></tr>
<tr><td style="text-align:right;border:solid 1px;white-space:nowrap" > 32</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >U</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >V</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >W</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >X</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >Y</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >Z</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >a</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >b </td></tr>
<tr><td style="text-align:right;border:solid 1px;white-space:nowrap" > 40</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >c</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >d</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >e</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >f</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >g</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >h</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >i</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >j </td></tr>
<tr><td style="text-align:right;border:solid 1px;white-space:nowrap" > 48</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >k</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >l</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >m</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >n</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >o</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >p</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >q</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >r </td></tr>
<tr><td style="text-align:right;border:solid 1px;white-space:nowrap" > 56</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >s</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >t</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >u</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >v</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >w</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >x</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >y</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >z </td></tr>
</table>
<div class="caption"><table style="border-spacing:6px;border-collapse:separate;" class="cellpading0"><tr><td style="vertical-align:top;text-align:left;" >Table 3: Encoding Table for Xxencoding</td></tr>
</table></div>
<a id="tab-xx"></a>
<div class="center"><hr style="width:80%;height:2"></div></div></blockquote><p>Xxencoding is identical to uuencoding except that it uses a different
mapping of data values into printable characters
(table <a href="#tab-xx">3</a>). Instead of ‘M’, a normal-sized xxencoded line is
prefixed by ‘h’ (note that ‘h’ encodes 45, just as ‘M’ does in uuencoding).
The empty data line at the end consists of a single ‘+’ character. Our
sample file looks like the following:</p><pre class="verbatim"><span style="font-size:small">begin 600 test.txt
hJ4VdQm-dQm-V65FZQrEUNaZgNG-aPr6UOKlgRLBoQa3oOKtb65FcNG-qML7d
hPrJn0aJiMqxYOKtb64pZR4VjN5Ai62lZR0Rn64pVOqIUR4VdQm-oNLVo64lj
hPaRZQW-oO43i0XIr647tR4Jn65Fj65RmML+UP4ZiNLAURqZoO0-0MLBZBXEU
ZN43oMGkUR4xj9Ud5QaJZR4ZiNrAg62NmMKtf63-dP4VjNaJm0U++
+
end
</span></pre>
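<p>The mapping can be illustrated with a short Python sketch. It is only an illustration of table 3 and the three-in-four grouping, not the code used by the library; the names <span style="font-family:monospace">XX_TABLE</span> and <span style="font-family:monospace">xx_encode_line</span> are hypothetical, and padding an incomplete final group with zero octets is an assumption that agrees with the last data line of the sample above.</p>

```python
# Sketch of xxencoding a single line; names are hypothetical, not a library API.
XX_TABLE = "+-0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"


def xx_encode_line(chunk: bytes) -> str:
    """Encode up to 45 octets as one xxencoded line (length prefix + data)."""
    if len(chunk) > 45:
        raise ValueError("a line holds at most 45 octets")
    out = [XX_TABLE[len(chunk)]]                 # length prefix, 'h' for 45 octets
    padded = chunk + b"\0" * (-len(chunk) % 3)   # assumption: pad last group with zeros
    for i in range(0, len(padded), 3):
        group = int.from_bytes(padded[i:i + 3], "big")   # 24 bits per group
        for shift in (18, 12, 6, 0):                     # four 6-bit values
            out.append(XX_TABLE[(group >> shift) & 0x3F])
    return "".join(out)


if __name__ == "__main__":
    print(xx_encode_line(b"Thi"))   # first three octets of the sample text
    print(xx_encode_line(b""))      # the empty data line
```

<p>An empty chunk yields the single character ‘+’, which is the empty data line that precedes “end”.</p>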
<!--TOC subsection id="sec43" Base64 encoding-->
<h3 id="sec43" class="subsection">A.3  Base64 encoding</h3><!--SEC END --><p><em>Base 64</em> is part of the <em>MIME</em> (Multipurpose Internet Mail
Extensions) standard, described in [<a href="#rfc1521">RFC1521</a>], section 5.2. Sometimes,
it is incorrectly referred to as “MIME encoding”; however, the MIME
documents specify much more than just how to encode binary data: they
define a complete framework for attachments within e-mail messages. Being
part of a widely accepted standard, <em>Base64</em> has the advantage
of being the best-specified type of encoding.</p><blockquote class="table"><div class="center"><div class="center"><hr style="width:80%;height:2"></div>
<table border=1 style="border-spacing:0;" class="cellpadding1"><tr><td style="text-align:right;border:solid 1px;white-space:nowrap" >Data Value</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >+0</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >+1</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >+2</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >+3</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >+4</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >+5</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >+6</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >+7 </td></tr>
<tr><td style="text-align:right;border:solid 1px;white-space:nowrap" > 0</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >A</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >B</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >C</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >D</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >E</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >F</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >G</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >H </td></tr>
<tr><td style="text-align:right;border:solid 1px;white-space:nowrap" > 8</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >I</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >J</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >K</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >L</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >M</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >N</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >O</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >P </td></tr>
<tr><td style="text-align:right;border:solid 1px;white-space:nowrap" > 16</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >Q</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >R</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >S</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >T</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >U</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >V</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >W</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >X </td></tr>
<tr><td style="text-align:right;border:solid 1px;white-space:nowrap" > 24</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >Y</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >Z</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >a</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >b</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >c</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >d</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >e</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >f </td></tr>
<tr><td style="text-align:right;border:solid 1px;white-space:nowrap" > 32</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >g</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >h</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >i</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >j</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >k</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >l</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >m</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >n </td></tr>
<tr><td style="text-align:right;border:solid 1px;white-space:nowrap" > 40</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >o</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >p</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >q</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >r</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >s</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >t</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >u</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >v </td></tr>
<tr><td style="text-align:right;border:solid 1px;white-space:nowrap" > 48</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >w</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >x</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >y</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >z</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >0</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >1</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >2</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >3 </td></tr>
<tr><td style="text-align:right;border:solid 1px;white-space:nowrap" > 56</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >4</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >5</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >6</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >7</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >8</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >9</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >+</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >/ </td></tr>
</table>
<div class="caption"><table style="border-spacing:6px;border-collapse:separate;" class="cellpading0"><tr><td style="vertical-align:top;text-align:left;" >Table 4: Encoding Table for Base64 Encoding</td></tr>
</table></div>
<a id="tab-b64"></a>
<div class="center"><hr style="width:80%;height:2"></div></div></blockquote><p>The general concept of three-in-four encoding is the same as with the
previous two types; only a new character table is needed to represent the
values (table <a href="#tab-b64">4</a>). Note that the set of characters in this
table differs from the <em>xxencoding</em> set in only a single
character (‘/’ versus ‘-’), although the characters are assigned to
different values. If a line of encoded data features neither character,
it may be difficult to tell which encoding is used
on the line.</p><p>The <em>Base64</em> encoding does not have “begin” and “end” lines;
such a concept is not needed, because the framework of a <em>MIME</em>
message defines the beginning and end of a part. The encoded data is
defined to be a “stream” of characters, and the decoder is supposed
to ignore any “illegal” characters in the stream (such as line
breaks or other whitespace). Each line must be shorter than 80
characters and terminated with a CRLF sequence. No particular line
length is enforced, but most implementations encode 57 octets into 76
encoded characters. Theoretically, a line might hold 79 characters,
although this would violate the rule of thumb that the line length is
a multiple of four (therefore encoding an integral number of
octets).<sup><a id="text14" href="#note14">14</a></sup></p><p>End-of-file handling when the input length is not a multiple of three
octets is slightly different in <em>Base64</em> encoding than it is in
uuencoding. If one octet is left at the end of the input stream, the
data is padded with 4 zero bits (giving a total of 12 bits) and
encoded into two characters. After that, two equal signs ‘=’ are
written to complete the four character sequence. If two octets are
left, the data is padded with 2 zero bits (giving a total of 18 bits),
and encoded into three characters, after which a single equal sign ‘=’
is written.</p><p>Here’s our sample file in <em>Base64</em>. Note that this text is
<em>only</em> the encoded data. It is not a valid <em>MIME</em>
message. Without the required framework, no proper <em>MIME</em>
software will read it.</p><pre class="verbatim"><span style="font-size:small">VGhpcyBpcyBhIHRlc3QgZmlsZSBmb3IgaWxsdXN0cmF0aW5nIHRoZSB2YXJpb3VzCmVuY29kaW5n
IG1ldGhvZHMuIExldCdzIG1ha2UgdGhpcyB0ZXh0IGxvbmdlciB0aGFuCjU3IGJ5dGVzIHRvIHdy
YXAgbGluZXMgd2l0aCBCYXNlNjQgZGF0YSwgdG9vLgpHcmVldGluZ3MsIEZyYW5rIFBpbGhvZmVy
Cg==
</span></pre><p>For more detailed documentation of <em>Base64</em> encoding and
details of the <em>MIME</em> framework, I suggest reading [<a href="#rfc1521">RFC1521</a>].</p><p>The <em>MIME</em> standard also defines a way to split a message into
multiple parts so that re-assembly of the parts on the remote end is
easily possible. For details, see section 7.3.2, “The Message/Partial
subtype” of the standard.</p>
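<p>The three-in-four grouping and the end-of-file padding described in this section can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the library's implementation: <span style="font-family:monospace">b64_encode</span> and <span style="font-family:monospace">wrap_lines</span> are hypothetical names, and the standard <span style="font-family:monospace">base64</span> module is used only as a cross-check.</p>

```python
import base64

# Character table from table 4; the function names below are hypothetical.
B64_TABLE = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def b64_encode(data: bytes) -> str:
    """Encode octets as one unbroken Base64 string (no line wrapping)."""
    out = []
    for i in range(0, len(data), 3):
        group = data[i:i + 3]
        bits = int.from_bytes(group, "big")
        if len(group) == 3:
            out.extend(B64_TABLE[(bits >> s) & 0x3F] for s in (18, 12, 6, 0))
        elif len(group) == 2:
            bits <<= 2      # 16 data bits + 2 zero bits = 18 bits
            out.extend(B64_TABLE[(bits >> s) & 0x3F] for s in (12, 6, 0))
            out.append("=")
        else:
            bits <<= 4      # 8 data bits + 4 zero bits = 12 bits
            out.extend(B64_TABLE[(bits >> s) & 0x3F] for s in (6, 0))
            out.append("==")
    return "".join(out)


def wrap_lines(encoded: str, width: int = 76):
    """Split the encoded stream into lines of at most `width` characters."""
    return [encoded[i:i + width] for i in range(0, len(encoded), width)]


if __name__ == "__main__":
    sample = b"\n"   # a single left-over octet, as at the end of the sample file
    print(b64_encode(sample))
    print(b64_encode(sample) == base64.b64encode(sample).decode("ascii"))
```

<p>With a width of 76, each full line carries 57 input octets, which is the line length most implementations produce.</p>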
<!--TOC subsection id="sec44" BinHex encoding-->
<h3 id="sec44" class="subsection">A.4  BinHex encoding</h3><!--SEC END --><p>The <em>BinHex</em> encoding originates from the Macintosh environment,
and it takes the special properties of a Macintosh file into
account. There, a file has two parts or “forks”: the “resource”
fork holds machine code, and the “data” fork holds arbitrary
data. For files from other systems, the resource fork is usually empty.</p><p>I have not found a “definitive” definition of the format. My
knowledge is based on two descriptions I found, one from Yves
Lempereur and another from Peter Lewis. A similar description can be
found in [<a href="#rfc1741">RFC1741</a>].</p><blockquote class="table"><div class="center"><div class="center"><hr style="width:80%;height:2"></div>
<table border=1 style="border-spacing:0;" class="cellpadding1"><tr><td style="text-align:right;border:solid 1px;white-space:nowrap" >Data Value</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >+0</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >+1</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >+2</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >+3</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >+4</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >+5</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >+6</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >+7 </td></tr>
<tr><td style="text-align:right;border:solid 1px;white-space:nowrap" > 0</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >!</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >"</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >#</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >$</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >%</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >&</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >'</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >( </td></tr>
<tr><td style="text-align:right;border:solid 1px;white-space:nowrap" > 8</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >)</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >*</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >+</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >,</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >-</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >0</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >1</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >2 </td></tr>
<tr><td style="text-align:right;border:solid 1px;white-space:nowrap" > 16</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >3</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >4</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >5</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >6</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >8</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >9</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >@</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >A </td></tr>
<tr><td style="text-align:right;border:solid 1px;white-space:nowrap" > 24</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >B</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >C</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >D</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >E</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >F</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >G</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >H</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >I </td></tr>
<tr><td style="text-align:right;border:solid 1px;white-space:nowrap" > 32</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >J</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >K</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >L</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >M</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >N</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >P</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >Q</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >R </td></tr>
<tr><td style="text-align:right;border:solid 1px;white-space:nowrap" > 40</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >S</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >T</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >U</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >V</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >X</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >Y</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >Z</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >[ </td></tr>
<tr><td style="text-align:right;border:solid 1px;white-space:nowrap" > 48</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >`</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >a</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >b</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >c</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >d</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >e</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >f</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >h </td></tr>
<tr><td style="text-align:right;border:solid 1px;white-space:nowrap" > 56</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >i</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >j</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >k</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >l</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >m</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >p</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >q</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >r </td></tr>
</table>
<div class="caption"><table style="border-spacing:6px;border-collapse:separate;" class="cellpading0"><tr><td style="vertical-align:top;text-align:left;" >Table 5: Encoding Table for BinHex Encoding</td></tr>
</table></div>
<a id="tab-bh"></a>
<div class="center"><hr style="width:80%;height:2"></div></div></blockquote><p>A <em>BinHex</em> file is a stream of characters, beginning and ending
with a colon ‘:’; intermediate line breaks are to be ignored by the
decoder. Each line but the last should be exactly 64 characters in
length. The last line may be shorter, and in a special case can also
be 65 characters long. The trailing colon must not stand alone, so if
the input data ends on an output line boundary, the colon is appended
to this line as the 65th character. Thus a <em>BinHex</em> stream begins with a
colon in the first column and ends with a colon <em>not</em> in the
first column.</p><p>The line before the beginning of encoded data (before the initial
‘:’) should contain the following verbatim text:<sup><a id="text15" href="#note15">15</a></sup>
</p><blockquote class="quote">
<pre class="verbatim">(This file must be converted with BinHex 4.0)</pre></blockquote><p>
BinHex is another three-in-four encoding, and not surprisingly,
yet another character table is used (table <a href="#tab-bh">5</a>).
The documentation does not explicitly mention what is supposed to
happen if the original input data does not have a multiple of three
octets. But from reading between the lines, it looks like
“unnecessary” characters (those that would result in equal
signs in Base64 encoding) are not printed.</p><blockquote class="table"><div class="center"><div class="center"><hr style="width:80%;height:2"></div>
<table border=1 style="border-spacing:0;" class="cellpadding1"><tr><td style="text-align:center;border:solid 1px;white-space:nowrap" colspan=6>Compressed Data</td><td style="text-align:center;border:solid 1px;white-space:nowrap" > </td><td style="text-align:center;border:solid 1px;white-space:nowrap" colspan=6>Uncompressed Data </td></tr>
<tr><td style="text-align:center;border:solid 1px;white-space:nowrap" >00</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >11</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >22</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >33</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >44</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >55</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >↦</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >00</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >11</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >22</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >33</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >44</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >55 </td></tr>
<tr><td style="text-align:center;border:solid 1px;white-space:nowrap" >11</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >22</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >90</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >04</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >33</td><td style="text-align:center;border:solid 1px;white-space:nowrap" > </td><td style="text-align:center;border:solid 1px;white-space:nowrap" >↦</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >11</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >22</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >22</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >22</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >22</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >33 </td></tr>
<tr><td style="text-align:center;border:solid 1px;white-space:nowrap" >11</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >22</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >90</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >00</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >33</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >44</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >↦</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >11</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >22</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >90</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >33</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >44</td><td style="text-align:center;border:solid 1px;white-space:nowrap" > </td></tr>
<tr><td style="text-align:center;border:solid 1px;white-space:nowrap" >2B</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >90</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >00</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >90</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >04</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >55</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >↦</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >2B</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >90</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >90</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >90</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >90</td><td style="text-align:center;border:solid 1px;white-space:nowrap" >55 </td></tr>
</table>
<div class="caption"><table style="border-spacing:6px;border-collapse:separate;" class="cellpading0"><tr><td style="vertical-align:top;text-align:left;" >Table 6: BinHex RLE decoding</td></tr>
</table></div>
<a id="bh-rle"></a>
<div class="center"><hr style="width:80%;height:2"></div></div></blockquote><p>The encoded characters decode into an RLE-compressed bytestream, which
must be handled in the next step (of course, decoding and
decompressing are usually handled at the same time). Run Length
Encoding simply replaces multiple consecutive occurrences of one octet
by the octet itself, a special marker, and the repetition
count. BinHex uses the marker <span style="font-family:monospace">0x90</span> (octal <span style="font-family:monospace">0220</span>,
decimal <span style="font-family:monospace">144</span>). The octet sequence <span style="font-family:monospace">0xff</span> <span style="font-family:monospace">0x90</span>
<span style="font-family:monospace">0x04</span> would decompress into four times <span style="font-family:monospace">0xff</span>. If the
marker itself occurs, it must be “escaped” by the special sequence
<span style="font-family:monospace">0x90</span> <span style="font-family:monospace">0x00</span> (the marker with a repetition count of
0). Table <a href="#bh-rle">6</a> shows four more examples. Note the last
example, where the marker itself is repeated.</p><blockquote class="figure"><div class="center"><div class="center"><hr style="width:80%;height:2"></div>
<pre class="verbatim">                  1     n     1     4     4     2     4     4     2
Header Section    n     Name  0     Type  Auth  Flag  Dlen  Rlen  HC

                  Dlen          2
Data Section      Data Fork     DC

                  Rlen          2
Resource Section  Resource Fork RC
</pre>
<div class="caption"><table style="border-spacing:6px;border-collapse:separate;" class="cellpading0"><tr><td style="vertical-align:top;text-align:left;" >Figure 5: BinHex file structure</td></tr>
</table></div>
<a id="bh-parts"></a>
<div class="center"><hr style="width:80%;height:2"></div></div></blockquote><p>The decompression results in a data stream which consists of three
parts, the header section, the data fork and the resource fork. Figure
<a href="#bh-parts">5</a> shows how the sections are composed. The numbers above
each item indicate its size in octets. The header has the following
items:
</p><dl class="description"><dt class="dt-description">
<span style="font-weight:bold">n</span></dt><dd class="dd-description"> The length of the filename in octets. This is a single octet,
so the maximum length of a filename is 255.
</dd><dt class="dt-description"><span style="font-weight:bold">Name</span></dt><dd class="dd-description"> The filename, <em>n</em> octets in length. The length does
not include the final nullbyte (which is actually the next
item).<sup><a id="text16" href="#note16">16</a></sup>
</dd><dt class="dt-description"><span style="font-weight:bold">0</span></dt><dd class="dd-description"> This single nullbyte terminates the previous filename.
</dd><dt class="dt-description"><span style="font-weight:bold">Type</span></dt><dd class="dd-description"> The Macintosh file type.
</dd><dt class="dt-description"><span style="font-weight:bold">Auth</span></dt><dd class="dd-description"> The Macintosh “creator”, the program which wrote the
original file. This and the previous item are used to start the right
program to edit or display a file. I have no idea what common values
are.
</dd><dt class="dt-description"><span style="font-weight:bold">Flags</span></dt><dd class="dd-description"> Macintosh file flags. No idea what they are.
</dd><dt class="dt-description"><span style="font-weight:bold">Dlen</span></dt><dd class="dd-description"> The number of octets in the data fork.
</dd><dt class="dt-description"><span style="font-weight:bold">Rlen</span></dt><dd class="dd-description"> The number of octets in the resource fork.
</dd><dt class="dt-description"><span style="font-weight:bold">HC</span></dt><dd class="dd-description"> CRC checksum of the header data.
</dd></dl><p>After the header, at offset <span style="font-style:italic">n</span>+22, follow the <em>Dlen</em> octets of
the data fork and a CRC checksum of the data fork (offset
<span style="font-style:italic">n</span>+<span style="font-style:italic">Dlen</span>+22), then <em>Rlen</em> octets of the resource
fork (offset <span style="font-style:italic">n</span>+<span style="font-style:italic">Dlen</span>+24) and a CRC checksum of the resource fork
(offset <span style="font-style:italic">n</span>+<span style="font-style:italic">Dlen</span>+<span style="font-style:italic">Rlen</span>+24). Note that the CRCs are present even if
the forks are empty.</p><p>The three CRC checksums are calculated as described in the following
text, taken from Peter Lewis’ description:
</p><blockquote class="quote">
BinHex 4.0 uses a 16-bit CRC with a 0x1021 seed. The general algorithm is
to take data 1 bit at a time and process it through the following:
<ol class="enumerate" type=1><li class="li-enumerate">
Take the old CRC (use 0x0000 if there is no previous CRC) and shift it
to the left by 1.
</li><li class="li-enumerate">Put the new data bit in the least significant position (right bit).
</li><li class="li-enumerate">If the bit shifted out in (1) was a 1 then xor the CRC with 0x1021.
</li><li class="li-enumerate">Loop back to (1) until all the data has been processed.
</li></ol>
</blockquote><p>This is the sample file in <em>BinHex</em>. However, the encoder I used
replaced the LF characters from the original file with CR
characters. It probably noticed that the input file was plain text and
reformatted it to Mac-style text, but I consider this a software
bug. The assigned filename is “test.txt”.</p><pre class="verbatim"><span style="font-size:small">(This file must be converted with BinHex 4.0)
:#&4&8e3Z9&K8!&4&@&4dG(Kd!!!!!!#X!!!!!+3j9'KTFb"TFb"K)(4PFh3JCQP
XC5"QEh)JD@aXGA0dFQ&dD@jR)(4SC5"fBA*TEh9c$@9ZBfpND@jR)'ePG'K[C(-
Z)%aPG#Gc)'eKDf8JG'KTFb"dCAKd)'a[EQGPFL"dD'&Z$68h)'*jG'9c)(4[)(G
bBA!JE'PZCA-JGfPdD#"#BA0P0M3JC'&dB5`JG'p[,Je(FQ9PG'PZCh-X)%CbB@j
V)&"TE'K[CQ9b$B0A!!!!:
</span></pre>
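<p>As a rough illustration of the layout and checksum rules above, the following C sketch computes the four offsets from <em>n</em>, <em>Dlen</em> and <em>Rlen</em> and runs the quoted bit-by-bit CRC procedure. The names <span style="font-family:monospace">binhex_offsets</span> and <span style="font-family:monospace">binhex_crc</span> are made up for this example and are not part of the library. Two points are assumptions that the text above does not state: each octet is fed most significant bit first, and two zero octets are pushed through after the data, standing in for the CRC field itself.</p>

```c
#include <stddef.h>
#include <stdint.h>

/* Offsets of the parts that follow the header (hypothetical helper type). */
struct binhex_layout {
    size_t data;      /* first octet of the data fork     */
    size_t data_crc;  /* CRC of the data fork             */
    size_t rsrc;      /* first octet of the resource fork */
    size_t rsrc_crc;  /* CRC of the resource fork         */
};

/* n is the filename length; dlen and rlen are the two fork lengths. */
struct binhex_layout binhex_offsets(size_t n, size_t dlen, size_t rlen)
{
    struct binhex_layout off;
    off.data     = n + 22;
    off.data_crc = n + dlen + 22;
    off.rsrc     = n + dlen + 24;
    off.rsrc_crc = n + dlen + rlen + 24;
    return off;
}

/* Feed one octet through steps 1 to 3, assuming MSB-first bit order. */
static uint16_t binhex_crc_octet(uint16_t crc, uint8_t octet)
{
    for (int bit = 7; bit >= 0; bit--) {
        int shifted_out = (crc & 0x8000) != 0;     /* bit lost in step 1 */
        crc = (uint16_t)((crc << 1) | ((octet >> bit) & 1)); /* steps 1, 2 */
        if (shifted_out)
            crc ^= 0x1021;                         /* step 3 */
    }
    return crc;
}

/* CRC over len octets, starting from 0x0000 as in the quoted text. */
uint16_t binhex_crc(const uint8_t *data, size_t len)
{
    uint16_t crc = 0x0000;
    for (size_t i = 0; i < len; i++)
        crc = binhex_crc_octet(crc, data[i]);
    /* Assumption: the two CRC octets themselves are processed as zeros. */
    crc = binhex_crc_octet(crc, 0);
    crc = binhex_crc_octet(crc, 0);
    return crc;
}
```

<p>With <em>n</em> = 8, as for “test.txt”, and illustrative fork lengths of 172 and 0 octets, the data fork starts at offset 30 and its CRC at offset 202, while the empty resource fork and its CRC both sit at offset 204. For empty input the sketch returns 0x0000.</p>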
<!--TOC subsection id="sec45" Quoted-Printable-->
<h3 id="sec45" class="subsection">A.5  Quoted-Printable</h3><!--SEC END --><p>The <em>Quoted-Printable</em> encoding is, like <em>Base64</em>, part of the
<em>MIME</em> standard, described in [<a href="#rfc1521">RFC1521</a>]. It is not suitable
for encoding arbitrary binary data, but is intended for “data that
largely consists of octets that correspond to printable characters”.
It is widely used in countries with an extended character set, where
characters like the German ‘ä’ or ‘ß’ are represented by
non-ASCII octets with the highest bit set.</p><p>The essence of the encoding is that arbitrary octets can be
represented by an equal sign ‘=’ followed by two hexadecimal
digits. The equal sign itself, for example, is encoded as “=3D”.</p><p>Quoted-Printable enforces a maximum line length of 76
characters. Longer lines can be wrapped using soft line breaks. If the
last character of an encoded line is an equal sign, the following line
break is to be ignored.</p><p>It would indeed be possible to transfer arbitrary binary data using
this encoding, but care must be taken with line breaks, which are
converted from native format on the sender’s side and back into native
format on the recipient’s side. However, the native representations
may differ. This alternative is hardly worth considering, though, since
for arbitrary data <em>Quoted-Printable</em> is substantially less
efficient than <em>Base64</em>: every escaped octet occupies three characters, whereas Base64 spends four characters on three octets.</p><p>Please refer to the original document, [<a href="#rfc1521">RFC1521</a>], for a complete
discussion of the encoding.</p><p>Here is what the example file could look like in Quoted-Printable
encoding.</p><pre class="verbatim"><span style="font-size:small">This is a test file for =
illustrating the various
encoding methods=2e=20=
Let=27s make this text =
longer than
=357 bytes to wrap lines =
with Base64 data=2c too=2e
Greetings=2c Frank Pilhofer
</span></pre><!--BEGIN NOTES document-->
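<p>The two rules described in this section, the “=” escape with two hexadecimal digits and the soft line break, can be sketched as a small decoder. This is a minimal illustration, not the library's decoder; <span style="font-family:monospace">qp_decode</span> and <span style="font-family:monospace">hex_value</span> are hypothetical names. It accepts lower-case hexadecimal digits because the example above uses them, and it copies a malformed “=” sequence through unchanged, which is a choice made for this sketch.</p>

```c
#include <stddef.h>

/* Value of one hexadecimal digit, or -1 if c is not a hex digit. */
static int hex_value(int c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/*
 * Decode NUL-terminated Quoted-Printable text from in into out.
 * out must have room for strlen(in) + 1 octets. Returns the number
 * of octets written, not counting the terminating NUL.
 */
size_t qp_decode(const char *in, char *out)
{
    size_t n = 0;

    while (*in != '\0') {
        if (in[0] == '=' && in[1] == '\n') {
            in += 2;                 /* soft line break: drop "=" and LF */
        } else if (in[0] == '=' && in[1] == '\r' && in[2] == '\n') {
            in += 3;                 /* soft line break with CR LF */
        } else if (in[0] == '=' && hex_value(in[1]) >= 0
                                && hex_value(in[2]) >= 0) {
            out[n++] = (char)(hex_value(in[1]) * 16 + hex_value(in[2]));
            in += 3;                 /* "=XX" stands for one octet */
        } else {
            out[n++] = *in++;        /* literal character */
        }
    }
    out[n] = '\0';
    return n;
}
```

<p>Fed the fragment “Let=27s”, the sketch yields “Let's”, and “=357 bytes” becomes “57 bytes”, since “=35” is the digit ‘5’.</p>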
<hr class="footnoterule"><dl class="thefootnotes"><dt class="dt-thefootnotes">
<a id="note1" href="#text1">1</a></dt><dd class="dd-thefootnotes"><div class="footnotetext">The
Microsoft compilers offer the <em>QuickWin</em> target to allow
terminal-oriented programs to run in the Windows environment.</div></dd><dt class="dt-thefootnotes"><a id="note2" href="#text2">2</a></dt><dd class="dd-thefootnotes"><div class="footnotetext">Actually, most project-oriented systems compile
the project definitions into a Makefile for use by the back-ends.</div></dd><dt class="dt-thefootnotes"><a id="note3" href="#text3">3</a></dt><dd class="dd-thefootnotes"><div class="footnotetext">It is not intended that this and the previous
error levels will ever be used. Currently, there’s no need to include
handling for them.</div></dd><dt class="dt-thefootnotes"><a id="note4" href="#text4">4</a></dt><dd class="dd-thefootnotes"><div class="footnotetext">This
happens if, in a MIME multipart posting, the final boundary cannot be
found. After searching the boundary until the end-of-file, the scanner
resets itself to the location of the previous boundary.</div></dd><dt class="dt-thefootnotes"><a id="note5" href="#text5">5</a></dt><dd class="dd-thefootnotes"><div class="footnotetext">This value should
only appear internally, never to be seen by an application.</div></dd><dt class="dt-thefootnotes"><a id="note6" href="#text6">6</a></dt><dd class="dd-thefootnotes"><div class="footnotetext">Of course, this option wouldn’t make sense
with single-part files, since there’s no “grouping” involved that
might fail.</div></dd><dt class="dt-thefootnotes"><a id="note7" href="#text7">7</a></dt><dd class="dd-thefootnotes"><div class="footnotetext">Strictly
speaking, the memory is of course limited. But try to fill a sensible
amount with structures in the 100-byte region.</div></dd><dt class="dt-thefootnotes"><a id="note8" href="#text8">8</a></dt><dd class="dd-thefootnotes"><div class="footnotetext">If we don’t have permission to overwrite the
target file, an I/O error is generated.</div></dd><dt class="dt-thefootnotes"><a id="note9" href="#text9">9</a></dt><dd class="dd-thefootnotes"><div class="footnotetext">Actually, only the definition of <span style="font-family:monospace">UUEXPORT</span>
is needed. You could omit <span style="font-family:monospace">&lt;config.h&gt;</span> and define this value
elsewhere, for example in the project definitions.</div></dd><dt class="dt-thefootnotes"><a id="note10" href="#text10">10</a></dt><dd class="dd-thefootnotes"><div class="footnotetext">With Microsoft compilers on MS-DOS systems,
don’t forget to link with <span style="font-family:monospace">setargv.obj</span> to properly handle
wildcards.</div></dd><dt class="dt-thefootnotes"><a id="note11" href="#text11">11</a></dt><dd class="dd-thefootnotes"><div class="footnotetext">The term “octet” is used here instead of “byte”,
since it more accurately reflects the 8-bit nature of what we
usually call a “byte”.</div></dd><dt class="dt-thefootnotes"><a id="note12" href="#text12">12</a></dt><dd class="dd-thefootnotes"><div class="footnotetext">… that is not always respected by old
encoders.</div></dd><dt class="dt-thefootnotes"><a id="note13" href="#text13">13</a></dt><dd class="dd-thefootnotes"><div class="footnotetext">Of course, encoded files must be split on line
boundaries instead of at a fixed byte count.</div></dd><dt class="dt-thefootnotes"><a id="note14" href="#text14">14</a></dt><dd class="dd-thefootnotes"><div class="footnotetext">Yes, there <em>are</em> files violating this
assumption.</div></dd><dt class="dt-thefootnotes"><a id="note15" href="#text15">15</a></dt><dd class="dd-thefootnotes"><div class="footnotetext">In fact, this
text is <em>required</em> by certain decoding software.</div></dd><dt class="dt-thefootnotes"><a id="note16" href="#text16">16</a></dt><dd class="dd-thefootnotes"><div class="footnotetext">The Filename may contain certain characters that are
invalid on MS-DOS systems, like space characters.</div></dd></dl>
<!--END NOTES-->
<!--CUT END -->
<!--HTMLFOOT-->
<!--ENDHTML-->
<!--FOOTER-->
<hr style="height:2"><blockquote class="quote"><em>This document was translated from L<sup>A</sup>T<sub>E</sub>X by
</em><a href="http://hevea.inria.fr/index.html"><em>H</em><em><span style="font-size:small"><sup>E</sup></span></em><em>V</em><em><span style="font-size:small"><sup>E</sup></span></em><em>A</em></a><em>.</em></blockquote></body>
</html>