/usr/lib/swi-prolog/doc/Manual/syntax.html

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">

<HTML>
<HEAD>
<TITLE>SWI-Prolog 5.11.18 Reference Manual: Section 2.15</TITLE><LINK REL=home HREF="index.html">
<LINK REL=contents HREF="Contents.html">
<LINK REL=index HREF="DocIndex.html">
<LINK REL=summary HREF="summary.html">
<LINK REL=previous HREF="gc.html">
<LINK REL=next HREF="cyclic.html">
<STYLE type="text/css">
/* Style sheet for SWI-Prolog latex2html
*/

dd.defbody
{ margin-bottom: 1em;
}

dt.pubdef
{ background-color: #c5e1ff;
}

dt.multidef
{ background-color: #c8ffc7;
}

.bib dd
{ margin-bottom: 1em;
}

.bib dt
{ float: left;
margin-right: 1.3ex;
}

pre.code
{ margin-left: 1.5em;
margin-right: 1.5em;
border: 1px dotted;
padding-top: 5px;
padding-left: 5px;
padding-bottom: 5px;
background-color: #f8f8f8;
}

div.navigate
{ text-align: center;
background-color: #f0f0f0;
border: 1px dotted;
padding: 5px;
}

div.title
{ text-align: center;
padding-bottom: 1em;
font-size: 200%;
font-weight: bold;
}

div.author
{ text-align: center;
font-style: italic;
}

div.abstract
{ margin-top: 2em;
background-color: #f0f0f0;
border: 1px dotted;
padding: 5px;
margin-left: 10%; margin-right:10%;
}

div.abstract-title
{ text-align: center;
padding: 5px;
font-size: 120%;
font-weight: bold;
}

div.toc-h1
{ font-size: 200%;
font-weight: bold;
}

div.toc-h2
{ font-size: 120%;
font-weight: bold;
margin-left: 2em;
}

div.toc-h3
{ font-size: 100%;
font-weight: bold;
margin-left: 4em;
}

div.toc-h4
{ font-size: 100%;
margin-left: 6em;
}

span.sec-nr
{
}

span.sec-title
{
}

span.pred-ext
{ font-weight: bold;
}

span.pred-tag
{ float: right;
padding-top: 0.2em;
font-size: 80%;
font-style: italic;
color: #202020;
}

/* Footnotes */

sup.fn { color: blue; text-decoration: underline; }
span.fn-text { display: none; }
sup.fn span {display: none;}
sup:hover span
{ display: block !important;
position: absolute; top: auto; left: auto; width: 80%;
color: #000; background: white;
border: 2px solid;
padding: 5px; margin: 10px; z-index: 100;
font-size: smaller;
}
</STYLE>
</HEAD>
<BODY BGCOLOR="white">
<DIV class="navigate"><A class="nav" href="index.html"><IMG SRC="home.gif" BORDER=0 ALT="Home"></A>
<A class="nav" href="Contents.html"><IMG SRC="index.gif" BORDER=0 ALT="Contents"></A>
<A class="nav" href="DocIndex.html"><IMG SRC="yellow_pages.gif" BORDER=0 ALT="Index"></A>
<A class="nav" href="summary.html"><IMG SRC="info.gif" BORDER=0 ALT="Summary"></A>
<A class="nav" href="gc.html"><IMG SRC="prev.gif" BORDER=0 ALT="Previous"></A>
<A class="nav" href="cyclic.html"><IMG SRC="next.gif" BORDER=0 ALT="Next"></A>
</DIV>

<H2><A NAME="sec:2.15"><SPAN class="sec-nr">2.15</SPAN> <SPAN class="sec-title">Syntax 
Notes</SPAN></A></H2>

<A NAME="sec:syntax"></A>

<P>SWI-Prolog syntax is close to ISO-Prolog standard syntax, which is 
closely compatible with Edinburgh Prolog syntax. A description of this 
syntax can be found in the Prolog books referenced in the introduction. 
Below are some non-standard or non-common constructs that are accepted 
by SWI-Prolog:

<P>
<UL class="latex">
<LI><I><CODE>/* ... /* ... */ ... */</CODE></I><BR>
The <CODE>/* ... */</CODE> comment statement can be nested. This is 
useful if some code with <CODE>/* ... */</CODE> comment statements in it 
should be commented out.
</UL>

<H3><A NAME="sec:2.15.1"><SPAN class="sec-nr">2.15.1</SPAN> <SPAN class="sec-title">ISO 
Syntax Support</SPAN></A></H3>

<A NAME="sec:isosyntax"></A>

<P>SWI-Prolog offers ISO compatible extensions to the Edinburgh syntax.

<H4><A NAME="sec:2.15.1.1"><SPAN class="sec-nr">2.15.1.1</SPAN> <SPAN class="sec-title">Processor 
Character Set</SPAN></A></H4>

<A NAME="sec:processorcharset"></A>

<P><A NAME="idx:ISOLatin1:208"></A><A NAME="idx:characterset:209"></A>The 
processor character set specifies the class of each character used for 
parsing Prolog source text. Character classification is fixed to use 
UCS/Unicode as provided by the C-library <CODE>wchar_t</CODE> based 
primitives. See also <A class="sec" href="widechars.html">section 2.17</A>.

<H4><A NAME="sec:2.15.1.2"><SPAN class="sec-nr">2.15.1.2</SPAN> <SPAN class="sec-title">Character 
Escape Syntax</SPAN></A></H4>

<A NAME="sec:charescapes"></A>

<P>Within quoted atoms (using single quotes: <CODE>'&lt;atom&gt;'</CODE>) 
special characters are represented using escape-sequences. An escape 
sequence is lead in by the backslash (<CODE><CODE>\</CODE></CODE>) 
character. The list of escape sequences is compatible with the ISO 
standard, but contains one extension and the interpretation of 
numerically specified characters is slightly more flexible to improve 
compatibility.

<DL class="latex">
<DT><CODE>\a</CODE></DT>
<DD class="defbody">
Alert character. Normally the ASCII character 7 (beep).
</DD>
<DT><CODE>\b</CODE></DT>
<DD class="defbody">
Backspace character.
</DD>
<DT><CODE>\c</CODE></DT>
<DD class="defbody">
No output. All input characters up to but not including the first 
non-layout character are skipped. This allows for the specification of 
pretty-looking long lines. For compatibility with Quintus Prolog. Not 
supported by ISO. Example:

<PRE class="code">
format('This is a long line that looks better if it was \c
       split across multiple physical lines in the input')
</PRE>

</DD>
<DT><CODE>\&lt;<VAR><font size=-1>RETURN</font></VAR>&gt;</CODE></DT>
<DD class="defbody">
No output. Skips input to the next non-layout character or to the end of 
the next line. ISO demands skipping only the newline. We advice to use <CODE>\c</CODE> 
or put the layout <EM>before</EM> the <CODE><CODE>\</CODE></CODE>, as 
shown below. Using <CODE>\c</CODE> is supported by various other Prolog 
implementations and will remain supported by SWI-Prolog. The style shown 
below is the most compatible solution.<SUP class="fn">7<SPAN class="fn-text">Future 
versions are likely to interpret <CODE><CODE>\</CODE></CODE>&lt;<VAR>return</VAR>&gt; 
according to ISO.</SPAN></SUP>

<PRE class="code">
format('This is a long line that looks better if it was \
split across multiple physical lines in the input')
</PRE>

<P>instead of

<PRE class="code">
format('This is a long line that looks better if it was\
 split across multiple physical lines in the input')
</PRE>

</DD>
<DT><CODE>\e</CODE></DT>
<DD class="defbody">
Escape character (<font size=-1>ASCII</font> 27).
</DD>
<DT><CODE>\f</CODE></DT>
<DD class="defbody">
Form-feed character.
</DD>
<DT><CODE>\n</CODE></DT>
<DD class="defbody">
Next-line character.
</DD>
<DT><CODE>\r</CODE></DT>
<DD class="defbody">
Carriage-return only (i.e., go back to the start of the line).
</DD>
<DT><CODE>\t</CODE></DT>
<DD class="defbody">
Horizontal tab-character.
</DD>
<DT><CODE>\v</CODE></DT>
<DD class="defbody">
Vertical tab-character (<font size=-1>ASCII</font> 11).
</DD>
<DT><CODE>\<CODE>xXX..\</CODE></CODE></DT>
<DD class="defbody">
Hexadecimal specification of a character. The closing <CODE>\</CODE> is 
obligatory according to the ISO standard, but optional in SWI-Prolog to 
enhance compatibility with the older Edinburgh standard. The code
<CODE>\xa\3</CODE> emits the character 10 (hexadecimal `a') followed by 
`3'. Characters specified this way are interpreted as Unicode 
characters. See also <CODE>\u</CODE>.
</DD>
<DT><CODE>\uXXXX</CODE></DT>
<DD class="defbody">
Unicode character specification where the character is specified using
<EM>exactly</EM> 4 hexadecimal digits. This is an extension to the ISO 
standard fixing two problems. First of all, where <CODE>\x</CODE> 
defines a numeric character code, it doesn't specify the character set 
in which the character should be interpreted. Second, it is not needed 
to use the idiosyncratic closing <CODE><CODE>\</CODE></CODE> ISO Prolog 
syntax.
</DD>
<DT><CODE>\UXXXXXXXX</CODE></DT>
<DD class="defbody">
Same as <CODE>\uXXXX</CODE>, but using 8 digits to cover the whole 
Unicode set.
</DD>
<DT><CODE>\40</CODE></DT>
<DD class="defbody">
Octal character specification. The rules and remarks for hexadecimal 
specifications apply to octal specifications as well.
</DD>
<DT><CODE>\&lt;<VAR>character</VAR>&gt;</CODE></DT>
<DD class="defbody">
Any character immediately preceded by a <CODE><CODE>\</CODE></CODE> and 
not covered by the above escape sequences is copied verbatim. Thus, <CODE>'\\'</CODE> 
is an atom consisting of a single <CODE><CODE>\</CODE></CODE> and <CODE>'\''</CODE> 
and <CODE>''''</CODE> both describe the atom with a single&nbsp;<CODE>'</CODE>.
</DD>
</DL>

<P>Character escaping is only available if the
<CODE>current_prolog_flag(character_escapes, true)</CODE> is active 
(default). See <A NAME="idx:currentprologflag2:210"></A><A class="pred" href="flags.html#current_prolog_flag/2">current_prolog_flag/2</A>. 
Character escapes conflict with <A NAME="idx:writef2:211"></A><A class="pred" href="format.html#writef/2">writef/2</A> 
in two ways: <CODE>\40</CODE> is interpreted as decimal 40 by <A NAME="idx:writef2:212"></A><A class="pred" href="format.html#writef/2">writef/2</A>, 
but character escapes handling by read has already interpreted as 32 (40 
octal). Also, <CODE>\l</CODE> is translated to a single `l'. It is 
advised to use the more widely supported <A NAME="idx:format23:213"></A><A class="pred" href="format.html#format/2">format/[2,3]</A> 
predicate instead. If you insist upon using <A NAME="idx:writef2:214"></A><A class="pred" href="format.html#writef/2">writef/2</A>, 
either switch <A class="flag" href="flags.html#flag:character_escapes">character_escapes</A> 
to
<CODE>false</CODE>, or use double <CODE>\\</CODE>, as in <CODE>writef('\\l')</CODE>.

<H4><A NAME="sec:2.15.1.3"><SPAN class="sec-nr">2.15.1.3</SPAN> <SPAN class="sec-title">Syntax 
for non-decimal numbers</SPAN></A></H4>

<A NAME="sec:nondecsyntax"></A>

<P>SWI-Prolog implements both Edinburgh and ISO representations for 
non-decimal numbers. According to Edinburgh syntax, such numbers are 
written as <CODE>&lt;<VAR>radix</VAR>&gt;'&lt;number&gt;</CODE>, where &lt;<VAR>radix</VAR>&gt; 
is a number between 2 and 36. ISO defines binary, octal and hexadecimal 
numbers using
<CODE>0<EM>[bxo]</EM>&lt;<VAR>number</VAR>&gt;</CODE>. For example: <CODE>A is 0b100 \/ 0xf00</CODE> 
is a valid expression. Such numbers are always unsigned.

<H4><A NAME="sec:2.15.1.4"><SPAN class="sec-nr">2.15.1.4</SPAN> <SPAN class="sec-title">Unicode 
Prolog source</SPAN></A></H4>

<A NAME="sec:unicodesyntax"></A>

<P>The ISO standard specifies the Prolog syntax in ASCII characters. As 
SWI-Prolog supports Unicode in source files we must extend the syntax. 
This section describes the implication for the source files, while 
writing international source files is described in <A class="sec" href="projectfiles.html">section 
3.1.3</A>.

<P>The SWI-Prolog Unicode character classification is based on version 
6.0.0 of the Unicode standard. Please note that <A NAME="idx:chartype2:215"></A><A class="pred" href="chartype.html#char_type/2">char_type/2</A> 
and friends, intended to be used with all text except Prolog source code 
is based on the C-library locale-based classification routines.

<P>
<UL class="latex">
<LI><I>Quoted atoms and strings</I><BR>
Any character of any script can be used in quoted atoms and strings. The 
escape sequences <CODE>\uXXXX</CODE> and <CODE>\UXXXXXXXX</CODE> (see
<A class="sec" href="syntax.html">section 2.15.1.2</A>) were introduced 
to specify Unicode code points in ASCII files.

<P>
<LI><I>Atoms and Variables</I><BR>
We handle them in one item as they are closely related. The Unicode 
standard defines a syntax for identifiers in computer languages.<SUP class="fn">8<SPAN class="fn-text"><A class="url" href="http://www.unicode.org/reports/tr31/">http://www.unicode.org/reports/tr31/</A></SPAN></SUP> 
In this syntax identifiers start with <CODE>ID_Start</CODE> followed by 
a sequence of <CODE>ID_Continue</CODE> codes. Such sequences are handled 
as a single token in SWI-Prolog. The token is a <EM>variable</EM> iff it 
starts with an uppercase character or an underscore (<CODE>_</CODE>). 
Otherwise it is an atom. Note that many languages do not have the notion 
of character-case. In such languages variables <EM>must</EM> be written 
as
<CODE>_name</CODE>.

<P>
<LI><I>White space</I><BR>
All characters marked as separators (Z*) in the Unicode tables are 
handled as layout characters.

<P>
<LI><I>Control and unassigned characters</I><BR>
Control and unassigned (C*) characters produce a syntax error if 
encountered outside quoted atoms/strings and outside comments.

<P>
<LI><I>Other characters</I><BR>
The first 128 characters follow the ISO Prolog standard. Unicode symbol 
and punctuation characters (general category S* and P*) act as glueing 
symbol characters (i.e., just like <CODE><CODE>==</CODE></CODE>: an 
unquoted sequence of symbol characters are combined into an atom).

<P>Other characters (this is mainly <CODE>No</CODE>: <I>a numeric 
character of other type</I>) are currently handled as `solo'.
</UL>

<H4><A NAME="sec:2.15.1.5"><SPAN class="sec-nr">2.15.1.5</SPAN> <SPAN class="sec-title">Singleton 
variable checking</SPAN></A></H4>

<A NAME="sec:singleton"></A>

<P><A NAME="idx:singletonvariable:216"></A><A NAME="idx:anonymousvariable:217"></A>A <EM>singleton 
variable</EM> is a variable that appears only one time in a clause. It 
can always be replaced by <CODE>_</CODE>, the
<EM>anonymous</EM> variable. In some cases however people prefer to give 
the variable a name. As mistyping a variable is a common mistake, Prolog 
systems generally give a warning (controlled by <A NAME="idx:stylecheck1:218"></A><A class="pred" href="debugger.html#style_check/1">style_check/1</A>) 
if a variable is used only once. The system can be informed a variable 
is known to appear once by <EM>starting</EM> it with an underscore. E.g. <CODE>_Name</CODE>. 
Please note that any variable, except plain <CODE>_</CODE> shares with 
variables of the same name. The term <CODE>t(_X, _X)</CODE> is 
equivalent to <CODE>t(X, X)</CODE>, which is <EM>different</EM> from
<CODE>t(_, _)</CODE>.

<P>As Unicode requires variables to start with an underscore in many 
languages this schema needs to be extended.<SUP class="fn">9<SPAN class="fn-text">After 
a proposal by Richard O'Keefe.</SPAN></SUP> First we define the two 
classes of named variables.

<P>
<UL class="latex">
<LI><I>Named singleton variables</I><BR>
Named singletons start with a double underscore (<CODE>__</CODE>) or a 
single underscore followed by an uppercase letter. E.g. <CODE>__var</CODE> 
or
<CODE>_Var</CODE>.

<P>
<LI><I>Normal variables</I><BR>
All other variables are `normal' variables. Note this makes <CODE>_var</CODE> 
a normal variable.<SUP class="fn">10<SPAN class="fn-text">Some Prolog 
dialects write variables this way.</SPAN></SUP>
</UL>

<P>Any normal variable appearing exactly once in the clause <EM>and</EM> 
any named singleton variables appearing more than once are reported. 
Below are some examples with warnings in the right column. Singleton 
messages can be suppressed using the <A NAME="idx:stylecheck1:219"></A><A class="pred" href="debugger.html#style_check/1">style_check/1</A> 
directive.

<P>
<CENTER>
<TABLE BORDER=2 FRAME=box RULES=groups>
<TR VALIGN=top><TD>test(_).</TD></TR>
<TR VALIGN=top><TD>test(_a).</TD><TD>Singleton variables: [_a] </TD></TR>
<TR VALIGN=top><TD>test(_12).</TD><TD>Singleton variables: [_12] </TD></TR>
<TR VALIGN=top><TD>test(A).</TD><TD>Singleton variables: [A] </TD></TR>
<TR VALIGN=top><TD>test(_A).</TD></TR>
<TR VALIGN=top><TD>test(__a).</TD></TR>
<TR VALIGN=top><TD>test(_, _).</TD></TR>
<TR VALIGN=top><TD>test(_a, _a).</TD></TR>
<TR VALIGN=top><TD>test(__a, __a).</TD><TD>Singleton-marked variables 
appearing more than once: [__a] </TD></TR>
<TR VALIGN=top><TD>test(_A, _A).</TD><TD>Singleton-marked variables 
appearing more than once: [_A] </TD></TR>
<TR VALIGN=top><TD>test(A, A).</TD></TR>
</TABLE>

</CENTER>

<P></BODY></HTML>
swi-prolog-nox 5.10.4-3ubuntu1 / usr / lib / swi-prolog / doc / Manual / syntax.html