/usr/share/doc/libxml-perl/sax-2.0-adv.html is in libxml-perl 0.08-2.

<!-- $Id: sax-2.0-adv.html,v 1.6 2001/11/09 18:19:48 darobin Exp $ -->
<html>
  <head>
    <title>Advanced Features of the Perl SAX 2.0 Binding</title>
    <meta name="keywords" content="XML SGML SAX Perl libxml libxml-perl" />
  </head>
  <body>

<h1>Advanced SAX</h1>

<p>The classes, methods, and features described below are
not commonly used in most applications and can be ignored by most
users. If however you find that you are not getting the granularity
you expect from Basic SAX, this would be the place to look for more.
Advanced SAX isn't advanced in the sense that it is harder, or requires
better programming skills. It is simply more complete, and has been
separated to keep Basic SAX simple in terms of the number of events
one would have to deal with.
</p>

<h2><a name="Parsers">SAX Parsers</a></h2>

<p>SAX supports several classes of event handlers: content handlers,
declaration handlers, DTD handlers, error handlers, entity resolvers,
and other extensions.  For each class of events, a separate handler
can be used to handle those events.  If a handler is not defined for a
class of events, then the default handler, <tt>Handler</tt>, is used.
Each of these handlers is described in the sections below.
Applications may change an event handler in the middle of the parse
and the SAX parser will begin using the new handler immediately.</p>

<p>SAX's basic interface defines methods for parsing system
identifiers (URIs), open files, and strings.  Behind the scenes,
though, SAX uses a <tt>Source</tt> hash that contains that
information, plus encoding, system and public identifiers if
available.  These are described below under the <tt>Source</tt>
option.</p>

<p>SAX parsers accept all features as options to the <tt>parse()</tt>
methods and on the parser's constructor.  Features are described in
the next section.</p>

<p>
<dl><dt><b><tt class='function'>parse</tt></b>(<var>options</var>)</dt>
<dd>
Parses the XML instance identified by the <tt>Source</tt> option.
<var>options</var> can be a list of option, value pairs or a hash.
<tt>parse()</tt> returns the result of calling the
<tt>end_document()</tt> handler.</dd></dl></p>

<p>
<dl><dt><b><tt>ContentHandler</tt></b></dt>
<dd>
Object to receive document content events.  The
<tt>ContentHandler</tt>, with additional events defined below, is the
class of events described in <a href="sax-2.0.html#BasicHandler">Basic
SAX Handler</a>. If the application does not register a content handler
or content event handlers on the default handler, content events
reported by the SAX parser will be silently ignored.</dd></dl></p>

<p>
<dl><dt><b><tt>DTDHandler</tt></b></dt>
<dd>
Object to receive basic DTD events.  If the application does not
register a DTD handler or DTD event handlers on the default handler,
DTD events reported by the SAX parser will be silently
ignored.</dd></dl></p>

<p>
<dl><dt><b><tt>EntityResolver</tt></b></dt>
<dd>
Object to resolve external entities.  If the application does not
register an entity resolver or entity events on the default handler,
the SAX parser will perform its own default resolution.</dd></dl></p>

<p>
<dl><dt><b><tt>ErrorHandler</tt></b></dt>
<dd>
Object to receive error-message events.  If the application does not
register an error handler or error event handlers on the default
handler, all error events reported by the SAX parser will be silently
ignored; however, normal processing may not continue. It is highly
recommended that all SAX applications implement an error handler to
avoid unexpected bugs.</dd></dl></p>

<p>
<dl><dt><b><tt>Source</tt></b></dt>
<dd>
A hash containing information about the XML instance to be parsed.
See <a href="#InputSources">Input Sources</a> below. Note that
<tt>Source</tt> cannot be changed during the parse.</dd></dl></p>

<p>
	<dl>
		<dt><strong><tt>Features</tt></strong></dt>
		<dd>
			A hash containing Feature information, as described below.
			Features can be set at runtime, but setting them directly on
			the Features hash is unreliable: the parser gets no chance to
			inspect the new value, so it cannot report errors or reject
			Features that it does not support. Use the
			<code>set_feature()</code> method instead.
		</dd>
	</dl>
</p>
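
<p>As a minimal sketch of the two calling conventions accepted by
<tt>parse()</tt> (a list of option, value pairs or a hash), the helper
below folds both into one hash. The name <tt>normalize_options</tt> and
the file name <tt>doc.xml</tt> are hypothetical and are not part of any
SAX parser; a real parser may handle its arguments differently.</p>

<pre>
use strict;
use warnings;

# Hypothetical helper: accept either a hash reference or a flat list of
# option/value pairs and return a single hash reference.
sub normalize_options {
    my @args = @_;
    if (@args == 1 and ref($args[0]) eq 'HASH') {
        my %copy = %{ $args[0] };
        return \%copy;
    }
    die "odd number of option/value arguments\n" if @args % 2;
    my %options = @args;
    return \%options;
}

# Both calls describe the same parse request.
my $from_list = normalize_options(Source => { SystemId => 'doc.xml' });
my $from_hash = normalize_options({ Source => { SystemId => 'doc.xml' } });

print "list form: $from_list->{Source}{SystemId}\n";
print "hash form: $from_hash->{Source}{SystemId}\n";
</pre>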


<h2><a name="Features">Features</a></h2>

<p>Features are as defined in <a
href="http://sax.sourceforge.net/apidoc/org/xml/sax/package-summary.html#package_description">SAX2: Features
and Properties</a>, but not of course limited to those. You may add
your own Features. Also, Java has an artificial distinction between
Features and Properties which is unnecessary. In Perl, both have been
merged under the same name.
</p>

<p>Features can be passed as options when creating a parser or calling
a <tt>parse()</tt> method. They may also be set using the
<tt>set_feature()</tt> method.
</p>

<pre>
    $parser = AnySAXParser->new(
                                Features => {
                                             'http://xml.org/sax/features/namespaces' => 0,
                                             },
                                );
    $parser->parse(
                   Features => {
                               'http://xml.org/sax/features/namespaces' => 0,
                               },
                   );
    $parser->set_feature('http://xml.org/sax/properties/xml-string', 1);
    $string = $parser->get_feature('http://xml.org/sax/properties/xml-string');
</pre>

<p>
	When performing namespace processing, Perl SAX parsers always provide
	both the raw tag name in <tt>Name</tt> and the namespace names in
	<tt>NamespaceURI</tt>, <tt>LocalName</tt>, and <tt>Prefix</tt>.
	Therefore, the
	"<tt>http://xml.org/sax/features/namespace-prefixes</tt>" Feature is
	ignored.
</p>

<p>
	Also, Features are things that are supposed to be <strong>turned
	on</strong>, and thus should normally be off by default, especially if
	the parser doesn't support turning them off. Due to backwards
	compatibility problems, the one exception to this rule is the
	"<tt>http://xml.org/sax/features/namespaces</tt>" Feature which is on by
	default and which a number of parsers may not be able to turn off. Thus,
	a parser claiming to support this Feature (and all SAX2 parsers must
	support it) may in fact only support turning it on. This is only a minor
	problem, as turning it off basically amounts to returning to SAX1, which
	can be accomplished by a filter (e.g. XML::Filter::SAX2toSAX1).
</p>

<p>
  In addition to the Features described in the SAX spec
  itself, a number of new ones may be defined for Perl. An example of
  this would be http://xmlns.perl.org/sax/node-factory which
  when supported by the parser would be settable to a NodeFactory object
  that would be in charge of creating SAX nodes different from those that
  are normally received by event handlers. See
  <a href='http://xmlns.perl.org/'>http://xmlns.perl.org/</a> (currently
  in alpha state) for details on how to register Features.
</p>

<p>
	The following methods are used to get and set features:
</p>

<p>
<dl><dt><b><tt class='function'>get_feature</tt></b>(<var>name</var>)</dt>
<dd>
Look up the value of a feature.

<p>The feature name is any fully-qualified URI.  It is possible for a
SAX parser to recognize a feature name but to be unable to return its
value; this is especially true in the case of an adapter for a SAX1
Parser, which has no way of knowing whether the underlying parser is
validating, for example.</p>

<p>Some feature values may be available only in specific contexts,
such as before, during, or after a parse.</p>

<tt>get_feature()</tt> returns the value of the feature, which is usually
either a boolean or an object, and will throw
<tt>XML::SAX::Exception::NotRecognized</tt> when the SAX parser does not
recognize the feature name and <tt>XML::SAX::Exception::NotSupported</tt>
when the SAX parser recognizes the feature name but cannot determine its
value at this time.</dd></dl></p>

<p>
<dl><dt><b><tt class='function'>set_feature</tt></b>(<var>name</var>,
<var>value</var>)</dt>
<dd>
Set the state of a feature.

<p>The feature name is any fully-qualified URI. It is possible for a
SAX parser to recognize a feature name but to be unable to set its
value; this is especially true in the case of an adapter for a SAX1
Parser, which has no way of affecting whether the underlying parser is
validating, for example.</p>

<p>Some feature values may be immutable or mutable only in specific
contexts, such as before, during, or after a parse.</p>

<tt>set_feature()</tt> will throw <tt>XML::SAX::Exception::NotRecognized</tt>
when the SAX parser does not recognize the feature name and
<tt>XML::SAX::Exception::NotSupported</tt> when the SAX parser recognizes the
feature name but cannot set the requested value.

<p>
	This method is also the standard mechanism for setting extended handlers,
	such as "<code>http://xml.org/sax/handlers/DeclHandler</code>".
</p>
</dd></dl></p>
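
<p>The contract of <tt>get_feature()</tt> and <tt>set_feature()</tt> can
be sketched with a toy parser class. Everything in the <tt>Sketch::</tt>
namespace is hypothetical: the two exception classes merely stand in for
<tt>XML::SAX::Exception::NotRecognized</tt> and
<tt>XML::SAX::Exception::NotSupported</tt> so that the example needs
nothing beyond core Perl, and the choice of supported Features is an
assumption made for illustration.</p>

<pre>
use strict;
use warnings;

# Hypothetical stand-ins for the XML::SAX::Exception classes.
package Sketch::Exception;
sub new     { my ($class, %args) = @_; return bless { %args }, $class }
sub message { return $_[0]{Message} }

package Sketch::NotRecognized; our @ISA = ('Sketch::Exception');
package Sketch::NotSupported;  our @ISA = ('Sketch::Exception');

package Sketch::Parser;

my $NS         = 'http://xml.org/sax/features/namespaces';
my $VALIDATION = 'http://xml.org/sax/features/validation';

sub new {
    my ($class) = @_;
    # Features this toy parser recognizes, with their current values.
    return bless { Features => { $NS => 1, $VALIDATION => 0 } }, $class;
}

sub get_feature {
    my ($self, $name) = @_;
    die Sketch::NotRecognized->new(Message => "unknown feature $name")
        unless exists $self->{Features}{$name};
    return $self->{Features}{$name};
}

sub set_feature {
    my ($self, $name, $value) = @_;
    die Sketch::NotRecognized->new(Message => "unknown feature $name")
        unless exists $self->{Features}{$name};
    # The toy parser cannot validate, so it recognizes the name but
    # refuses to turn the Feature on.
    die Sketch::NotSupported->new(Message => "cannot enable $name")
        if $name eq $VALIDATION and $value;
    $self->{Features}{$name} = $value;
    return;
}

package main;

my $parser = Sketch::Parser->new;
$parser->set_feature($NS, 0);
print "namespaces: ", $parser->get_feature($NS), "\n";

# Trap the exceptions instead of letting them end the program.
my @failures;
for my $attempt ([$VALIDATION, 1], ['http://example.com/unknown', 1]) {
    eval { $parser->set_feature(@$attempt); 1 }
        or push @failures, ref($@) . ': ' . $@->message;
}
print "$_\n" for @failures;
</pre>

<p>Going through <tt>set_feature()</tt> rather than writing to the hash
is what gives the parser the opportunity to raise these two
exceptions.</p>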


<p>
	<dl>
		<dt><strong><code class='function'>get_features</code></strong>()</dt>
		<dd>
			Look up all Features that this parser claims to support.
			<p>
				This method returns a hash of Features which the parser
				claims to support. The value of the hash is currently
				unspecified though it may be used later. This method is meant
				to be inherited so that Features supported by the base parser
				class (XML::SAX::Base) are declared to be supported by
				subclasses.
			</p>
			<p>
				Calling this method is probably only moderately useful to end
				users. It is mostly meant for use by XML::SAX, so that it can
				query parsers for Feature support and return an appropriate
				parser depending on the Features that are required.
			</p>
		</dd>
	</dl>
</p>



<h2><a name="InputSources">Input Sources</a></h2>

<p>Input sources may be provided to parser objects or are returned by
entity resolvers.  An input source is a hash with these
properties:</p>

<dl>
<dt><b><tt>PublicId</tt></b></dt>
<dd>The public identifier of this input source.

<p>The public identifier is always optional: if the application writer
includes one, it will be provided as part of the location
information.</p></dd>

<dt><b><tt>SystemId</tt></b></dt>
<dd>The system identifier (URI) of this input source.

<p>The system identifier is optional if there is a byte stream or a
character stream, but it is still useful to provide one, since the
application can use it to resolve relative URIs and can include it in
error messages and warnings (the parser will attempt to open a
connection to the URI only if there is no byte stream or character
stream specified).</p>

If the application knows the character encoding of the object
pointed to by the system identifier, it can register the encoding
using the <tt>Encoding</tt> property.</dd>

<dt><b><tt>ByteStream</tt></b></dt>
<dd>The byte stream for this input source.

<p>The SAX parser will ignore this if there is also a character stream
specified, but it will use a byte stream in preference to opening a
URI connection itself.</p>

If the application knows the character encoding of the byte stream, it
should set the <tt>Encoding</tt> property.</dd>

<dt><b><tt>CharacterStream</tt></b></dt>
<dd>The character stream for this input source.

<p>If there is a character stream specified, the SAX parser will
ignore any byte stream and will not attempt to open a URI connection
to the system identifier.</p>

<p>Note: A CharacterStream is a filehandle that does not need any encoding
translation done on it. This is implemented as a regular filehandle
and only works under Perl 5.7.2 or higher using PerlIO. To get a single
character, or number of characters from it, use the perl core read()
function. To get a single byte from it (or number of bytes), you can
use sysread(). The encoding of the stream should be in the Encoding
entry for the Source.</p>

</dd>

<dt><b><tt>Encoding</tt></b></dt>
<dd>The character encoding, if known.

<p>The encoding must be a string acceptable for an XML encoding
declaration (see section 4.3.3 of the XML 1.0 recommendation).</p>

This property has no effect when the application provides a character
stream.</dd>
</dl>
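
<p>The precedence rules above (a character stream wins over a byte
stream, which in turn wins over opening the system identifier) can be
summarised in a few lines. The helper name <tt>choose_input</tt> and the
example URIs are hypothetical; this is only a sketch of the selection
logic, not the code of any particular parser.</p>

<pre>
use strict;
use warnings;

# Hypothetical helper: report which entry of an input source hash a
# parser would read from, following the precedence described above.
sub choose_input {
    my ($source) = @_;
    return 'CharacterStream' if defined $source->{CharacterStream};
    return 'ByteStream'      if defined $source->{ByteStream};
    return 'SystemId'        if defined $source->{SystemId};
    die "input source has nothing to read from\n";
}

# STDIN stands in for any open filehandle.
my %with_stream = (
    SystemId   => 'http://example.com/doc.xml',
    ByteStream => \*STDIN,
    Encoding   => 'UTF-8',
);
my %uri_only = (SystemId => 'http://example.com/doc.xml');

my $first  = choose_input(\%with_stream);
my $second = choose_input(\%uri_only);
print "$first\n$second\n";
</pre>

<p>In the first case the system identifier is still worth supplying,
since it can be used to resolve relative URIs and appears in error
messages.</p>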

<h2><a name="Handlers">SAX Handlers</a></h2>

<p>SAX supports several classes of event handlers: content handlers,
declaration handlers, DTD handlers, error handlers, entity resolvers,
and other extensions.  This section defines each of these classes of
events.</p>

<h3>Content Events</h3>

<p>This is the main interface that most SAX applications implement: if
the application needs to be informed of basic parsing events, it
implements this interface and registers an instance with the SAX
parser using the <tt>ContentHandler</tt> property. The parser uses
the instance to report basic document-related events like the start
and end of elements and character data.</p>

<p>The order of events in this interface is very important, and
mirrors the order of information in the document itself. For example,
all of an element's content (character data, processing instructions,
and/or subelements) will appear, in order, between the
<tt>start_element</tt> event and the corresponding
<tt>end_element</tt> event.</p>


<p>
<dl><dt><b><tt class='function'>set_document_locator</tt></b>(<var>locator</var>)</dt>
<dd>
Receive an object for locating the origin of SAX document events.

<p>SAX parsers are strongly encouraged (though not absolutely
required) to supply a locator: if a parser does so, it must supply the
locator to the application by invoking this method before invoking any
of the other methods in the ContentHandler interface.</p>

<p>The locator allows the application to determine the end position of
any document-related event, even if the parser is not reporting an
error.  Typically, the application will use this information for
reporting its own errors (such as character content that does not
match an application's business rules).  The information provided by
the locator is probably not sufficient for use with a search
engine.</p>

<p>Note that the locator will provide correct information only during
the invocation of the events in this interface. The application should
not attempt to use it at any other time.</p>

<p>The locator is a hash with these properties:</p>

<blockquote>
<table>
<tr><td><b><tt>ColumnNumber</tt></b></td>
<td>The column number at which the current document event
ends.</td></tr>
<tr><td><b><tt>LineNumber</tt></b></td>
<td>The line number at which the current document event
ends.</td></tr>
<tr><td><b><tt>PublicId</tt></b></td>
<td>The public identifier of the entity containing the current
document event.</td></tr>
<tr><td><b><tt>SystemId</tt></b></td>
<td>The system identifier of the entity containing the current
document event.</td></tr>
</table>
</blockquote></dd>
</dl></p>
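
<p>A handler that keeps the locator and uses it to report its own errors
might look like the sketch below. The class name
<tt>Sketch::LocatingHandler</tt> and the "numeric text only" rule are
hypothetical, and the example assumes that the parser updates the
locator hash in place before each event; the two assignments to
<tt>%locator</tt> stand in for a real parser doing so.</p>

<pre>
use strict;
use warnings;

package Sketch::LocatingHandler;

sub new { return bless { Problems => [] }, shift }

# Keep a reference to the locator for use in later events.
sub set_document_locator {
    my ($self, $locator) = @_;
    $self->{Locator} = $locator;
    return;
}

# Application-level rule (made up): character data must be numeric.
sub characters {
    my ($self, $chars) = @_;
    return if $chars->{Data} =~ /^\d+$/;
    my $loc  = $self->{Locator} || {};
    my $line = defined $loc->{LineNumber}   ? $loc->{LineNumber}   : '?';
    my $col  = defined $loc->{ColumnNumber} ? $loc->{ColumnNumber} : '?';
    push @{ $self->{Problems} },
        "non-numeric text ending at line $line, column $col";
    return;
}

package main;

my $handler = Sketch::LocatingHandler->new;
my %locator = (LineNumber => 1, ColumnNumber => 1, SystemId => 'doc.xml');
$handler->set_document_locator(\%locator);

# Stand-in for a parser: move the locator, then fire the event.
@locator{qw(LineNumber ColumnNumber)} = (3, 14);
$handler->characters({ Data => '42' });
@locator{qw(LineNumber ColumnNumber)} = (4, 9);
$handler->characters({ Data => 'n/a' });

print "$_\n" for @{ $handler->{Problems} };
</pre>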

<p>
<dl><dt><b><tt class='function'>start_prefix_mapping</tt></b>(<var>mapping</var>)</dt>
<dd>
Begin the scope of a prefix-URI Namespace mapping.

<p>The information from this event is not necessary for normal
Namespace processing: the SAX XML reader will automatically replace
prefixes for element and attribute names when the
"<tt>http://xml.org/sax/features/namespaces</tt>" feature is true (the
default).</p>

<p>There are cases, however, when applications need to use prefixes in
character data or in attribute values, where they cannot safely be
expanded automatically; the start/end_prefix_mapping event supplies the
information to the application to expand prefixes in those contexts
itself, if necessary.</p>

<p>Note that <tt>start</tt>/<tt>end_prefix_mapping()</tt> events are
not guaranteed to be properly nested relative to each other: all
<tt>start_prefix_mapping()</tt> events will occur before the
corresponding <tt>start_element()</tt> event, and all
<tt>end_prefix_mapping()</tt> events will occur after the corresponding
<tt>end_element()</tt> event, but their order is not otherwise
guaranteed.
</p>

<p><var>mapping</var> is a hash with these properties:</p>

<blockquote>
<table>
<tr><td><b><tt>Prefix</tt></b></td>
<td>The Namespace prefix being declared.</td></tr>
<tr><td><b><tt>NamespaceURI</tt></b></td>
<td>The Namespace URI the prefix is mapped to.</td></tr>
</table>
</blockquote></dd>
</dl></p>

<p>
<dl><dt><b><tt class='function'>end_prefix_mapping</tt></b>(<var>mapping</var>)</dt>
<dd>
End the scope of a prefix-URI mapping.

<p>See <tt>start_prefix_mapping()</tt> for details. This event will
always occur after the corresponding <tt>end_element</tt> event, but
the order of <tt>end_prefix_mapping</tt> events is not otherwise
guaranteed.</p>

<p><var>mapping</var> is a hash with this property:</p>

<blockquote>
<table>
<tr><td><b><tt>Prefix</tt></b></td>
<td>The Namespace prefix that was being mapped.</td></tr>
</table>
</blockquote></dd>
</dl></p>
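
<p>To illustrate why an application might want these events, the sketch
below tracks prefix mappings and expands a prefixed name of the kind
found in an attribute value. The class <tt>Sketch::PrefixTracker</tt>,
its <tt>expand()</tt> method, and the <tt>{URI}local</tt> output
notation are hypothetical choices made for this example.</p>

<pre>
use strict;
use warnings;

package Sketch::PrefixTracker;

sub new { return bless { Scopes => {} }, shift }

# Each prefix gets its own stack, so a redeclaration in a nested element
# shadows the outer mapping until the matching end event arrives.
sub start_prefix_mapping {
    my ($self, $mapping) = @_;
    push @{ $self->{Scopes}{ $mapping->{Prefix} } }, $mapping->{NamespaceURI};
    return;
}

# The end event carries only the Prefix.
sub end_prefix_mapping {
    my ($self, $mapping) = @_;
    pop @{ $self->{Scopes}{ $mapping->{Prefix} } };
    return;
}

# Expand "prefix:local" using the mappings currently in scope.
# Returns undef for unprefixed names and for unmapped prefixes.
sub expand {
    my ($self, $qname) = @_;
    my ($prefix, $local) = $qname =~ /^([^:]+):(.+)$/ or return;
    my $stack = $self->{Scopes}{$prefix} or return;
    return unless @$stack;
    return '{' . $stack->[-1] . '}' . $local;
}

package main;

my $tracker = Sketch::PrefixTracker->new;
$tracker->start_prefix_mapping({
    Prefix       => 'xs',
    NamespaceURI => 'http://www.w3.org/2001/XMLSchema',
});
my $inside = $tracker->expand('xs:string');
$tracker->end_prefix_mapping({ Prefix => 'xs' });
my $outside = $tracker->expand('xs:string');

print "in scope:     $inside\n";
print "out of scope: ", (defined $outside ? $outside : 'undef'), "\n";
</pre>

<p>Keeping one stack per prefix, rather than a single stack of all
mappings, means the tracker does not depend on the relative order of
<tt>end_prefix_mapping</tt> events for different prefixes.</p>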

<p>
<dl><dt><b><tt class='function'>processing_instruction</tt></b>(<var>pi</var>)</dt>
<dd>
Receive notification of a processing instruction.

<p>The Parser will invoke this method once for each processing
instruction found: note that processing instructions may occur before
or after the main document element.</p>

<p>A SAX parser should never report an XML declaration (XML 1.0,
section 2.8) or a text declaration (XML 1.0, section 4.3.1) using this
method.</p>

<p><var>pi</var> is a hash with these properties:</p>

<blockquote>
<table>
<tr><td><b><tt>Target</tt></b></td>
<td>The processing instruction target.</td></tr>
<tr><td><b><tt>Data</tt></b></td>
<td>The processing instruction data, or <tt>undef</tt> if none was
supplied.</td></tr>
</table>
</blockquote></dd>
</dl></p>

<p>
<dl><dt><b><tt class='function'>skipped_entity</tt></b>(<var>entity</var>)</dt>
<dd>
Receive notification of a skipped entity.

<p>The Parser will invoke this method once for each entity skipped.
Non-validating processors may skip entities if they have not seen the
declarations (because, for example, the entity was declared in an
external DTD subset). All processors may skip external entities,
depending on the values of the
"<tt>http://xml.org/sax/features/external-general-entities</tt>" and the
"<tt>http://xml.org/sax/features/external-parameter-entities</tt>"
Features.</p>

<p><var>entity</var> is a hash with these properties:</p>

<blockquote>
<table>
<tr><td><b><tt>Name</tt></b></td>
<td>The name of the skipped entity. If it is a parameter
entity, the name will begin with '<tt>%</tt>'.</td></tr>
</table>
</blockquote></dd>
</dl></p>

<h3>Declaration Events</h3>

<p>This is an optional extension handler for SAX2 to provide
information about DTD declarations in an XML document. XML readers are
not required to support this handler.</p>

<p>Note that data-related DTD declarations (unparsed entities and
notations) are already reported through the DTDHandler interface.</p>

<p>If you are using the declaration handler together with a lexical
handler, all of the events will occur between the <tt>start_dtd</tt>
and the <tt>end_dtd</tt> events.</p>

<p>To set a separate DeclHandler for an XML reader, set the
"<tt>http://xml.org/sax/handlers/DeclHandler</tt>" Feature with the
object to receive declaration events.  If the reader does not support
declaration events, it will throw an <tt>XML::SAX::Exception::NotRecognized</tt>
or an <tt>XML::SAX::Exception::NotSupported</tt> when you attempt to register
the handler.  Declaration event handlers on the default handler are
automatically recognized and used.</p>


<p>
<dl><dt><b><tt class='function'>element_decl</tt></b>(<var>element</var>)</dt>
<dd>
Report an element type declaration.

<p>The content model will consist of the string "EMPTY", the string
"ANY", or a parenthesised group, optionally followed by an occurrence
indicator. The model will be normalized so that all whitespace is
removed, and will include the enclosing parentheses.</p>

<p><var>element</var> is a hash with these properties:</p>

<blockquote>
<table>
<tr><td><b><tt>Name</tt></b></td>
<td>The element type name.</td></tr>
<tr><td><b><tt>Model</tt></b></td>
<td>The content model as a normalized string.</td></tr>
</table>
</blockquote></dd>
</dl></p>

<p>
<dl><dt><b><tt class='function'>attribute_decl</tt></b>(<var>attribute</var>)</dt>
<dd>
Report an attribute type declaration.

<p>Only the effective (first) declaration for an attribute will be
reported.  The type will be one of the strings "<tt>CDATA</tt>",
"<tt>ID</tt>", "<tt>IDREF</tt>", "<tt>IDREFS</tt>",
"<tt>NMTOKEN</tt>", "<tt>NMTOKENS</tt>", "<tt>ENTITY</tt>",
"<tt>ENTITIES</tt>", or "<tt>NOTATION</tt>", or a parenthesized token
group with the separator "<tt>|</tt>" and all whitespace removed.</p>

<p><var>attribute</var> is a hash with these properties:</p>

<blockquote>
<table>
<tr><td><b><tt>eName</tt></b></td>
<td>The name of the associated element.</td></tr>
<tr><td><b><tt>aName</tt></b></td>
<td>The name of the attribute.</td></tr>
<tr><td><b><tt>Type</tt></b></td>
<td>A string representing the attribute type.</td></tr>
<tr><td><b><tt>ValueDefault</tt></b></td>
<td>A string representing the attribute default ("<tt>#IMPLIED</tt>",
"<tt>#REQUIRED</tt>", or "<tt>#FIXED</tt>") or undef if none of these
applies.</td></tr>
<tr><td><b><tt>Value</tt></b></td>
<td>A string representing the attribute's default value, or
<tt>undef</tt> if there is none.</td></tr>
</table>
</blockquote></dd>
</dl></p>

<p>
<dl><dt><b><tt class='function'>internal_entity_decl</tt></b>(<var>entity</var>)</dt>
<dd>
Report an internal entity declaration.

<p>Only the effective (first) declaration for each entity will be
reported.</p>

<p><var>entity</var> is a hash with these properties:</p>

<blockquote>
<table>
<tr><td><b><tt>Name</tt></b></td>
<td>The name of the entity. If it is a parameter entity, the name will
begin with '%'.</td></tr>
<tr><td><b><tt>Value</tt></b></td>
<td>The replacement text of the entity.</td></tr>
</table>
</blockquote></dd>
</dl></p>

<p>
<dl><dt><b><tt class='function'>external_entity_decl</tt></b>(<var>entity</var>)</dt>
<dd>
Report a parsed external entity declaration.

<p>Only the effective (first) declaration for each entity will be
reported.</p>

<p><var>entity</var> is a hash with these properties:</p>

<blockquote>
<table>
<tr><td><b><tt>Name</tt></b></td>
<td>The name of the entity. If it is a parameter entity, the name will
begin with '%'.</td></tr>
<tr><td><b><tt>PublicId</tt></b></td>
<td>The public identifier of the entity, or <tt>undef</tt> if none was
declared.</td></tr>
<tr><td><b><tt>SystemId</tt></b></td>
<td>The system identifier of the entity.</td></tr>
</table>
</blockquote></dd>
</dl></p>
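
<p>A declaration handler is an ordinary object with the methods listed
above. The sketch below collects content models, attribute defaults, and
internal entities into hashes. The class name
<tt>Sketch::DeclCollector</tt> and the sample declarations are made up
for illustration, and the events are fired by hand rather than by a
parser.</p>

<pre>
use strict;
use warnings;

package Sketch::DeclCollector;

sub new {
    return bless { Elements => {}, Defaults => {}, Entities => {} }, shift;
}

sub element_decl {
    my ($self, $element) = @_;
    $self->{Elements}{ $element->{Name} } = $element->{Model};
    return;
}

# Record only attributes that carry a default value.
sub attribute_decl {
    my ($self, $attribute) = @_;
    return unless defined $attribute->{Value};
    $self->{Defaults}{ $attribute->{eName} }{ $attribute->{aName} }
        = $attribute->{Value};
    return;
}

sub internal_entity_decl {
    my ($self, $entity) = @_;
    $self->{Entities}{ $entity->{Name} } = $entity->{Value};
    return;
}

package main;

my $decls = Sketch::DeclCollector->new;

# Events a parser might report for a small DTD; the values are made up.
$decls->element_decl({ Name => 'note', Model => '(to,body)' });
$decls->attribute_decl({ eName => 'note', aName => 'lang', Type => 'CDATA',
                         ValueDefault => undef, Value => 'en' });
$decls->attribute_decl({ eName => 'note', aName => 'id', Type => 'ID',
                         ValueDefault => '#REQUIRED', Value => undef });
$decls->internal_entity_decl({ Name => 'copy', Value => '(c)' });

print "note model: $decls->{Elements}{note}\n";
print "note lang default: $decls->{Defaults}{note}{lang}\n";
</pre>

<p>With a parser that supports declaration events, such an object would
be registered by passing it to <tt>set_feature()</tt> under the
"<tt>http://xml.org/sax/handlers/DeclHandler</tt>" Feature, as described
above.</p>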

<h3>DTD Events</h3>

<p>If a SAX application needs information about notations and unparsed
entities, then the application implements this interface.  The parser
uses the instance to report notation and unparsed entity declarations
to the application.</p>

<p>The SAX parser may report these events in any order, regardless of
the order in which the notations and unparsed entities were declared;
however, all DTD events must be reported after the document handler's
<tt>start_document()</tt> event, and before the first
<tt>start_element()</tt> event.</p>

<p>It is up to the application to store the information for future use
(perhaps in a hash table or object tree). If the application
encounters attributes of type "<tt>NOTATION</tt>", "<tt>ENTITY</tt>",
or "<tt>ENTITIES</tt>", it can use the information that it obtained
through this interface to find the entity and/or notation
corresponding with the attribute value.</p>

<p>
<dl><dt><b><tt class='function'>notation_decl</tt></b>(<var>notation</var>)</dt>
<dd>
Receive notification of a notation declaration event.

<p>It is up to the application to record the notation for later
reference, if necessary.</p>

<p>If a system identifier is present, and it is a URL, the SAX parser
must resolve it fully before passing it to the application.</p>

<p><var>notation</var> is a hash with these properties:</p>

<blockquote>
<table>
<tr><td><b><tt>Name</tt></b></td>
<td>The notation name.</td></tr>
<tr><td><b><tt>PublicId</tt></b></td>
<td>The public identifier of the notation, or <tt>undef</tt> if none was
declared.</td></tr>
<tr><td><b><tt>SystemId</tt></b></td>
<td>The system identifier of the notation, or <tt>undef</tt> if none was
declared.</td></tr>
</table>
</blockquote></dd>
</dl></p>

<p>
<dl><dt><b><tt class='function'>unparsed_entity_decl</tt></b>(<var>entity</var>)</dt>
<dd>
Receive notification of an unparsed entity declaration event.

<p>Note that the notation name corresponds to a notation reported by
the <tt>notation_decl()</tt> event. It is up to the application to
record the entity for later reference, if necessary.</p>

<p>If the system identifier is a URL, the parser must resolve it fully
before passing it to the application.</p>

<p><var>entity</var> is a hash with these properties:</p>

<blockquote>
<table>
<tr><td><b><tt>Name</tt></b></td>
<td>The unparsed entity's name.</td></tr>
<tr><td><b><tt>PublicId</tt></b></td>
<td>The public identifier of the entity, or <tt>undef</tt> if none was
declared.</td></tr>
<tr><td><b><tt>SystemId</tt></b></td>
<td>The system identifier of the entity.</td></tr>
<tr><td><b><tt>Notation</tt></b></td>
<td>The name of the associated notation.</td></tr>
</table>
</blockquote></dd>
</dl></p>
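
<p>Since it is up to the application to store notation and unparsed
entity declarations, a DTD handler usually amounts to two small lookup
tables. The following is a sketch under that assumption; the class name
<tt>Sketch::DTDStore</tt>, the <tt>lookup_entity()</tt> method, and the
example URIs are all hypothetical.</p>

<pre>
use strict;
use warnings;

package Sketch::DTDStore;

sub new { return bless { Notations => {}, Entities => {} }, shift }

sub notation_decl {
    my ($self, $notation) = @_;
    $self->{Notations}{ $notation->{Name} } = $notation;
    return;
}

sub unparsed_entity_decl {
    my ($self, $entity) = @_;
    $self->{Entities}{ $entity->{Name} } = $entity;
    return;
}

# Map the value of an ENTITY attribute to the entity's system identifier
# and the system identifier of its notation.
sub lookup_entity {
    my ($self, $name) = @_;
    my $entity   = $self->{Entities}{$name} or return;
    my $notation = $self->{Notations}{ $entity->{Notation} } || {};
    return ($entity->{SystemId}, $notation->{SystemId});
}

package main;

my $dtd = Sketch::DTDStore->new;

# Declarations may arrive in any order, so the entity comes first here.
$dtd->unparsed_entity_decl({
    Name     => 'logo',
    PublicId => undef,
    SystemId => 'http://example.com/logo.gif',
    Notation => 'gif',
});
$dtd->notation_decl({
    Name     => 'gif',
    PublicId => undef,
    SystemId => 'http://example.com/viewers/gif',
});

my ($file, $viewer) = $dtd->lookup_entity('logo');
print "entity:   $file\n";
print "notation: $viewer\n";
</pre>

<p>The notation is looked up only when the entity is requested, which is
why the order in which the two declarations are reported does not matter
to this handler.</p>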

<h3>Entity Resolver</h3>

<p>If a SAX application needs to implement customized handling for
external entities, it must implement this interface.</p>

<p>The parser will then allow the application to intercept any
external entities (including the external DTD subset and external
parameter entities, if any) before including them.</p>

<p>
  Many SAX applications will not need to implement this interface,
  but it will be especially useful for applications that build XML
  documents from databases or other specialised input sources, or for
  applications that use URI types that are either not URLs, or that
  have schemes unknown to the parser.
</p>

<p>
<dl><dt><b><tt class='function'>resolve_entity</tt></b>(<var>entity</var>)</dt>
<dd>
Allow the application to resolve external entities.

<p>The Parser will call this method before opening any external entity
except the top-level document entity (including the external DTD
subset, external entities referenced within the DTD, and external
entities referenced within the document element): the application may
request that the parser resolve the entity itself, that it use an
alternative URI, or that it use an entirely different input
source.</p>

<p>Application writers can use this method to redirect external system
identifiers to secure and/or local URIs, to look up public identifiers
in a catalogue, or to read an entity from a database or other input
source (including, for example, a dialog box).</p>

<p>If the system identifier is a URL, the SAX parser must resolve it
fully before reporting it to the application.</p>

<p><var>entity</var> is a hash with these properties:</p>

<blockquote>
<table>
<tr><td><b><tt>PublicId</tt></b></td>
<td>The public identifier of the entity being referenced, or
<tt>undef</tt> if none was declared.</td></tr>
<tr><td><b><tt>SystemId</tt></b></td>
<td>The system identifier of the entity being referenced.</td></tr>
</table>
</blockquote></dd>
</dl></p>
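
<p>As an illustration of the catalogue use case described above, the
following is a minimal sketch of a resolver, not a definitive
implementation.  The class name <tt>MyResolver</tt>, the catalogue
contents, and the URLs are hypothetical.  The return convention (a hash
carrying a replacement <tt>SystemId</tt>, or <tt>undef</tt> to let the
parser resolve the entity itself) is an assumption that should be checked
against the parser in use.</p>

<pre>
package MyResolver;   # hypothetical class name
use strict;
use warnings;

# Hypothetical catalogue mapping public identifiers to local copies.
my %CATALOGUE = (
    '-//Example//DTD Sample 1.0//EN' => 'file:///usr/share/xml/sample.dtd',
);

sub new { return bless {}, shift }

sub resolve_entity {
    my ($self, $entity) = @_;
    my $public_id = $entity->{PublicId};

    # Known public identifier: redirect to the local copy.
    if (defined $public_id and exists $CATALOGUE{$public_id}) {
        return { PublicId => $public_id, SystemId => $CATALOGUE{$public_id} };
    }

    # Assumed convention: undef asks the parser to resolve the entity itself.
    return undef;
}

package main;

# Call the handler by hand, the way a parser would before opening the entity.
my $resolver = MyResolver->new;
my $source = $resolver->resolve_entity({
    PublicId => '-//Example//DTD Sample 1.0//EN',
    SystemId => 'http://example.com/sample.dtd',
});
print "Redirected to $source->{SystemId}\n";
</pre>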

<h3>Error Events</h3>

<p>If a SAX application needs to implement customized error handling,
it must implement this interface.  The parser will then report all
errors and warnings through this interface.</p>

<p>The parser shall use this interface to report errors instead of, or in
addition to, throwing an exception.  For errors and warnings, the recommended
approach is to let the application throw its own exceptions rather than
throwing them in the parser.  For fatal errors, however, it is not uncommon
for the parser to throw an exception after reporting the error, because a
fatal error makes it impossible to continue parsing.
</p>

<p>All error handlers receive a hash, <var>exception</var>, with the
properties defined in <a
href="sax-2.0.html#Exceptions">Exceptions</a>.</p>

<p>
<dl><dt><b><tt class='function'>warning</tt></b>(<var>exception</var>)</dt>
<dd>
Receive notification of a warning.

<p>SAX parsers will use this method to report conditions that are not
errors or fatal errors as defined by the XML 1.0 recommendation. The
default behaviour is to take no action.</p>

The SAX parser must continue to provide normal parsing events after
invoking this method: it should still be possible for the application
to process the document through to the end.</dd></dl></p>

<p>
<dl><dt><b><tt class='function'>error</tt></b>(<var>exception</var>)</dt>
<dd>
Receive notification of a recoverable error.

<p>This corresponds to the definition of "error" in section 1.2 of the
W3C XML 1.0 Recommendation.  For example, a validating parser would use
this callback to report the violation of a validity constraint.  The
default behaviour is to take no action.</p>

The SAX parser must continue to provide normal parsing events after
invoking this method: it should still be possible for the application
to process the document through to the end.  If the application cannot
do so, then the parser should report a fatal error even if the XML 1.0
recommendation does not require it to do so.</dd></dl></p>

<p>
<dl><dt><b><tt class='function'>fatal_error</tt></b>(<var>exception</var>)</dt>
<dd>
Receive notification of a non-recoverable error.

<p>This corresponds to the definition of "fatal error" in section 1.2
of the W3C XML 1.0 Recommendation.  For example, a parser would use
this callback to report the violation of a well-formedness
constraint.</p>

The application must assume that the document is unusable after the
parser has invoked this method, and should continue (if at all) only
for the sake of collecting additional error messages: in fact, SAX
parsers are free to stop reporting any other events once this method
has been invoked.</dd></dl></p>
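
<p>The three callbacks above might be implemented along the following
lines.  This is a sketch only: the class name
<tt>CollectingErrorHandler</tt> is hypothetical, and the
<tt>Message</tt> property is assumed to be among those defined in the
Exceptions section.  Warnings and recoverable errors are recorded so
that parsing can continue, while a fatal error is rethrown by the
application itself.</p>

<pre>
package CollectingErrorHandler;   # hypothetical class name
use strict;
use warnings;

sub new { return bless { Warnings => [], Errors => [] }, shift }

# Warnings and recoverable errors are recorded; parsing may continue.
sub warning {
    my ($self, $exception) = @_;
    push @{ $self->{Warnings} }, $exception;
    return;
}

sub error {
    my ($self, $exception) = @_;
    push @{ $self->{Errors} }, $exception;
    return;
}

# Fatal errors leave the document unusable, so the application throws.
sub fatal_error {
    my ($self, $exception) = @_;
    die "fatal: $exception->{Message}\n";   # Message is an assumed property
}

package main;

# Call the handler by hand, the way a parser would.
my $handler = CollectingErrorHandler->new;
$handler->warning({ Message => 'unusual encoding name' });
$handler->error({ Message => 'validity constraint violated' });

my $ok    = eval { $handler->fatal_error({ Message => 'not well-formed' }); 1 };
my $fatal = $@;
print "parsing aborted: $fatal" unless $ok;
printf "%d warning(s), %d error(s)\n",
    scalar @{ $handler->{Warnings} }, scalar @{ $handler->{Errors} };
</pre>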

<h3>Lexical Events</h3>

<p>This is an optional extension handler for SAX2 to provide lexical
information about an XML document, such as comments and CDATA section
boundaries; XML readers are not required to support this handler.</p>

<p>The events in the lexical handler apply to the entire document, not
just to the document element, and all lexical handler events must
appear between the content handler's <tt>start_document()</tt> and
<tt>end_document()</tt> events.</p>

<p>To set the LexicalHandler for an XML reader, set the Feature
"<tt>http://xml.org/sax/handlers/LexicalHandler</tt>" on the parser to
the object to receive lexical events.  If the reader does not support
lexical events, it will throw an <tt>XML::SAX::Exception::NotRecognized</tt> or
an <tt>XML::SAX::Exception::NotSupported</tt> when you attempt to register the
handler.</p>

<p>
<dl><dt><b><tt class='function'>start_dtd</tt></b>(<var>dtd</var>)</dt>
<dd>
Report the start of DTD declarations, if any.

<p>Any declarations are assumed to be in the internal subset unless
otherwise indicated by a <tt>start_entity()</tt> event.</p>

<p>Note that the <tt>start_dtd()</tt>/<tt>end_dtd()</tt> events will appear
within the <tt>start_document()</tt>/<tt>end_document()</tt> events from
Content Handler and before the first <tt>start_element()</tt> event.</p>

<p><var>dtd</var> is a hash with these properties:</p>

<blockquote>
<table>
<tr><td><b><tt>Name</tt></b></td>
<td>The document type name.</td></tr>
<tr><td><b><tt>PublicId</tt></b></td>
<td>The declared public identifier for the external DTD subset, or
<tt>undef</tt> if none was declared.</td></tr>
<tr><td><b><tt>SystemId</tt></b></td>
<td>The declared system identifier for the external DTD subset, or
<tt>undef</tt> if none was declared.</td></tr>
</table>
</blockquote></dd>
</dl></p>

<p>
<dl><dt><b><tt class='function'>end_dtd</tt></b>(<var>dtd</var>)</dt>
<dd>
Report the end of DTD declarations.

<p>No properties are defined for this event (<var>dtd</var> is
empty).</p></dd></dl></p>

<p>
<dl><dt><b><tt class='function'>start_entity</tt></b>(<var>entity</var>)</dt>
<dd>
Report the beginning of an entity in content.

<p><b>NOTE</b>: entity references in attribute values -- and the start
and end of the document entity -- are never reported.</p>

<p>The start and end of the external DTD subset are reported using the
pseudo-name "[dtd]". All other events must be properly nested within
start/end entity events.</p>

<p>Note that skipped entities will be reported through the
<tt>skipped_entity()</tt> event, which is part of the ContentHandler
interface.</p>

<p><var>entity</var> is a hash with these properties:</p>

<blockquote>
<table>
<tr><td><b><tt>Name</tt></b></td>
<td>The name of the entity. If it is a parameter entity, the
name will begin with '%'.</td></tr>
</table>
</blockquote></dd>
</dl></p>

<p>
<dl><dt><b><tt class='function'>end_entity</tt></b>(<var>entity</var>)</dt>
<dd>
Report the end of an entity.

<p><var>entity</var> is a hash with these properties:</p>

<blockquote>
<table>
<tr><td><b><tt>Name</tt></b></td>
<td>The name of the entity that is ending.</td></tr>
</table>
</blockquote></dd>
</dl></p>

<p>
<dl><dt><b><tt class='function'>start_cdata</tt></b>(<var>cdata</var>)</dt>
<dd>
Report the start of a CDATA section.

<p>The contents of the CDATA section will be reported through the
regular characters event.</p>

<p>No properties are defined for this event (<var>cdata</var> is
empty).</p></dd></dl></p>

<p>
<dl><dt><b><tt class='function'>end_cdata</tt></b>(<var>cdata</var>)</dt>
<dd>
Report the end of a CDATA section.

<p>No properties are defined for this event (<var>cdata</var> is
empty).</p></dd></dl></p>

<p>
<dl><dt><b><tt class='function'>comment</tt></b>(<var>comment</var>)</dt>
<dd>
Report an XML comment anywhere in the document.

<p>This callback will be used for comments inside or outside the
document element, including comments in the external DTD subset (if
read).</p>

<p><var>comment</var> is a hash with these properties:</p>

<blockquote>
<table>
<tr><td><b><tt>Data</tt></b></td>
<td>The comment characters.</td></tr>
</table>
</blockquote></dd>
</dl></p>
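
<p>The lexical events above can be consumed by a handler such as the
following sketch.  The class name <tt>LexicalLogger</tt> and its internal
fields are hypothetical, and the events are fired by hand in the order a
reader might deliver them rather than by a real parser.  The handler keeps
a stack of open entities so that each comment can be attributed to the
entity it appeared in.</p>

<pre>
package LexicalLogger;   # hypothetical class name
use strict;
use warnings;

sub new {
    return bless { InCDATA => 0, Entities => [], Comments => [] }, shift;
}

sub start_dtd {
    my ($self, $dtd) = @_;
    $self->{DoctypeName} = $dtd->{Name};
    return;
}

sub end_dtd { return }

# Entities nest, so keep a stack of the names currently open.
sub start_entity {
    my ($self, $entity) = @_;
    push @{ $self->{Entities} }, $entity->{Name};
    return;
}

sub end_entity {
    my ($self, $entity) = @_;
    pop @{ $self->{Entities} };
    return;
}

# Character data arriving between these two events came from a CDATA section.
sub start_cdata { my ($self) = @_; $self->{InCDATA} = 1; return }
sub end_cdata   { my ($self) = @_; $self->{InCDATA} = 0; return }

sub comment {
    my ($self, $comment) = @_;
    my $where = @{ $self->{Entities} } ? $self->{Entities}[-1] : 'document';
    push @{ $self->{Comments} }, join(': ', $where, $comment->{Data});
    return;
}

package main;

my $lexical = LexicalLogger->new;
$lexical->start_dtd({ Name => 'doc', PublicId => undef, SystemId => 'doc.dtd' });
$lexical->start_entity({ Name => '[dtd]' });
$lexical->comment({ Data => 'declared in the external subset' });
$lexical->end_entity({ Name => '[dtd]' });
$lexical->end_dtd({});
$lexical->comment({ Data => 'inside the document element' });
print "$_\n" for @{ $lexical->{Comments} };
</pre>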

<h2><a name="Filters">SAX Filters</a></h2>

<p>An XML filter is like an XML event generator, except that it
obtains its events from another XML event generator rather than a
primary source like an XML document or database.  Filters can modify a
stream of events as they pass on to the final application.</p>

<p>
<dl><dt><b><tt>Parent</tt></b></dt>
<dd>
The parent reader.

<p>This Feature allows the application to link the filter to a parent
event generator (which may be another filter).</p></dd></dl></p>

<p>
  See the XML::SAX::Base module for more on filters. It is meant to be
  used as a base class for filters and drivers, and makes them much
  easier to implement.
</p>
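
<p>To show the idea without relying on any module, here is a hand-rolled
sketch of a filter that upper-cases character data and passes everything
on to a downstream handler.  The class names <tt>UpcaseFilter</tt> and
<tt>Recorder</tt> are hypothetical, and the <tt>Name</tt> and
<tt>Data</tt> properties of the element and characters events are assumed
from the Content Handler description.  A real filter would normally
inherit from XML::SAX::Base instead of forwarding events manually.</p>

<pre>
package UpcaseFilter;   # hypothetical class name
use strict;
use warnings;

sub new {
    my ($class, %args) = @_;
    return bless { Handler => $args{Handler} }, $class;
}

# Forward an event only if the downstream handler implements it.
sub _forward {
    my ($self, $event, $data) = @_;
    my $handler = $self->{Handler};
    return unless $handler and $handler->can($event);
    return $handler->$event($data);
}

sub start_element { my ($self, $el) = @_; return $self->_forward(start_element => $el) }
sub end_element   { my ($self, $el) = @_; return $self->_forward(end_element   => $el) }

sub characters {
    my ($self, $chars) = @_;
    # Copy the hash rather than modifying the caller's data.
    my %copy = (%$chars, Data => uc($chars->{Data}));
    return $self->_forward(characters => \%copy);
}

package Recorder;   # hypothetical downstream handler; has no end_element

sub new { return bless { Events => [] }, shift }

sub start_element {
    my ($self, $el) = @_;
    push @{ $self->{Events} }, "start:" . $el->{Name};
    return;
}

sub characters {
    my ($self, $chars) = @_;
    push @{ $self->{Events} }, "chars:" . $chars->{Data};
    return;
}

package main;

my $recorder = Recorder->new;
my $filter   = UpcaseFilter->new(Handler => $recorder);

# Fire events by hand, the way a parent event generator would.
$filter->start_element({ Name => 'greeting' });
$filter->characters({ Data => 'hello' });
$filter->end_element({ Name => 'greeting' });   # silently skipped downstream

print join(', ', @{ $recorder->{Events} }), "\n";
</pre>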

<h2><a name="Java">Java Compatibility</a></h2>

The Perl SAX 2.0 binding differs from the Java binding in these ways:

<ul>

<li>Parameters can be passed to <tt>new()</tt>, passed to
<tt>parse()</tt>, or set directly in the object, instead of requiring
set/get calls (see below).</li>

<li>Allows a default <tt>Handler</tt> parameter to be used for all
handlers.</li>

<li>
  No base classes are enforced. Instead, parsers dynamically
  check the handlers for what methods they support. Note however that
  using XML::SAX::Base as your base class for Drivers and Filters will
  make your code a lot simpler, less error prone, and probably much more
  correct with regard to this spec. Only reimplement that functionality
  if you really need to.
</li>

<li>The Attribute, InputSource, and SAXException (XML::SAX::Exception)
classes are only described as hashes (see below).</li>

<li>Handlers are passed a hash (Node) containing properties as an
argument instead of positional arguments.</li>

<li><tt>parse()</tt> methods return the value returned by calling the
<tt>end_document()</tt> handler.</li>

<li>
  Method names have been converted to lower-case with underscores.
  Parameters are all mixed case with initial upper-case.
</li>
</ul>
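
<p>Several of these differences can be seen together in the following
sketch.  <tt>ToyDriver</tt> is a hypothetical stand-in for a real parser,
and its <tt>Names</tt> option is invented purely for the demonstration.
It accepts parameters in both <tt>new()</tt> and <tt>parse()</tt>, checks
dynamically which methods the handler supports, passes a hash to each
handler, and returns whatever <tt>end_document()</tt> returns.</p>

<pre>
package CountingHandler;   # hypothetical handler
use strict;
use warnings;

sub new { return bless { Count => 0 }, shift }

# Handlers receive a single hash rather than positional arguments.
sub start_element {
    my ($self, $element) = @_;
    $self->{Count}++;
    return;
}

sub end_document {
    my ($self, $document) = @_;
    return $self->{Count};
}

package ToyDriver;   # hypothetical stand-in for a real parser

sub new {
    my ($class, %args) = @_;
    my $self = { %args };
    return bless $self, $class;
}

sub parse {
    my ($self, %args) = @_;
    # In this sketch, options given to parse() override those given to new().
    my %opt     = (%$self, %args);
    my $handler = $opt{Handler};

    # No base class is required: probe the handler for each method.
    $handler->start_document({}) if $handler->can('start_document');
    for my $name (@{ $opt{Names} || [] }) {
        $handler->start_element({ Name => $name }) if $handler->can('start_element');
        $handler->end_element({ Name => $name })   if $handler->can('end_element');
    }
    return $handler->end_document({});
}

package main;

my $driver = ToyDriver->new(Handler => CountingHandler->new);
my $count  = $driver->parse(Names => [qw(a b c)]);
print "elements seen: $count\n";
</pre>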

<p>
  If compatibility is a problem for you, consider writing a Filter that
  converts from this style to the one you want. It is likely that such
  a Filter will be available from CPAN in the not too distant future.
</p>

<!--
<p>[FIXME: need to list package/class name equivalents for all
hashes.]</p>
-->

</body>
</html>