/usr/share/perl5/XML/Compile/Schema.pod is in libxml-compile-perl 1.47-1.
=encoding utf8
=head1 NAME
XML::Compile::Schema - Compile a schema into CODE
=head1 INHERITANCE
XML::Compile::Schema
is a XML::Compile
XML::Compile::Schema is extended by
XML::Compile::Cache
=head1 SYNOPSIS
# compile tree yourself
my $parser = XML::LibXML->new;
my $tree = $parser->parse...(...);
my $schema = XML::Compile::Schema->new($tree);
# get schema from string
my $schema = XML::Compile::Schema->new($xml_string);
# get schema from file (most used)
my $schema = XML::Compile::Schema->new($filename);
my $schema = XML::Compile::Schema->new([glob "*.xsd"]);
# the "::Cache" extension has more power
my $schema = XML::Compile::Cache->new(\@xsdfiles);
# adding more schemas, from parsed XML
$schema->addSchemas($tree);
# adding more schemas from files
# three times the same: well-known url, filename in schemadir, url
# Just as example: usually not needed.
$schema->importDefinitions('http://www.w3.org/2001/XMLSchema');
$schema->importDefinitions('2001-XMLSchema.xsd');
$schema->importDefinitions(SCHEMA2001); # from ::Util
# alternatively
my @specs = ('one.xsd', 'two.xsd', $schema_as_string);
my $schema = XML::Compile::Schema->new(\@specs); # ARRAY!
# see what types are defined
$schema->printIndex;
# create and use a reader
use XML::Compile::Util qw/pack_type/;
my $elem = pack_type 'my-namespace', 'my-local-name';
# $elem eq "{my-namespace}my-local-name"
my $read = $schema->compile(READER => $elem);
my $data = $read->($xmlnode);
my $data = $read->("filename.xml");
# when you do not know the element type beforehand
use XML::Compile::Util qw/type_of_node/;
my $elem = type_of_node $xml->documentElement;
my $reader = $reader_cache{$elem} # either exists
||= $schema->compile(READER => $elem); # or create
my $data = $reader->($xmlmsg);
# create and use a writer
my $doc = XML::LibXML::Document->new('1.0', 'UTF-8');
my $write = $schema->compile(WRITER => '{myns}mytype');
my $xml = $write->($doc, $hash);
$doc->setDocumentElement($xml);
# show result
print $doc->toString(1);
# to create the type nicely
use XML::Compile::Util qw/pack_type/;
my $type = pack_type 'myns', 'mytype';
print $type; # shows {myns}mytype
# using a compiled routines cache
use XML::Compile::Cache; # separate distribution
my $schema = XML::Compile::Cache->new(...);
# Show which data-structure is expected
print $schema->template(PERL => $type);
# Error handling tricks with Log::Report
use Log::Report mode => 'DEBUG'; # enable debugging
dispatcher SYSLOG => 'syslog'; # errors to syslog as well
try { $reader->($data) }; # catch errors in $@
=head1 DESCRIPTION
This module collects knowledge about one or more schemas. The most
important method provided is L<compile()|XML::Compile::Schema/"Compilers">, which can create XML file
readers and writers based on the schema information and some selected
element or attribute type.
Various implementations use the translator, and more can be added
later:
=over 4
=item C<< $schema->compile('READER'...) >> translates XML to HASH
The XML reader produces a HASH from an XML::LibXML::Node tree or an
XML string. Those represent the input data. The values are checked:
an error is produced when a value or the data-structure does not
conform to the specs.
The CODE reference which is returned can be called with anything
accepted by L<dataToXML()|XML::Compile/"Compilers">.
Example: create an XML reader
my $msgin = $rules->compile(READER => '{myns}mytype');
# or ... = $rules->compile(READER => pack_type('myns', 'mytype'));
my $xml = $parser->parse_file("some-xml.xml");
my $hash = $msgin->($xml);
or
my $hash = $msgin->('some-xml.xml');
my $hash = $msgin->($xml_string);
my $hash = $msgin->($xml_node);
=item C<< $schema->compile('WRITER', ...) >> translates HASH to XML
The writer produces schema-compliant XML, based on a Perl HASH. To get
the data encoding right, you are required to pass a document object
in which the XML nodes may get a place later.
Example: create an XML writer
my $doc = XML::LibXML::Document->new('1.0', 'UTF-8');
my $write = $schema->compile(WRITER => '{myns}mytype');
my $xml = $write->($doc, $hash);
print $xml->toString;
or, addressing the type by its C<id>:
my $write = $schema->compile(WRITER => 'myns#myid');
=item C<< $schema->template('XML', ...) >> creates an XML example
Based on the schema, this produces an XML message as example. Schemas
are usually so complex that people lose overview. This example may
put you back on track, and can be used as a starting point for creating
the XML version of the message.
=item C<< $schema->template('PERL', ...) >> creates a Perl example
Based on the schema, this produces a Perl HASH structure (a bit
like the output by Data::Dumper), which can be used as template
for creating messages. The output contains documentation, and is
usually much clearer than the schema itself.
=item C<< $schema->template('TREE', ...) >> creates a parse tree
To be able to produce Perl-text and XML examples, the templater
generates an abstract tree from the schema. That tree is returned
here. Be warned that the structure is not fixed over releases:
add regression tests for this to your project.
=back
Be warned that the B<schema is not validated>; you can develop schemas
which do work well with this module, but are not valid according to W3C.
In many cases, however, the translator will refuse to accept mistakes:
mainly because it cannot produce valid code.
Extends L<"DESCRIPTION" in XML::Compile|XML::Compile/"DESCRIPTION">.
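The reader and writer described above can be combined into a round trip. The following is a minimal sketch, not part of the original manual: the namespace C<http://example.com/ns>, the element C<person> and its fields C<name> and C<age> are invented for illustration, and the schema is passed as a string.

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::Compile::Schema;
use XML::Compile::Util qw/pack_type/;

# Hypothetical namespace and schema, only for this example.
my $ns  = 'http://example.com/ns';
my $xsd = <<"__XSD";
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        targetNamespace="$ns"
        elementFormDefault="qualified">
  <element name="person">
    <complexType>
      <sequence>
        <element name="name" type="string"/>
        <element name="age"  type="int"/>
      </sequence>
    </complexType>
  </element>
</schema>
__XSD

my $schema = XML::Compile::Schema->new($xsd);
my $type   = pack_type $ns, 'person';   # "{http://example.com/ns}person"

# HASH to XML: the writer needs a document to create nodes in.
my $doc   = XML::LibXML::Document->new('1.0', 'UTF-8');
my $write = $schema->compile(WRITER => $type);
my $node  = $write->($doc, { name => 'Ann', age => 42 });
$doc->setDocumentElement($node);

# XML to HASH: the reader accepts an XML string, among other things.
my $read = $schema->compile(READER => $type);
my $data = $read->($doc->toString);

print "$_ = $data->{$_}\n" for sort keys %$data;
```

Both compiled routines are plain CODE references, so they can be kept and reused for many messages of the same type.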
=head1 METHODS
Extends L<"METHODS" in XML::Compile|XML::Compile/"METHODS">.
=head2 Constructors
Extends L<"Constructors" in XML::Compile|XML::Compile/"Constructors">.
=over 4
=item XML::Compile::Schema-E<gt>B<new>( [$xmldata], %options )
Details about many name-spaces can be organized with only a single
schema object (actually, the data is administered in an internal
L<XML::Compile::Schema::NameSpaces|XML::Compile::Schema::NameSpaces> object).
The initial information is extracted from the $xmldata source. The $xmldata
can be anything accepted by L<importDefinitions()|XML::Compile::Schema/"Administration">, which
is everything accepted by L<dataToXML()|XML::Compile/"Compilers"> or an ARRAY of those things.
You may also add any OPTION accepted by L<addSchemas()|XML::Compile::Schema/"Accessors"> to guide the
understanding of the schema. When no $xmldata is provided, you can add
it later with L<importDefinitions()|XML::Compile::Schema/"Administration">.
You can specify the hooks before you define the schemas the hooks
work on: all schema information and all hooks are only used when
the readers and writers get compiled.
 -Option            --Defined in     --Default
  block_namespace                      []
  hook                                 undef
  hooks                                []
  ignore_unused_tags                   <false>
  key_rewrite                          []
  parser_options      XML::Compile     <many>
  schema_dirs         XML::Compile     undef
  typemap                              {}
=over 2
=item block_namespace => NAMESPACE|TYPE|HASH|CODE|ARRAY
See L<blockNamespace()|XML::Compile::Schema/"Accessors">
=item hook => HOOK|ARRAY
See L<addHook()|XML::Compile::Schema/"Accessors">. Adds one HOOK (HASH) or more at once.
=item hooks => ARRAY
Add one or more hooks. See L<addHooks()|XML::Compile::Schema/"Accessors">.
=item ignore_unused_tags => BOOLEAN|REGEXP
(WRITER) Usually, a C<mistake> warning is produced when a user provides
a data structure which contains more data than is needed for the XML
message which is created; this will show structural problems. However,
in some cases, you may want to play tricks with the data-structure and
therefore disable this precaution.
With a REGEXP, you can have more control. Only keys which do match
the expression will be ignored silently. Other keys (usually typos
and other mistakes) will get reported. See L</Typemaps>
=item key_rewrite => HASH|CODE|ARRAY
Translate XML element local-names into different Perl keys.
See L</Key rewrite>.
=item parser_options => HASH|ARRAY
=item schema_dirs => DIRECTORY|ARRAY-OF-DIRECTORIES
=item typemap => HASH
HASH of Schema type to Perl object or Perl class. See L</Typemaps>, the
serialization of objects.
=back
=back
=head2 Accessors
Extends L<"Accessors" in XML::Compile|XML::Compile/"Accessors">.
=over 4
=item $obj-E<gt>B<addHook>($hook|LIST|undef)
A $hook is specified as HASH or a LIST of PAIRS. When C<undef>, this call
is ignored. See L<addHooks()|XML::Compile::Schema/"Accessors"> and L</Schema hooks> below.
=item $obj-E<gt>B<addHooks>( $hook, [$hook, ...] )
Add multiple hooks at once. These must all be HASHes. See L</Schema hooks>
and L<addHook()|XML::Compile::Schema/"Accessors">. C<undef> values are ignored.
=item $obj-E<gt>B<addKeyRewrite>($predef|CODE|HASH, ...)
Add new rewrite rules to the existing list (initially provided with
L<new(key_rewrite)|XML::Compile::Schema/"Constructors">). The whole list of rewrite rules is returned.
C<PREFIXED> rules will be applied first. Special care is taken that the
prefix will not be called twice. The last added set of rewrite rules
will be applied first. See L</Key rewrite>.
=item $obj-E<gt>B<addSchemaDirs>(@directories|$filename)
=item XML::Compile::Schema-E<gt>B<addSchemaDirs>(@directories|$filename)
Inherited, see L<XML::Compile/"Accessors">
=item $obj-E<gt>B<addSchemas>($xml, %options)
Collect all the schemas defined in the $xml data. The $xml parameter
must be an XML::LibXML node; it is therefore advised to use
L<importDefinitions()|XML::Compile::Schema/"Administration">, which has a much more flexible way to
specify the data.
When the object extends L<XML::Compile::Cache|XML::Compile::Cache>, the prefixes declared
on the schema element will be taken as default prefixes.
 -Option                --Default
  attribute_form_default  <undef>
  element_form_default    <undef>
  filename                undef
  source                  undef
  target_namespace        <undef>
=over 2
=item attribute_form_default => 'qualified'|'unqualified'
=item element_form_default => 'qualified'|'unqualified'
Overrule the default as found in the schema. Many old schemas (like
WSDL11 and SOAP11) do not specify the correct default element form in
the schema but only in the text.
=item filename => FILENAME
Explicitly state from which file the data is coming.
=item source => STRING
An indication where this schema data was found. If you use L<dataToXML()|XML::Compile/"Compilers">
in LIST context, you get such an indication.
=item target_namespace => NAMESPACE
Overrule (or set) the target namespace in the schema.
=back
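As a sketch of how C<element_form_default> may be used (the schema, namespace and element names are invented for this example): the schema below does not declare C<elementFormDefault>, so its local elements would normally be unqualified. Passing the option through C<importDefinitions()>, which hands its options on to C<addSchemas()>, makes the reader expect qualified child elements instead.

```perl
use strict;
use warnings;
use XML::Compile::Schema;
use XML::Compile::Util qw/pack_type/;

# Hypothetical schema which "forgot" elementFormDefault="qualified".
my $ns  = 'http://example.com/ns';
my $xsd = <<"__XSD";
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        targetNamespace="$ns">
  <element name="person">
    <complexType>
      <sequence>
        <element name="name" type="string"/>
      </sequence>
    </complexType>
  </element>
</schema>
__XSD

my $schema = XML::Compile::Schema->new;
$schema->importDefinitions($xsd, element_form_default => 'qualified');

my $read = $schema->compile(READER => pack_type($ns, 'person'));

# <name> inherits the default namespace here, so it is qualified.
my $data = $read->(qq{<person xmlns="$ns"><name>Ann</name></person>});
print $data->{name}, "\n";
```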
=item $obj-E<gt>B<addTypemap>(PAIR)
Synonym for L<addTypemaps()|XML::Compile::Schema/"Accessors">.
=item $obj-E<gt>B<addTypemaps>(PAIRS)
Add new XML-Perl type relations. See L</Typemaps>.
=item $obj-E<gt>B<blockNamespace>($ns|$type|HASH|CODE|ARRAY)
Block all references to a $ns or full $type, as if they do not appear
in the schema. Especially useful if the schema includes references to
old (deprecated) versions of itself which are not being used. It can
also be used to block inclusion of huge structures which are not used,
for increased compile performance, or to avoid buggy constructs.
These values can also be passed with L<new(block_namespace)|XML::Compile::Schema/"Constructors"> and
L<compile(block_namespace)|XML::Compile::Schema/"Compilers">.
=item $obj-E<gt>B<hooks>( [<'READER'|'WRITER'>] )
Returns the LIST of defined hooks (as HASHes).
[1.36] When an action parameter is provided, it will only return a list
with hooks added with that action value or no action at all.
=item $obj-E<gt>B<useSchema>( $schema, [$schema, ...] )
Pass a L<XML::Compile::Schema|XML::Compile::Schema> object, or extensions like
L<XML::Compile::Cache|XML::Compile::Cache>, to be used as definitions as well. First,
elements are looked-up in the current schema definition object. If not
found, the other provided $schema objects are checked in the order in
which they were added.
Searches for definitions do not recurse into schemas which are used
by the used schema.
example: use other Schema
my $wsdl = XML::Compile::WSDL->new($wsdl);
my $geo = Geo::GML->new(version => '3.2.1');
# both $wsdl and $geo extend XML::Compile::Schema
$wsdl->useSchema($geo);
=back
=head2 Compilers
Extends L<"Compilers" in XML::Compile|XML::Compile/"Compilers">.
=over 4
=item $obj-E<gt>B<compile>( <'READER'|'WRITER'>, $type, %options )
Translate the specified ELEMENT (found in one of the read schemas) into
a CODE reference which is able to translate between XML-text and a HASH.
When the $type is C<undef>, an empty LIST is returned.
The indicated $type is the starting-point for processing in the
data-structure, a toplevel element or attribute name. The name must
be specified in C<{url}name> format, where the url is the name-space.
An alternative is the C<url#id> which refers to an element or type with
the specific C<id> attribute value.
When a READER is created, a CODE reference is returned which needs
to be called with XML, as accepted by L<XML::Compile::dataToXML()|XML::Compile/"Compilers">.
Returned is a nested HASH structure which contains the data from
the XML. The transformation rules are explained below.
When a WRITER is created, a CODE reference is returned which needs
to be called with an XML::LibXML::Document object and a HASH, and
returns a XML::LibXML::Node.
Many %options below are B<explained in more detail> in the manual-page
L<XML::Compile::Translate|XML::Compile::Translate>, which implements the compilation.
 -Option                        --Default
  abstract_types                  'ERROR'
  any_attribute                   undef
  any_element                     undef
  any_type                        <returns string or node>
  attributes_qualified            <undef>
  block_namespace                 []
  check_occurs                    <true>
  check_values                    <true>
  default_values                  <depends on backend>
  elements_qualified              <undef>
  hook                            undef
  hooks                           undef
  ignore_facets                   <false>
  ignore_unused_tags              <false>
  include_namespaces              <true>
  interpret_nillable_as_optional  <false>
  key_rewrite                     []
  mixed_elements                  'ATTRIBUTES'
  namespace_reset                 <false>
  output_namespaces               undef
  path                            <expanded name of type>
  permit_href                     <false>
  prefixes                        {}
  sloppy_floats                   <false>
  sloppy_integers                 <false>
  typemap                         {}
  use_default_namespace           <false>
  validation                      <true>
  xsi_type                        {}
=over 2
=item abstract_types => 'ERROR'|'ACCEPT'
How to handle the use of abstract types. Of course, they should not be
used, but sometimes they accidentally are. When set to C<ERROR>, an error
will be produced whenever an abstract type is encountered.
C<ACCEPT> will ignore the fact that the types are abstract, and treat
them as non-abstract types.
=item any_attribute => CODE|'TAKE_ALL'|'SKIP_ALL'
[0.89] In general, C<anyAttribute> schema components cannot be handled
automatically. If you need to create or process anyAttribute
information, then read about wildcards in the DETAILS chapter of the
manual-page for the specific back-end.
[pre-0.89] this option was named C<anyAttribute>, which will still work.
=item any_element => CODE|'TAKE_ALL'|'SKIP_ALL'
[0.89] In general, C<any> schema components cannot be handled automatically.
If you need to create or process C<any> information, then read about
wildcards in the DETAILS chapter of the manual-page for the specific
back-end.
[pre-0.89] this option was named C<anyElement>, which will still work.
=item any_type => CODE
[1.07] how to handle "anyType" type elements. Depends on the backend.
=item attributes_qualified => C<ALL>|C<NONE>|BOOLEAN
[1.44] Like option C<elements_qualified>, but then for attributes.
=item block_namespace => NAMESPACE|TYPE|HASH|CODE|ARRAY
See L<blockNamespace()|XML::Compile::Schema/"Accessors">.
=item check_occurs => BOOLEAN
Whether code will be produced to do bounds checking on elements and blocks
which may appear more than once. When the schema says that maxOccurs is 1,
then that element becomes optional. When the schema says that maxOccurs
is larger than 1, then the output is still always an ARRAY, but now of
unrestricted length.
=item check_values => BOOLEAN
Whether code will be produced to check that the XML fields contain
the expected data format.
Turning this off will improve the processing speed significantly, but is
(of course) much less safe. Do not set it off when you expect data from
external sources: validation is a crucial requirement for XML.
=item default_values => 'MINIMAL'|'IGNORE'|'EXTEND'
How to treat default values as provided by the schema.
With C<IGNORE> (the writer default), you will see exactly what is
specified in the XML or HASH. C<EXTEND> (the reader default) will
show the default and fixed values in the result. C<MINIMAL> removes
all fields which are the same as the default setting, which simplifies
the result.
See L</Default Values>.
=item elements_qualified => C<TOP>|C<ALL>|C<NONE>|BOOLEAN
When defined, this will overrule the namespace use on elements in
all schemas. When C<ALL> or a true value is given, then all elements
will be used qualified. When C<NONE> or a false value is given, the
XML will not produce or process prefixes on any element.
All top-level elements (and attributes) will be used in a name-space
qualified way, if they have a targetNamespace. Some applications require
some global element with qualification, so refuse global elements which
have no qualification. Using the C<TOP> setting, the compiler checks
that the targetNamespace exists.
The C<form> attributes in the schema will be respected; they overrule the
effects of this option. Use hooks when you need to fix name-space use
in more subtle ways.
With C<element_form_default>, you can correct the name-space behavior
of whole schemas.
Change in [1.44]: C<TOP> before enforced a name-space on the top-level.
There should always be a name-space on the top element. This was changed
so that C<TOP> checks that the globals have a targetNamespace.
=item hook => HOOK|ARRAY-OF-HOOKS
Define one or more processing hooks. See L</Schema hooks> below.
These hooks are only active for this compiled entity, where L<addHook()|XML::Compile::Schema/"Accessors">
and L<addHooks()|XML::Compile::Schema/"Accessors"> can be used to define hooks which are used for all
results of L<compile()|XML::Compile::Schema/"Compilers">. The hooks specified with the C<hook> or C<hooks>
option are run before the global definitions.
=item hooks => HOOK|ARRAY-OF-HOOKS
Alternative for option C<hook>.
=item ignore_facets => BOOLEAN
Facets influence the formatting and range of values. This does
not come cheap, so can be turned off. It affects the restrictions
set for a simpleType. The processing speed will improve, but validation
is a crucial requirement for XML: please do not turn this off when the
data comes from external sources.
=item ignore_unused_tags => BOOLEAN|REGEXP
Overrules what is set with L<new(ignore_unused_tags)|XML::Compile::Schema/"Constructors">.
=item include_namespaces => BOOLEAN|CODE
Indicates whether the WRITER should include the prefix to namespace
translation on the top-level element of the returned tree. If not,
you may continue with the same name-space table to combine various
XML components into one, and add the namespaces later. No namespace
definition can be added when the production rule produces an attribute.
When a CODE reference is passed, it will be called for each namespace
to decide whether it should be included or not. When it returns true,
the namespace will be added. The CODE is called with a namespace, its
prefix, and the number of times it was used for that schema element
translator.
=item interpret_nillable_as_optional => BOOLEAN
Found in the schema wild-life: people who think that nillable means
optional. Not too hard to fix. For the WRITER, you still have to state
NIL explicitly, but the elements are not constructed. The READER will
output NIL when the nillable elements are missing.
=item key_rewrite => HASH|CODE|ARRAY
Add key rewrite rules to the front of the list of rules, as set by
L<new(key_rewrite)|XML::Compile::Schema/"Constructors"> and L<addKeyRewrite()|XML::Compile::Schema/"Accessors">. See L</Key rewrite>.
=item mixed_elements => CODE|PREDEFINED
What to do when mixed schema elements are to be processed. Read
more in the L</DETAILS> section below.
=item namespace_reset => BOOLEAN
Use the same prefixes in C<prefixes> as with some other compiled
piece, but reset the counts to zero first.
=item output_namespaces => HASH|ARRAY-of-PAIRS
[Pre-0.87] name for the C<prefixes> option. Deprecated.
=item path => STRING
Prepended to each error report, to indicate the location of the
error in the XML-Schema tree.
=item permit_href => BOOLEAN
When parsing SOAP-RPC encoded messages, the elements may have a C<href>
attribute pointing to an object with C<id>. The READER will return the
unparsed, unresolved node when the attribute is detected, and the SOAP-RPC
decoder will have to discover and resolve it.
=item prefixes => HASH|ARRAY-of-PAIRS
Can be used to pre-define prefixes for namespaces (for 'WRITER' or
key rewrite) for instance to reserve common abbreviations like C<soap>
for external use. Each entry in the hash has as key the namespace uri.
The value is a hash which contains C<uri>, C<prefix>, and C<used> fields.
Pass a reference to a private hash to catch this index. An ARRAY with
prefix, uri PAIRS is simpler.
prefixes => [ mine => $myns, two => $twons ]
prefixes => { $myns => 'mine', $twons => 'two' }
# the previous is short for:
prefixes => { $myns => { uri => $myns, prefix => 'mine', used => 0 }
, $twons => { uri => $twons, prefix => 'two', ...} };
=item sloppy_floats => BOOLEAN
The float types of XML are all quite big, and support NaN, INF, and -INF.
Perl's normal floats do not, and therefore Math::BigFloat is used. This,
however, is slow. When true, you will crash on any value which is not
understood by Perl's default float... but run much faster. See also
C<sloppy_integers>.
=item sloppy_integers => BOOLEAN
The XML C<integer> data-types must support at least 18 digits,
which is larger than Perl's 32 bit internal integers. Therefore, the
implementation will use Math::BigInt objects to handle them. However,
often a simple C<int> type would have sufficed, but the XML designer
was lazy. A long is much faster to handle. Set this flag to use C<int>
as fast (but imprecise) replacements.
Be aware that C<Math::BigInt> and C<Math::BigFloat> objects mimic the
behavior of Perl's ints and floats nearly, but not fully, transparently.
See their respective manual-pages. Especially when you wish
for some performance, you should optimize access to these objects to
avoid expensive copying, which is exactly the spot where the differences
are.
You can also improve the speed of Math::BigInt by installing
Math::BigInt::GMP. Add C<< use Math::BigInt try => 'GMP'; >> to the
top of your main script to get more performance.
=item typemap => HASH
Add this typemap to the relations defined by L<new(typemap)|XML::Compile::Schema/"Constructors"> or
L<addTypemaps()|XML::Compile::Schema/"Accessors">.
=item use_default_namespace => BOOLEAN
[0.91] When mixing qualified and unqualified namespaces, then the use of
a default namespace can be quite confusing: a name-space without prefix.
Therefore, by default, all qualified elements will have an explicit prefix.
=item validation => BOOLEAN
XML messages must be validated, to lower the chance of abuse. However,
of course, it costs performance which is only partially compensated by
fewer checks in your code. This flag overrules the C<check_values>,
C<check_occurs>, and C<ignore_facets>.
=item xsi_type => HASH
See L</Handling xsi:type>. The HASH maps types as mentioned in the schema,
to extensions of those types which are addressed via the horrible C<xsi:type>
construct. When you specify C<AUTO> as value, the translator tries to
auto-detect. This may be slow and may produce incomplete results.
=back
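A minimal sketch of passing options to C<compile()>, here the C<prefixes> option for a WRITER. The schema, the namespace C<http://example.com/ns> and the prefix C<me> are assumptions made for this example only. With the default settings, qualified elements get an explicit prefix, so the chosen prefix should show up on the produced element.

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::Compile::Schema;
use XML::Compile::Util qw/pack_type/;

# Hypothetical namespace and schema, only for this example.
my $ns  = 'http://example.com/ns';
my $xsd = <<"__XSD";
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        targetNamespace="$ns"
        elementFormDefault="qualified">
  <element name="person">
    <complexType>
      <sequence>
        <element name="name" type="string"/>
      </sequence>
    </complexType>
  </element>
</schema>
__XSD

my $schema = XML::Compile::Schema->new($xsd);

# ARRAY of prefix, uri PAIRS: the simple form of the option.
my $write  = $schema->compile
  ( WRITER   => pack_type($ns, 'person')
  , prefixes => [ me => $ns ]
  );

my $doc  = XML::LibXML::Document->new('1.0', 'UTF-8');
my $node = $write->($doc, { name => 'Ann' });
my $text = $node->toString(1);
print $text, "\n";
```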
=item $obj-E<gt>B<dataToXML>($node|REF-XML|XML-STRING|$filename|$fh|$known)
=item XML::Compile::Schema-E<gt>B<dataToXML>( $node|REF-XML|XML-STRING|$filename|$fh|$known )
Inherited, see L<XML::Compile/"Compilers">
=item $obj-E<gt>B<initParser>(%options)
=item XML::Compile::Schema-E<gt>B<initParser>(%options)
Inherited, see L<XML::Compile/"Compilers">
=item $obj-E<gt>B<template>( <'XML'|'PERL'|'TREE'>, $element, %options )
Schemas can be horribly complex and unreadable. Therefore, this template
method can be called to create an example which demonstrates how data
of the specified $element, shown as XML or Perl, is organized in practice.
The 'TREE' template returns the intermediate parse tree, which gets
formatted into the XML or Perl example. This is not a very stable
interface: it may change without much notice.
Some %options are explained in L<XML::Compile::Translate|XML::Compile::Translate>. There are
some extra %options defined for the final output process.
The templates produced are B<not always correct>. Please contribute
improvements: read and understand the comments in the text.
 -Option              --Default
  abstract_types        'ERROR'
  attributes_qualified  <undef>
  elements_qualified    <undef>
  include_namespaces    <true>
  indent                " "
  key_rewrite           []
  show_comments         ALL
  skip_header           <false>
=over 2
=item abstract_types => 'ERROR'|'ACCEPT'
By default, do not show abstract types in the output.
=item attributes_qualified => BOOLEAN
=item elements_qualified => 'ALL'|'TOP'|'NONE'|BOOLEAN
=item include_namespaces => BOOLEAN|CODE
=item indent => STRING
The leading indentation string per nesting. Must start with at least one
blank.
=item key_rewrite => HASH|CODE|ARRAY
=item show_comments => STRING|'ALL'|'NONE'
A comma separated list of tokens, which explain what kind of comments need
to be included in the output. The available tokens are: C<struct>, C<type>,
C<occur>, C<facets>. A value of C<ALL> will select all available comments.
The C<NONE> or empty string will exclude all comments.
=item skip_header => BOOLEAN
Skip the comment header from the output.
=back
=back
=head2 Administration
Extends L<"Administration" in XML::Compile|XML::Compile/"Administration">.
=over 4
=item $obj-E<gt>B<doesExtend>($exttype, $basetype)
Returns true when the $exttype extends the $basetype. See
L<XML::Compile::Schema::NameSpaces::doesExtend()|XML::Compile::Schema::NameSpaces/"Accessors">
=item $obj-E<gt>B<elements>()
List all elements, defined by all schemas sorted alphabetically.
=item $obj-E<gt>B<findSchemaFile>($filename)
=item XML::Compile::Schema-E<gt>B<findSchemaFile>($filename)
Inherited, see L<XML::Compile/"Administration">
=item $obj-E<gt>B<importDefinitions>($xmldata, %options)
Import (include) the schema information included in the $xmldata. The
$xmldata must be acceptable for L<dataToXML()|XML::Compile/"Compilers">. The resulting node
and all the %options are passed to L<addSchemas()|XML::Compile::Schema/"Accessors">. The schema node does
not need to be the top element: any schema node found in the data
will be decoded.
Returned is a list of L<XML::Compile::Schema::Instance|XML::Compile::Schema::Instance> objects,
for each processed schema component.
If your program imports the same string or file definitions multiple
times, it will re-use the schema information from the first import.
This removal of duplicates will not work for open files or pre-parsed
XML structures.
As an extension to the handling L<dataToXML()|XML::Compile/"Compilers"> provides, you can specify an
ARRAY of things which are acceptable to C<dataToXML>. This way, you can
specify multiple resources at once, each of which will be processed with
the same %options.
 -Option  --Default
  details   <from XMLDATA>
=over 2
=item details => HASH
Overrule the details information about the source of the data.
=back
example: use of importDefinitions
my $schema = XML::Compile::Schema->new;
$schema->importDefinitions('my-spec.xsd');
my $other = "<schema>...</schema>"; # use 'HERE' documents!
my @specs = ('my-spec.xsd', 'types.xsd', $other);
$schema->importDefinitions(\@specs, @options);
=item $obj-E<gt>B<knownNamespace>($ns|PAIRS)
=item XML::Compile::Schema-E<gt>B<knownNamespace>($ns|PAIRS)
Inherited, see L<XML::Compile/"Administration">
=item $obj-E<gt>B<namespaces>()
Returns the L<XML::Compile::Schema::NameSpaces|XML::Compile::Schema::NameSpaces> object which is used
to collect schemas.
=item $obj-E<gt>B<printIndex>( [$fh], %options )
Print all the elements which are defined in the schemas to the $fh
(by default the selected handle). %options are passed to
L<XML::Compile::Schema::NameSpaces::printIndex()|XML::Compile::Schema::NameSpaces/"Accessors"> and
L<XML::Compile::Schema::Instance::printIndex()|XML::Compile::Schema::Instance/"Index">.
=item $obj-E<gt>B<types>()
List all types, defined by all schemas sorted alphabetically.
=item $obj-E<gt>B<walkTree>($node, CODE)
Inherited, see L<XML::Compile/"Administration">
=back
=head1 DETAILS
Extends L<"DETAILS" in XML::Compile|XML::Compile/"DETAILS">.
=head2 Distribution collection overview
Extends L<"Distribution collection overview" in XML::Compile|XML::Compile/"Distribution collection overview">.
=head2 Comparison
Extends L<"Comparison" in XML::Compile|XML::Compile/"Comparison">.
=head2 Collecting definitions
When starting an application, you will need to read the schema
definitions. This is done by instantiating an object via
L<XML::Compile::Schema::new()|XML::Compile::Schema/"Constructors"> or L<XML::Compile::WSDL11::new()|XML::Compile::WSDL11/"Constructors">.
The WSDL11 object has a schema object internally.
Schemas may contain C<import> and C<include> statements, which
specify other resources for definitions. According to the XML design
team, those files should be retrieved automatically via an internet
connection from the C<schemaLocation>. However, this is a bad concept; in
XML::Compile modules you will have to explicitly provide filenames on local
disk using L<importDefinitions()|XML::Compile::Schema/"Administration"> or L<XML::Compile::WSDL11::addWSDL()|XML::Compile::WSDL11/"Extension">.
There are various reasons why I, the author of this module, think the
dynamic automatic internet imports are a bad idea. First: you do not
always have a working internet connection (travelling with a laptop in
a train). Your implementation should work the same way under all
environmental circumstances! Besides, I do not trust remote files on
my system, without inspecting them. Most important: I want to run my
regression tests before using a new version of the definitions, so I do
not want to have a remote server change the agreements without my
knowledge.
So: before you start, you will need to scan (recursively) the initial
schema or wsdl file for C<import> and C<include> statements, and
collect all these files from their C<schemaLocation> into files on
local disk. In your program, call L<importDefinitions()|XML::Compile::Schema/"Administration"> on all of
them -in any order- before you call L<compile()|XML::Compile::Schema/"Compilers">.
=head3 Organizing your definitions
One nice feature to help you organize (especially useful when you
package your code in a distribution), is to add these lines to the
beginning of your code:
package My::Package;
XML::Compile->addSchemaDirs(__FILE__);
XML::Compile->knownNamespace('http://myns' => 'myns.xsd', ...);
Now, if the package file is located at C<SomeThing/My/Package.pm>,
the definition of the namespace should be kept in
C<SomeThing/My/Package/xsd/myns.xsd>.
Somewhere in your program, you have to load these definitions:
# absolute or relative path is always possible
$schema->importDefinitions('SomeThing/My/Package/xsd/myns.xsd');
# relative search path extended by addSchemaDirs
$schema->importDefinitions('myns.xsd');
# knownNamespace improves abstraction
$schema->importDefinitions('http://myns');
Very probably, the namespace is already in some variable:
use XML::Compile::Schema;
use XML::Compile::Util 'pack_type';
my $myns = 'http://some-very-long-uri';
my $schema = XML::Compile::Schema->new($myns);
my $mytype = pack_type $myns, $myelement;
my $reader = $schema->compile(READER => $mytype);
=head2 Addressing components
Normally, external users can only address elements within a schema,
and types are hidden to be used by other schemas only. For this
reason, it is permitted to create an element and a type with the
same name.
The compiler requires a starting-point. This can either be an
element name or an element's id. The format of the element name
is C<{namespace-uri}localname>, for instance
{http://library}book
You may also start with
http://www.w3.org/2001/XMLSchema#float
as long as this ID refers to a top-level element, not a type.
When you use a schema without C<targetNamespace> (which is bad practice,
but sometimes people really do not understand the beneficial aspects of
the use of namespaces) then the elements can be addressed as C<{}name>
or simply C<name>.
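The name format above can be built and taken apart with plain string
operations. The sketch below uses two hypothetical helpers, C<full_name>
and C<split_name>, which are not part of this module. In real code, the
pack_type() and unpack_type() functions of XML::Compile::Util are meant
for this purpose.

```perl
use strict;
use warnings;

# Hypothetical helper: build "{namespace-uri}localname".
# Without a namespace, the plain local name is returned.
sub full_name {
    my ($ns, $local) = @_;
    defined $ns && length $ns ? "{$ns}$local" : $local;
}

# Hypothetical helper: split a name into (namespace, localname).
# Accepts "{ns}name", "{}name" and "name".
sub split_name {
    my $name = shift;
    $name =~ m/^\{([^}]*)\}(.*)$/s ? ($1, $2) : ('', $name);
}

my $book = full_name('http://library', 'book');
my ($ns, $local) = split_name($book);
print "$local lives in $ns\n";
```

Both C<{}name> and C<name> split into an empty namespace and the local
name, which matches the two spellings permitted for schemas without
C<targetNamespace>.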
=head2 Representing data-structures
The code will do its best to produce a correct translation. For
instance, an accidental C<1.9999> will be converted into C<2>
when the schema says that the field is an C<int>. It will also
strip superfluous blanks when the data-type permits. Especially
watch-out for the C<Integer> types, which produce Math::BigInt
objects unless L<compile(sloppy_integers)|XML::Compile::Schema/"Compilers"> is used.
Elements can be complex, and themselves contain elements which
are complex. In the Perl representation of the data, this will
be shown as nested hashes with the same structure as the XML.
You do not need to take care of character encodings, because XML::LibXML
does that for us: you shall not escape characters like "E<lt>" yourself.
The schemas define kinds of data types. There are various ways to define
them (with restrictions and extensions), but that knowledge is not
important for the resulting data structure.
=head3 simpleType
A single value. A lot of single value data-types are built-in (see
L<XML::Compile::Schema::BuiltInTypes|XML::Compile::Schema::BuiltInTypes>).
Simple types may have range limiting restrictions (facets), which will
be checked by default. Types may also have some white-space behavior,
for instance blanks are stripped from integers: before, after, but also
inside the number representing string.
Note that some of the reader hooks will alter the single value of these
elements into a HASH like used for the complexType/simpleContent (next
paragraph), to be able to return some extra collected information.
=head3 complexType/simpleContent
In this case, the single value container may have attributes. The number
of attributes can be endless, and the value is only one. This value
has no name, and therefore gets a predefined name C<_>.
When passed to the writer, you may specify a single value (not the whole
HASH) when no attributes are used.
=head3 complexType and complexType/complexContent
These containers not only have attributes, but also multiple values
as content. The C<complexContent> is used to create inheritance
structures in the data-type definition. This does not affect the
XML data package itself.
=head3 Manually produced XML NODE
For a WRITER, you may also specify a XML::LibXML::Node anywhere.
test1 => $doc->createTextNode('42');
test3 => $doc->createElement('ariba');
This data-structure is used without validation, so you are fully on
your own with this one. Typically, nodes are produced by hooks to
implement work-arounds.
=head3 Occurence
A second factor which determines the data-structure is the element
occurrence. Usually, elements have to appear once and exactly once
on a certain location in the XML data structure. This order is
automatically produced by this module. But elements may appear multiple
times.
=over 4
=item usual case
The default behavior for an element (in a sequence container) is to
appear exactly once. When missing, this is an error.
=item maxOccurs larger than 1
In this case, the element or particle block can appear multiple times.
Multiple values are kept in an ARRAY within the HASH. Non-schema based
XML modules do not return a single value as an ARRAY, which makes that
code more complicated. But in our case, we know the expected amount
beforehand.
When the maxOccurs larger than 1 is specified for an element, an ARRAY
of those elements is produced. When it is specified for a block (sequence,
choice, all, group), then an ARRAY of HASHes is returned. See the special
section about this subject.
An error is produced when the number of elements found is less than
C<minOccurs> (defaults to 1) or more than C<maxOccurs> (defaults to 1),
unless L<compile(check_occurs)|XML::Compile::Schema/"Compilers"> is C<false>.
Example elements with maxOccurs larger than 1. In the schema:
<element name="a" type="int" maxOccurs="unbounded" />
<element name="b" type="int" />
In the XML message:
<a>12</a><a>13</a><b>14</b>
In the Perl representation:
a => [12, 13], b => 14
=item value is C<NIL>
When an element is nillable, that is explicitly represented as a C<NIL>
constant string.
=item use="optional" or minOccurs="0"
The element may be skipped. When found it is a single value.
=item use="forbidden"
When the element is found, an error is produced.
=item default="value"
When the XML does not contain the element, the default value is
used... but only if this element's container exists. This has
no effect on the writer.
=item fixed="value"
Produce an error when the value is not present or different (after
the white-space rules were applied).
=back
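The rules above decide whether a key holds a single value or an ARRAY.
A minimal sketch in plain Perl, using the data of the maxOccurs example;
the optional key C<note> is made up for illustration:

```perl
use strict;
use warnings;

# Data as a READER would return it for the maxOccurs example:
# "a" may repeat (ARRAY), "b" appears exactly once (scalar).
my $data = { a => [12, 13], b => 14 };

# Repeating elements are kept in an ARRAY.
my $sum = 0;
$sum += $_ for @{ $data->{a} };

# An optional element (minOccurs="0") may be missing from the
# message, so test with exists before using it.
my $note = exists $data->{note} ? $data->{note} : '(none)';

print "sum=$sum b=$data->{b} note=$note\n";
```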
=head3 Default Values
[added in v0.91]
With L<compile(default_values)|XML::Compile::Schema/"Compilers"> you can control how much information about
default values defined by the schema will be passed into your program.
The choices, available for both READER and WRITER, are:
=over 4
=item C<IGNORE> (the WRITER's standard behavior)
Only include element and attribute values in the result if they are in
the XML message. Behaviorally, this treats elements with default values
as if they are just optional. The WRITER does not try to be smarter than
you.
=item C<EXTEND> (the READER's standard behavior)
If some element or attribute is not in the source but has a default in
the schema, that value will be produced. This is very convenient for the
READER, because your application does not have to hard-code the same
constant values as defaults as well.
=item C<MINIMAL>
Only produce the values which differ from the defaults. This choice is
useful when producing XML, to reduce the size of the output.
=back
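The effect of the three choices can be imitated on a plain HASH. The
sketch below is not how the module implements them: C<apply_defaults>
and the C<%defaults> table are hypothetical, and only illustrate which
keys end up in the result.

```perl
use strict;
use warnings;

# Hypothetical: defaults as they would be declared in a schema.
my %defaults = (lang => 'en', size => 10);

sub apply_defaults {
    my ($mode, $data) = @_;
    my %out = %$data;
    if ($mode eq 'EXTEND') {
        # add every default which is missing from the data
        exists $out{$_} or $out{$_} = $defaults{$_} for keys %defaults;
    }
    elsif ($mode eq 'MINIMAL') {
        # drop values which are equal to their default
        for my $key (keys %defaults) {
            delete $out{$key}
                if exists $out{$key} && $out{$key} eq $defaults{$key};
        }
    }
    # IGNORE: leave the data untouched
    \%out;
}

my $extended = apply_defaults(EXTEND  => { size => 12 });
my $minimal  = apply_defaults(MINIMAL => { lang => 'en', size => 12 });
my $ignored  = apply_defaults(IGNORE  => { size => 12 });
```

With C<EXTEND> the missing C<lang> is filled in, with C<MINIMAL> the
C<lang> which equals its default is dropped, and with C<IGNORE> nothing
changes.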
=head3 Repetative blocks
Particle blocks come in four shapes: C<sequence>, C<choice>, C<all>,
and C<group> (an indirect block). This also affects C<substitutionGroups>.
=head4 repetative sequence, choice, all
In situations like this:
<element name="example">
<complexType>
<sequence>
<element name="a" type="int" />
<sequence>
<element name="b" type="int" />
</sequence>
<element name="c" type="int" />
</sequence>
</complexType>
</element>
(yes, schemas are verbose) the data structure is
<example> <a>1</a> <b>2</b> <c>3</c> </example>
the Perl representation is I<flattened>, into
example => { a => 1, b => 2, c => 3 }
Ok, this is very simple. However, schemas can use repetition:
<element name="example">
<complexType>
<sequence>
<element name="a" type="int" />
<sequence minOccurs="0" maxOccurs="unbounded">
<element name="b" type="int" />
</sequence>
<element name="c" type="int" />
</sequence>
</complexType>
</element>
The XML message may be:
<example> <a>1</a> <b>2</b> <b>3</b> <b>4</b> <c>5</c> </example>
Now, the perl representation needs to produce an array of the data in
the repeated block. This array needs to have a name, because more of
these blocks may appear together in a construct. The B<name of the
block> is derived from the I<type of block> and the name of the I<first
element> in the block, regardless whether that element is present in
the data or not.
So, our example data is translated into (and vice versa)
example =>
{ a => 1
, seq_b => [ {b => 2}, {b => 3}, {b => 4} ]
, c => 5
}
The following label is used, based on the name of the first element (say C<xyz>)
as defined in the schema (not in the actual message):
seq_xyz sequence with maxOccurs > 1
cho_xyz choice with maxOccurs > 1
all_xyz all with maxOccurs > 1
When you have L<compile(key_rewrite)|XML::Compile::Schema/"Compilers"> option PREFIXED, and you have explicitly
assigned the prefix C<xs> to the schema namespace (See L<compile(prefixes)|XML::Compile::Schema/"Compilers">),
then those names will respectively be C<seq_xs_xyz>, C<cho_xs_xyz>,
C<all_xs_xyz>.
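A repeated block is therefore walked as an ARRAY of HASHes. A small
sketch in plain Perl, using the C<seq_b> data of the example above:

```perl
use strict;
use warnings;

# The reader's result for the repeated-sequence example.
my $example =
  { a     => 1
  , seq_b => [ {b => 2}, {b => 3}, {b => 4} ]
  , c     => 5
  };

# Each repetition of the block is one HASH; collect all "b" values.
# Guard against a missing key, in case the block did not appear.
my @b = map { $_->{b} } @{ $example->{seq_b} || [] };

print "b values: @b\n";
```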
=head4 repetative groups
[behavioral change in 0.93]
In contrast to the normal particle blocks described above, groups
have names. In this case, we do not need to take the name of
the first element, but can use the group name. It will still have C<gr_>
prepended, because groups can have the same name as an element or a type(!)
Blocks within the group definition cannot be repeated.
=head4 repetative substitutionGroups
For B<substitutionGroup>s which are repeating, the I<name of the base
element> is used (the element which has attribute C<< abstract="true" >>).
We do need this array, because the order of the elements within the group
may be important; we cannot group the elements based on the extended
element's name.
In an example substitutionGroup, the Perl representation will be
something like this:
base-element-name =>
[ { extension-name => $data1 }
, { other-extension => $data2 }
]
Each HASH has only one key.
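Because each HASH has exactly one key, the elements can be processed in
message order with a simple loop. A sketch in plain Perl; the values
C<first> and C<second> are placeholders for the real data:

```perl
use strict;
use warnings;

# Placeholder data; in practice these are values or HASHes
# produced by the reader.
my ($data1, $data2) = ('first', 'second');

my $group =
  [ { 'extension-name'  => $data1 }
  , { 'other-extension' => $data2 }
  ];

# Walk the elements in message order; each HASH has one pair.
my @seen;
for my $item (@$group) {
    my ($name, $value) = %$item;
    push @seen, "$name=$value";
}
print join(', ', @seen), "\n";
```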
=head3 List type
List simpleType objects are also represented as ARRAY, like elements
with a minOccurs or maxOccurs unequal to 1.
=head3 Using substitutionGroup constructs
A substitution group is a kind of choice between alternative (complex)
types. However, in this case roles have reversed: instead of a C<choice>
which lists the alternatives, here the alternative elements register
themselves as valid for an abstract (I<head>) element. All alternatives
should be extensions of the head element's type, but there is no way to
check that.
=head3 Wildcards any and anyAttribute
The C<any> and C<anyAttribute> elements are referred to as C<wildcards>:
they specify groups of elements and attributes which can be used,
instead of being explicit.
The author of this module advises B<against the use of wildcards> in
schemas, because the purpose of schemas is to be explicit about the
structure of the message, and that basic idea is simply thrown away by
these wildcards. Let people cleanly extend the schema with inheritance!
If you use a standard schema which facilitates these wildcards, then
please do not use them!
Because wildcards are not explicit about the types to expect, the
C<XML::Compile> module can not prepare for them automatically.
However, as user of the schema you probably know better about the possible
contents of these fields. Therefore, you can translate that
knowledge into code explicitly. Read about the processing of wildcards
in the manual page for each of the back-ends, because it is different
in each case.
=head3 ComplexType with "mixed" attribute
[largely improved in 0.86, reader only]
ComplexType and ComplexContent components can be declared with the
C<< mixed="true" >> attribute. This implies that text is not limited
to the content of containers, but may also be used in between elements.
Usually, you will only find ignorable white-space between elements.
In this example, the C<a> container is marked to be mixed:
<a> before <b>2</b> after </a>
Each back-end has its own way of handling mixed elements. The
L<compile(mixed_elements)|XML::Compile::Schema/"Compilers"> currently only modifies the reader's
behavior; the writer's capabilities are limited.
See L<XML::Compile::Translate::Reader|XML::Compile::Translate::Reader>.
=head3 hexBinary and base64Binary
These are used to include images and such in an XML message. Usually,
they are quite large with respect to the other elements. When you use
SOAP, you may wish to use L<XML::Compile::XOP|XML::Compile::XOP> instead.
The element values which you need to pass for fields of these
types is a binary BLOB, something Perl does not have. So, it is
a string containing binary data but not specially marked that way.
If you need to store an integer in such a binary field, you first have
to promote it into a BLOB (string) like this
{ color => pack('N', $i) } # writer
my $i = unpack('N', $d->{color}); # reader
Module Geo::KML implements a nice hook to avoid the explicit need
for this C<pack> and C<unpack>. The KML schema designers liked colors
to be written as C<ffc0c0c0> and abused C<hexBinary> for that purpose.
The C<colorType> fields in KML are treated as binary, but just represent
an int. Have a look in that Geo::KML code if your schema has some of
those tricks.
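The round-trip between an integer and such a BLOB can be tried in plain
Perl. The sketch uses the KML-like color value C<ffc0c0c0> from the text;
the C<H*> template is only used here to show the bytes as hex digits:

```perl
use strict;
use warnings;

my $color = 0xffc0c0c0;

# writer side: promote the integer to a 4-byte BLOB (big-endian)
my $blob = pack 'N', $color;

# the same bytes written as hexadecimal digits
my $hex  = unpack 'H*', $blob;

# reader side: turn the BLOB back into an integer
my $back = unpack 'N', $blob;

printf "%s, %d bytes, round-trip %s\n",
    $hex, length($blob), ($back == $color ? 'ok' : 'failed');
```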
=head2 Schema hooks
You can use hooks, for instance, to block processing parts of the message,
to create work-arounds for schema bugs, or to extract more information
during the process than done by default.
=head3 Defining hooks
Multiple hooks can be active during the compilation process of a type,
when C<compile()> is called. During Schema translation, each of the
hooks is checked for all types which are processed. When multiple
hooks select the object to get a modified behavior, then all are
evaluated in order of definition.
Defining a B<global> hook (where HOOKDATA is the LIST of PAIRS with
hook parameters, and HOOK a HASH with such HOOKDATA):
my $schema = XML::Compile::Schema->new
( ...
, hook => HOOK
, hooks => [ HOOK, HOOK ]
);
$schema->addHook(HOOKDATA | HOOK);
$schema->addHooks(HOOK, HOOK, ...);
my $wsdl = XML::Compile::WSDL11->new(...);
$wsdl->addHook(HOOKDATA | HOOK);
B<local> hooks are only used for one reader or writer. They are
evaluated before the global hooks.
my $reader = $schema->compile(READER => $type
, hook => HOOK, hooks => [ HOOK, HOOK, ...]);
=head3 General syntax
Each hook has three kinds of parameters:
=over 4
=item . selectors
=item . processors
=item . action ('READER' or 'WRITER', defaults to both)
=back
Selectors define the schema component of which the processing is modified.
When one of the selectors matches, the processing information for the hook
is used. When no selector is specified, then the hook will be used on all
elements.
Available selectors (see below for details on each of them):
=over 4
=item . type
=item . id
=item . path
=back
As argument, you can specify one element as STRING, a regular expression
to select multiple elements, or an ARRAY of STRINGs and REGEXes.
Next to where the hook is placed, we need to know what to do in
that case: the hook contains processing information. When more than
one hook matches, then all of these processors are called in order
of hook definition. However, first the compile hooks are taken,
and then the global hooks.
How the processing works exactly depends on the compiler back-end. There
are major differences. Each of those manual-pages lists the specifics.
The label tells us when the processing is initiated. Available labels are
C<before>, C<replace>, and C<after>.
=head3 Hooks on matching types
The C<type> selector specifies a complexType or simpleType by name.
Best is to base the selection on the full name, like C<{ns}type>,
which will avoid all kinds of name-space conflicts in the future.
However, you may also specify only the C<type> (in any name-space).
Any REGEX will be matched to the full type name. Be careful with the
pattern anchors.
If you use L<XML::Compile::Cache|XML::Compile::Cache> [release 0.90], then you can use
C<prefix:type> as type specification as well. You have to explicitly
define prefix to namespace beforehand.
=head3 Hooks on matching ids
Matching based on IDs can reach more schema elements: some types are
anonymous but still have an ID. Best is to base selection on the full
ID name, like C<ns#id>, to avoid all kinds of name-space conflicts in
the future.
=head3 Hooks on matching paths
When you see error messages, you always see some representation of
the path where the problem was discovered. You can use this path
as selector, when you know what it is... BE WARNED, that the current
structure of the path is not really consistent, hence will be
improved in one of the future releases, breaking backwards compatibility.
=head2 Typemaps
Often, XML will be used in object oriented programs, where the facts
which are transported in the XML message are attributes of Perl objects.
Of course, you can always collect the data from each of the Objects into
the required (huge) HASH manually, before triggering the reader or writer.
As alternative, you can connect types in the XML schema with Perl objects
and classes, which results in cleaner code.
You can also specify typemaps with L<new(typemap)|XML::Compile::Schema/"Constructors">, L<addTypemaps()|XML::Compile::Schema/"Accessors">, and
L<compile(typemap)|XML::Compile::Schema/"Compilers">. Each type will only refer to the last map for that
type. When an C<undef> is given for a type, then the older definition
will be cancelled. Examples of the three ways to specify typemaps:
my %map = ($x1 => $p1, $x2 => $p2);
my $schema = XML::Compile::Schema->new(...., typemap => \%map);
$schema->addTypemaps($x3 => $p3, $x4 => $p4, $x1 => undef);
my $call = $schema->compile(READER => $type, typemap => \%map);
The latter only has effect for the type being compiled. The definitions
are cumulative. In the second example, the C<$x1> gets disabled.
Objects can come in two shapes: either they do support the connection
with XML::Compile (implementing two methods with predefined names), or
they don't, in which case you will need to write a little wrapper.
use XML::Compile::Util qw/pack_type/;
my $t1 = pack_type $myns, $mylocal;
$schema->typemap($t1 => 'My::Perl::Class');
$schema->typemap($t1 => $some_object);
$schema->typemap($t1 => sub { ... });
The implementation of the READER and WRITER differs. In the READER case,
the typemap is implemented as an 'after' hook which calls a C<fromXML>
method. The WRITER is a 'before' hook which calls a C<toXML> method.
See respectively the L<XML::Compile::Translate::Reader|XML::Compile::Translate::Reader> and
L<XML::Compile::Translate::Writer|XML::Compile::Translate::Writer>.
=head3 Private variables in objects
When you design a new object, it is possible to store the information
exactly like the corresponding XML type definition. The only thing
the C<fromXML> has to do, is bless the data-structure into its class:
$schema->typemap($xmltype => 'My::Perl::Class');
package My::Perl::Class;
sub fromXML { bless $_[1], $_[0] } # for READER
sub toXML { $_[0] } # for WRITER
However... the object may also need some private variables.
If you store them in the same HASH for your object, you will get
"unused tags" warnings from the writer. To avoid that, choose one
of the following alternatives:
# never complain about unused tags
::Schema->new(..., ignore_unused_tags => 1);
# only complain about unused tags not matching regexp
my $not_for_xml = qr/^[A-Z]/; # my XML only has lower-case
::Schema->new(..., ignore_unused_tags => $not_for_xml);
# only for one compiled WRITER (not used with READER)
::Schema->compile(..., ignore_unused_tags => 1);
::Schema->compile(..., ignore_unused_tags => $not_for_xml);
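Another possibility, sketched below, is to keep the private variables
out of the data which is handed to the writer. This assumes that the
writer continues with whatever C<toXML> returns. The key C<_cache> and
the naming convention for private keys are made up for illustration;
see L<XML::Compile::Translate::Writer|XML::Compile::Translate::Writer>
for the exact calling convention.

```perl
use strict;
use warnings;

package My::Perl::Class;

# READER: bless the HASH which was decoded from the XML.
sub fromXML {
    my ($class, $data) = @_;
    bless $data, $class;
}

# WRITER: hand over a copy without the private keys.  Here, private
# keys start with an underscore followed by a letter (an assumed
# convention), which leaves the special key "_" untouched.
sub toXML {
    my $self = shift;
    my %public = map { ($_ => $self->{$_}) }
                 grep { !/^_[a-z]/i } keys %$self;
    \%public;
}

package main;

my $obj = My::Perl::Class->fromXML({ name => 'x', size => 3 });
$obj->{_cache} = 'not for XML';

my $out = $obj->toXML;
print join(',', sort keys %$out), "\n";
```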
=head3 Typemap limitations
There are some things you need to know:
=over 4
=item .
Many schemas define very complex types. These may often not translate
cleanly into objects. You may need to create a typemap relation for
some parent type. The CODE reference may be very useful in this case.
=item .
The same kind of problem appears when you have a list in your object,
which often is not named in the schema.
=back
=head2 Handling xsi:type
[1.10] The C<xsi:type> is an old-fashioned mechanism, and should be avoided!
In this case, the schema does tell you that a certain element has
a certain type, but at run-time(!) that is changed. When an XML
element has a C<xsi:type> attribute, it tells you simply to have an
extension of the original type. This whole mechanism does bite the
"compilation" idea of L<XML::Compile|XML::Compile>... however with some help, it
will work.
To make C<xsi:type> work at run-time, you have to pass a table of
which types you expect at compile-time. Example:
my %xsi_type_table =
( $base_type1 => [ $ext1_of_type1, $ext2_of_type1 ]
, $base_type2 => [ $ext1_of_type2 ]
);
my $r = $schema->compile(READER => $type
, xsi_type => \%xsi_type_table
);
When your schema is an L<XML::Compile::Cache|XML::Compile::Cache> (version at least 0.93),
your types look like C<prefix:local>. With a plain L<XML::Compile::Schema|XML::Compile::Schema>,
they will look like C<{namespace}local>, typically produced with
L<XML::Compile::Util::pack_type()|XML::Compile::Util/"Packing">.
When used in a reader, the resulting data-set will contain a C<XSI_TYPE>
key in between the facts which were taken from the element. The type is
in long syntax C<"{$ns}$type">. See L<XML::Compile::Util::unpack_type()|XML::Compile::Util/"Packing">.
With the writer, you have to provide such an C<XSI_TYPE> value or the
element's base type will be used (and no C<xsi:type> attribute created).
This will probably cause warnings about unused tags. The type can be
provided in full (see L<XML::Compile::Util::pack_type()|XML::Compile::Util/"Packing">) or [1.31]
prefixed.
[1.25] When the value is not an ARRAY, but only the keyword C<AUTO>,
the parser will try to auto-detect all types which are valid alternatives.
This currently only works for non-builtin types. The auto-detection might
be slow and (because many schemas are broken) not produce a complete list.
When debugging is enabled ("use Log::Report mode => 3;") you will see to
which list this AUTO gets expanded.
xsi_type => { $base_type => 'AUTO' } # requires X::C v1.25
L<XML::Compile::Cache|XML::Compile::Cache> (since v1.01) makes using C<xsi:type> easier. When
you have a ::Cache based object (for instance a L<XML::Compile::WSDL11|XML::Compile::WSDL11>)
you can simply say
$wsdl->addXsiType( $base_type => 'AUTO' )
Now, you do not need to pass the xsi table to each compilation call.
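On the reading side, the C<XSI_TYPE> key can be used to decide how to
treat the data. A minimal sketch in plain Perl; the namespace, the type
names and the handlers are hypothetical:

```perl
use strict;
use warnings;

my $ns = 'http://example.com/shapes';       # hypothetical namespace

# Handlers per extension type, keyed on the long "{ns}local" syntax.
my %handler =
  ( "{$ns}circle" => sub { "circle with radius $_[0]{radius}" }
  , "{$ns}square" => sub { "square with side $_[0]{side}" }
  );

# Data as a reader compiled with xsi_type might return it.
my $data = { XSI_TYPE => "{$ns}circle", radius => 2 };

my $code = $handler{ $data->{XSI_TYPE} }
    or die "unexpected type $data->{XSI_TYPE}\n";
my $text = $code->($data);
print "$text\n";
```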
=head2 Key rewrite
[improved with release 1.10]
The standard practice is to use the localName of the XML elements as
key in the Perl HASH; the key rewrite mechanism is used to change that,
sometimes to separate elements which have the same localName within
different name-spaces, or when an element and an attribute share a name
(key rewrite is applied to elements AND attributes); in other cases just
for fun or convenience.
Rewrite rules are interpreted at "compile-time", which means that they
B<do not slow-down> the XML construction or deconstruction. The rules
work the same for readers and writers, because they are applied to
names found in the schema.
Key rewrite rules can be set during schema object initiation
with L<new(key_rewrite)|XML::Compile::Schema/"Constructors"> and to an existing schema object with
L<addKeyRewrite()|XML::Compile::Schema/"Accessors">. These rules will be used in all calls to
L<compile()|XML::Compile::Schema/"Compilers">.
Next, you can use L<compile(key_rewrite)|XML::Compile::Schema/"Compilers"> to add rules which
are only used for a single compilation. These are applied before
the global rules. All rules will always be attempted, and the
rule will be applied to the result of the previous change.
The last defined rewrite rules will be applied first, with one major
exception: the C<PREFIXED> rules will be executed before any other
rule.
=head3 key_rewrite via table
When a HASH is provided as rule, then the XML element name is looked-up.
If found, the value is used as translated key.
First full name of the element is tried, and then the localName of
the element. The full name can be created with
L<XML::Compile::Util::pack_type()|XML::Compile::Util/"Packing"> or by hand:
use XML::Compile::Util qw/pack_type/;
my %table =
( pack_type($myns, 'el1') => 'nice_name1'
, "{$myns}el2" => 'alsoNice'
, el3 => 'in any namespace'
);
$schema->addKeyRewrite( \%table );
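The look-up order can be pictured with a few lines of plain Perl. The
helper C<rewrite_key> is hypothetical and only imitates the order
described above (full name first, then the localName); the module does
this for you.

```perl
use strict;
use warnings;

my $myns  = 'http://example.com/ns';        # hypothetical namespace
my %table =
  ( "{$myns}el1" => 'nice_name1'
  , el3          => 'in any namespace'
  );

# Try the full name first, then the local name, else keep the key.
sub rewrite_key {
    my ($ns, $local) = @_;
    my $full = "{$ns}$local";
    exists $table{$full}  ? $table{$full}
  : exists $table{$local} ? $table{$local}
  :                         $local;
}

print rewrite_key($myns, 'el1'), "\n";
print rewrite_key('urn:other', 'el3'), "\n";
print rewrite_key($myns, 'el9'), "\n";
```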
=head3 Rewrite via function
When a CODE reference is provided, it will get called for each key
which is found in the schema. Passed are the name-space of the
element and its local-name. Returned is the key, which may be the
local-name or something else.
For instance, some people use capitals in element names and personally
I do not like them:
sub dont_like_capitals($$)
{ my ($ns, $local) = @_;
lc $local;
}
$schema->addKeyRewrite( \&dont_like_capitals );
for short:
my $schema = XML::Compile::Schema->new( ...,
key_rewrite => sub { lc $_[1] } );
=head3 key_rewrite when localNames collide
Let's start with an apology: we cannot auto-detect when these rewrite
rules are needed, because the colliding keys are within the same HASH,
but the processing is fragmented over various (sequence) blocks: the
parser does not have the overview on which keys of the HASH are used
for which elements.
The problem occurs when one complex type or substitutionGroup contains
multiple elements with the same localName, but from different name-spaces.
In the perl representation of the data, the name-spaces get ignored
(to make the programmer's life simple) but that may cause these nasty
conflicts.
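The collision, and one way to avoid it with a rewrite table keyed on full names, can be shown without the library. The name-spaces, element names and values below are invented for the illustration. The loop only imitates what a table rule is described to do.

```perl
use strict;
use warnings;

# Two hypothetical name-spaces which both define an element "id".
my ($ns_a, $ns_b) = ('urn:example:a', 'urn:example:b');
my @elements = ( [ $ns_a, 'id', 1 ], [ $ns_b, 'id', 2 ] );

# Without rewriting, both elements use key "id": the second overwrites
# the first.
my %plain;
$plain{ $_->[1] } = $_->[2] for @elements;

# A table keyed on full names gives each element its own key.
my %table = ( "{$ns_a}id" => 'a_id', "{$ns_b}id" => 'b_id' );
my %rewritten;
for my $el (@elements) {
    my ($ns, $local, $value) = @$el;
    my $full = '{' . $ns . '}' . $local;
    my $key  = exists $table{$full} ? $table{$full} : $local;
    $rewritten{$key} = $value;
}

print join(',', sort keys %plain), "\n";       # id
print join(',', sort keys %rewritten), "\n";   # a_id,b_id
```

The same table could then be passed as C<key_rewrite> to make the two elements distinguishable in the Perl data.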
=head3 Rewrite for convenience
In XML, we often see names like C<my-elem-name>, which in Perl
would be accessed as
$h->{'my-elem-name'}
In this case, you cannot leave out the quotes in your Perl code, which is
quite inconvenient, because only 'barewords' can be used as keys unquoted.
When you use option C<key_rewrite> for L<compile()|XML::Compile::Schema/"Compilers"> or L<new()|XML::Compile::Schema/"Constructors">, you
could decide to map dashes onto underscores.
key_rewrite
=> sub { my ($ns, $local) = @_; $local =~ s/\-/_/g; $local }
# or shorter, changing the passed value in place:
key_rewrite => sub { $_[1] =~ s/\-/_/g; $_[1] }
then C<< my-elem-name >> in XML will get mapped onto C<< my_elem_name >>
in Perl, both in the READER and the WRITER. Be warned that the substitute
command returns the number of substitutions made, not the modified value!
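The pitfall is easy to reproduce in plain Perl. The three anonymous subs below are only illustrations with made-up names. The last one relies on the C</r> modifier, which needs Perl 5.14 or newer.

```perl
use strict;
use warnings;

# Wrong: the last statement is the substitution itself, so the sub
# returns the number of replacements.
my $wrong = sub { my ($ns, $local) = @_; $local =~ s/-/_/g };

# Right: return the modified copy explicitly.
my $right = sub { my ($ns, $local) = @_; $local =~ s/-/_/g; $local };

# Also right on Perl 5.14+: /r returns the changed copy and leaves
# the argument untouched.
my $short = sub { $_[1] =~ s/-/_/gr };

print $wrong->('', 'my-elem-name'), "\n";   # 2
print $right->('', 'my-elem-name'), "\n";   # my_elem_name
print $short->('', 'my-elem-name'), "\n";   # my_elem_name
```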
=head3 Pre-defined key_rewrite rules
=over 4
=item UNDERSCORES
Replace dashes (-) with underscores (_).
=item SIMPLIFIED
Rewrite rule with the constant name (STRING) C<SIMPLIFIED> will replace
all dashes with underscores, translate capitals into lowercase, and
remove all other characters which are not valid in a bareword (if possible, I am
too lazy to check)
=item PREFIXED
This requires a table for prefix to name-space translations, via
L<compile(prefixes)|XML::Compile::Schema/"Compilers">, which defines at least one non-empty (default)
prefix. The keys which represent elements in any name-space which has
a prefix defined will have that prefix and an underscore prepended.
Be warned that the name-spaces which you provide are used, not the
ones used in the schema. Example:
my $r = $schema->compile
( READER => $type
, prefixes => [ mine => $myns ]
, key_rewrite => 'PREFIXED'
);
my $xml = $r->( <<__XML );
<data xmlns="$myns"><x>42</x></data>
__XML
print join ' => ', %$xml; # mine_x => 42
=item PREFIXED(...)
Like the previous, but now only use a selected sub-set of the available
prefixes. This is particularly useful in writers, when explicit prefixes
are also used to beautify the output.
The prefixes are not checked against the prefix list, and may have
surrounding blanks.
key_rewrite => 'PREFIXED(opt,sar)'
Above is equivalent to:
key_rewrite => [ 'PREFIXED(opt)', 'PREFIXED(sar)' ]
Special care is taken that the prefix will not be added twice. For instance,
if the same prefix appears twice, or a C<PREFIXED> rule is provided as well,
then still only one prefix is added.
=back
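To get a feeling for what the pre-defined rules do to a key, here is a rough plain-Perl approximation. The three functions are hypothetical and are not the library's implementation. In particular, treating "non-bareword" as C<\W> is an assumption, and the C<PREFIXED(...)> selection is not modelled.

```perl
use strict;
use warnings;

# Approximation of UNDERSCORES: dashes become underscores.
sub underscores {
    my $key = shift;
    $key =~ tr/-/_/;
    return $key;
}

# Approximation of SIMPLIFIED: underscores, lowercase, then drop
# everything which is not a word character (assumption: \W).
sub simplified {
    my $key = lc underscores(shift);
    $key =~ s/\W+//g;
    return $key;
}

# Approximation of PREFIXED: prepend "<prefix>_" when the name-space
# has a non-empty prefix in the given name-space to prefix map.
sub prefixed {
    my ($ns, $local, %prefix_for) = @_;
    my $prefix = $prefix_for{$ns};
    return defined $prefix && length $prefix ? "${prefix}_$local" : $local;
}

my $myns = 'http://example.com/ns';   # hypothetical name-space

print underscores('my-elem-name'), "\n";             # my_elem_name
print simplified('My-Elem.Name'), "\n";              # my_elemname
print prefixed($myns, 'x', $myns => 'mine'), "\n";   # mine_x
```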
=head1 SEE ALSO
This module is part of XML-Compile distribution version 1.47,
built on October 11, 2014. Website: F<http://perl.overmeer.net/xml-compile/>
Other distributions in this suite:
L<XML::Compile>,
L<XML::Compile::SOAP>,
L<XML::Compile::WSDL11>,
L<XML::Compile::SOAP12>,
L<XML::Compile::SOAP::Daemon>,
L<XML::Compile::SOAP::WSA>,
L<XML::Compile::C14N>,
L<XML::Compile::WSS>,
L<XML::Compile::WSS::Signature>,
L<XML::Compile::Tester>,
L<XML::Compile::Cache>,
L<XML::Compile::Dumper>,
L<XML::Compile::RPC>,
L<XML::Rewrite>
and
L<XML::LibXML::Simple>.
Please post questions or ideas to the mailing list at
F<http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/xml-compile> .
For live contact with other developers, visit the C<#xml-compile> channel
on C<irc.perl.org>.
=head1 LICENSE
Copyrights 2006-2014 by [Mark Overmeer]. For other contributors see ChangeLog.
This program is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.
See F<http://www.perl.com/perl/misc/Artistic.html>