10.31 Parsing and Generating XML—library(xml)

This is a package for parsing XML with Prolog, which provides Prolog applications with a simple “Document Value Model” interface to XML documents. A description of the subset of XML that it supports can be found at: http://homepages.tesco.net/binding-time/xml.pl.html

The package, originally written by Binding Time Ltd., is in the public domain and unsupported. To use the package, enter the query:

     | ?- use_module(library(xml)).

The package represents XML documents by the abstract data type document, which is defined by the following grammar:

document ::= xml(attributes,content) { well-formed document }
| malformed(attributes,content) { malformed document }

attributes ::= []
| [name=char-data|attributes]

content ::= []
| [cterm|content]

cterm ::= pcdata(char-data) { text }
| comment(char-data) { an XML comment }
| namespace(URI,prefix,element) { a Namespace }
| element(tag,attributes,content) { <tag>..</tag> encloses content or <tag /> if empty }
| instructions(name,char-data) { A PI <? name char-data ?> }
| cdata(char-data) { <![CDATA[char-data]]> }
| doctype(tag,doctype-id) { DTD <!DOCTYPE .. > }
| unparsed(char-data) { text that hasn't been parsed }
| out_of_context(tag) { tag is not closed }

tag ::= atom { naming an element }

name ::= atom { not naming an element }

URI ::= atom { giving the URI of a namespace }

char-data ::= code-list

doctype-id ::= public(char-data,char-data)
| public(char-data,dtd-literals)
| system(char-data)
| system(char-data,dtd-literals)
| local
| local,dtd-literals

dtd-literals ::= []
| [dtd_literal(char-data)|dtd-literals]
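To make the grammar concrete, here is a minimal sketch of a hand-written term of type document. The predicate name example_document/1 and the element and attribute names are hypothetical, chosen only for illustration. The sketch assumes that double-quoted text denotes code-lists, which is the SICStus default.

```prolog
% example_document(-Document): a hand-written document term (hypothetical).
% Assumes double-quoted text is read as a code-list (char-data).
example_document(
    xml([version="1.0"],                      % attributes of the XML declaration
        [element(library, [],                 % <library> ... </library>
            [comment(" a tiny catalogue "),   % an XML comment
             element(book, [id="b1"],         % <book id="b1"> ... </book>
                 [element(title, [], [pcdata("Prolog")])]),
             element(book, [id="b2"], [])     % an empty element
            ])
        ])).
```

This term corresponds roughly to an XML document with a library root element holding a comment and two book elements, the second of which is empty.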

The following predicates are exported by the package:

xml_parse(?Chars, ?Document)
xml_parse(?Chars, ?Document, +Options)
Either parses Chars, a code-list, to Document, a document, or generates Chars, a code-list, from Document, a document. When parsing, Chars is not required to represent strictly well-formed XML. When generating, an exception is raised if Document is not a valid document term representing well-formed XML; in this usage the only option available is format/1.

Options is a list of zero or more of the following, where Boolean must be true or false:

format(Boolean)
Indent the element content (default true).
extended_characters(Boolean)
Use the extended character entities for XHTML (default true).
remove_attribute_prefixes(Boolean)
Remove namespace prefixes from attributes when they are the same as the prefix of the parent element (default false).
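The two directions of xml_parse might be used as in the following sketch. The predicate names parse_example/1 and generate_example/1 and the XML content are hypothetical, and double-quoted text is assumed to denote code-lists.

```prolog
:- use_module(library(xml)).

% parse_example(-Document): parse a small code-list into a document term.
% (Hypothetical helper; the input text is illustrative only.)
parse_example(Document) :-
    Chars = "<note priority=\"high\"><to>Ann</to></note>",
    xml_parse(Chars, Document).

% generate_example(-Chars): generate XML text, as a code-list, from a
% document term. format(false) asks for output without indentation.
generate_example(Chars) :-
    Document = xml([version="1.0"],
                   [element(note, [priority="high"],
                       [element(to, [], [pcdata("Ann")])])]),
    xml_parse(Chars, Document, [format(false)]).
```

Since the result of generate_example/1 is a code-list, a goal such as atom_codes(Atom, Chars) can be used to view it as text.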

xml_subterm(+Term, ?Subterm)
Unifies Subterm with a sub-term of Term, a document. This can be especially useful when trying to test or retrieve a deeply-nested subterm from a document.
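As a sketch of how xml_subterm can retrieve nested content, the hypothetical helper title_text/2 below looks for a title element anywhere in a document and returns its text. The element name and the assumption that the element holds a single pcdata item are illustrative only.

```prolog
:- use_module(library(xml)).

% title_text(+Document, -Text): Text is the code-list content of a
% <title> element occurring somewhere in Document (hypothetical helper).
% Only matches title elements whose content is exactly one pcdata item.
title_text(Document, Text) :-
    xml_subterm(Document, element(title, _, [pcdata(Text)])).

% title_demo(-Atom): apply title_text/2 to a small hand-written document
% and convert the resulting code-list to an atom for readability.
title_demo(Atom) :-
    Document = xml([], [element(book, [],
                          [element(title, [], [pcdata("Prolog")])])]),
    title_text(Document, Text),
    atom_codes(Atom, Text).
```

Under these assumptions, the query title_demo(A) should bind A to the text of the nested title element without the caller having to spell out the path to it.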
xml_pp(+Document)
“Pretty prints” Document, a document, on the current output stream.
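A common use of xml_pp is to inspect the term produced by the parser. The helper show_parsed/1 below is a hypothetical sketch that combines the two predicates; double-quoted text is again assumed to denote code-lists.

```prolog
:- use_module(library(xml)).

% show_parsed(+Chars): parse Chars, a code-list, and pretty-print the
% resulting document term on the current output stream.
% (Hypothetical helper, not part of the package.)
show_parsed(Chars) :-
    xml_parse(Chars, Document),
    xml_pp(Document).
```

For example, the query show_parsed("<a><b>text</b></a>") would print the parsed document term, which can be easier to read than the answer bindings shown by the top level.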
