Warning: This module is considered outdated and not up to Phobos' current standards. It will remain until a suitable replacement is available, but be aware that it will not remain in the long term.
Classes and functions for creating and parsing XML
The basic architecture of this module is that there are standalone functions,
classes for constructing an XML document from scratch (Tag, Element and
Document), and also classes for parsing a pre-existing XML file (ElementParser
and DocumentParser). The parsing classes may be used to build a
Document, but that is not their primary purpose. The handling capabilities of
DocumentParser and ElementParser are sufficiently customizable that you can
make them do pretty much whatever you want.
import std.xml;
import std.stdio;
import std.string;
import std.file;

// books.xml is used in various samples throughout the Microsoft XML Core
// Services (MSXML) SDK.
//
// See http://msdn2.microsoft.com/en-us/library/ms762271(VS.85).aspx

void main()
{
    string s = cast(string) std.file.read("books.xml");

    // Check for well-formedness
    check(s);

    // Make a DOM tree
    auto doc = new Document(s);

    // Plain-print it
    writeln(doc);
}
import std.xml;
import std.stdio;
import std.string;
import std.file; // needed for std.file.read below

struct Book
{
    string id;
    string author;
    string title;
    string genre;
    string price;
    string pubDate;
    string description;
}

void main()
{
    string s = cast(string) std.file.read("books.xml");

    // Check for well-formedness
    check(s);

    // Take it apart
    Book[] books;

    auto xml = new DocumentParser(s);
    xml.onStartTag["book"] = (ElementParser xml)
    {
        Book book;
        book.id = xml.tag.attr["id"];

        xml.onEndTag["author"]       = (in Element e) { book.author      = e.text(); };
        xml.onEndTag["title"]        = (in Element e) { book.title       = e.text(); };
        xml.onEndTag["genre"]        = (in Element e) { book.genre       = e.text(); };
        xml.onEndTag["price"]        = (in Element e) { book.price       = e.text(); };
        xml.onEndTag["publish-date"] = (in Element e) { book.pubDate     = e.text(); };
        xml.onEndTag["description"]  = (in Element e) { book.description = e.text(); };

        // Parse the interior of this <book> element
        xml.parse();

        books ~= book;
    };
    xml.parse();

    // Put it back together again
    auto doc = new Document(new Tag("catalog"));
    foreach (book; books)
    {
        auto element = new Element("book");
        element.tag.attr["id"] = book.id;

        element ~= new Element("author",       book.author);
        element ~= new Element("title",        book.title);
        element ~= new Element("genre",        book.genre);
        element ~= new Element("price",        book.price);
        element ~= new Element("publish-date", book.pubDate);
        element ~= new Element("description",  book.description);

        doc ~= element;
    }

    // Pretty-print it (writeln, so that '%' in the data is not treated as a format specifier)
    writeln(join(doc.pretty(3), "\n"));
}
Returns true if the character is a character according to the XML standard
dchar c | the character to be tested |
Returns true if the character is whitespace according to the XML standard
Only the following characters are considered whitespace in XML: space, tab, carriage return and linefeed.
dchar c | the character to be tested |
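The whitespace rule above is small enough to sketch directly. The function below is a hypothetical stand-in written for illustration (the name isXmlSpace is an assumption, not part of this module); it only restates the four characters listed above and does not depend on std.xml.

```d
import std.stdio;

/// Hypothetical illustration of the XML whitespace rule:
/// only space, tab, carriage return and linefeed qualify.
bool isXmlSpace(dchar c) pure nothrow @nogc @safe
{
    return c == ' ' || c == '\t' || c == '\r' || c == '\n';
}

void main()
{
    writeln(isXmlSpace(' '));      // true
    writeln(isXmlSpace('\n'));     // true
    writeln(isXmlSpace('\u00A0')); // false: no-break space is not XML whitespace
}
```

Note that characters which other Unicode-aware tests treat as whitespace, such as the no-break space, are rejected under this rule.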
Returns true if the character is a digit according to the XML standard
dchar c | the character to be tested |
Returns true if the character is a letter according to the XML standard
dchar c | the character to be tested |
Returns true if the character is an ideographic character according to the XML standard
dchar c | the character to be tested |
Returns true if the character is a base character according to the XML standard
dchar c | the character to be tested |
Returns true if the character is a combining character according to the XML standard
dchar c | the character to be tested |
Returns true if the character is an extender according to the XML standard
dchar c | the character to be tested |
Encodes a string by replacing all characters which need to be escaped with appropriate predefined XML entities.
encode() escapes certain characters (ampersand, quote, apostrophe, less-than
and greater-than), and similarly, decode() unescapes them. These functions
are provided for convenience only. You do not need to use them when using
the std.xml classes, because then all the encoding and decoding will be done
for you automatically.
If the string is not modified, the original will be returned.
S s | The string to be encoded |
writefln(encode("a > b")); // writes "a &gt; b"
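A slightly fuller sketch of the behaviour described above, assuming a toolchain on which std.xml is still available. It shows several escaped characters at once and the case in which the input is handed back unchanged; the variable names are illustrative only.

```d
import std.stdio;
import std.xml;

void main()
{
    // Ampersand, less-than and greater-than are all replaced
    string risky = "a > b && c < d";
    writeln(encode(risky)); // a &gt; b &amp;&amp; c &lt; d

    // Nothing to escape: the original string itself is returned,
    // so an identity comparison succeeds
    string plain = "no markup here";
    assert(encode(plain) is plain);
}
```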
Mode to use for decoding.
Decodes a string by unescaping all predefined XML entities.
encode() escapes certain characters (ampersand, quote, apostrophe, less-than
and greater-than), and similarly, decode() unescapes them. These functions
are provided for convenience only. You do not need to use them when using
the std.xml classes, because then all the encoding and decoding will be done
for you automatically.
This function decodes the entities &amp;, &quot;, &apos;,
&lt; and &gt;,
as well as decimal and hexadecimal entities such as &#8364;
If the string does not contain an ampersand, the original will be returned.
Note that the "mode" parameter can be one of DecodeMode.NONE (do not
decode), DecodeMode.LOOSE (decode, but ignore errors), or DecodeMode.STRICT
(decode, and throw a DecodeException in the event of an error).
string s | The string to be decoded |
DecodeMode mode | (optional) Mode to use for decoding. (Defaults to LOOSE). |
writefln(decode("a &gt; b")); // writes "a > b"
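The difference between the decoding modes can be illustrated with an input containing a bare ampersand. This is a minimal sketch, assuming a toolchain on which std.xml is still available; decodesStrictly is a hypothetical helper introduced only for this example.

```d
import std.stdio;
import std.xml;

/// Hypothetical helper: true if s decodes without error in STRICT mode.
bool decodesStrictly(string s)
{
    try
    {
        decode(s, DecodeMode.STRICT);
        return true;
    }
    catch (DecodeException e)
    {
        return false;
    }
}

void main()
{
    writeln(decode("a &gt; b"));              // a > b
    writeln(decodesStrictly("a &gt; b"));     // true
    writeln(decodesStrictly("fish & chips")); // false: bare ampersand

    // The default LOOSE mode ignores the error and keeps the ampersand
    writeln(decode("fish & chips"));          // fish & chips
}
```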
Class representing an XML document.
Contains all text which occurs before the root element. Defaults to <?xml version="1.0"?>
Contains all text which occurs after the root element. Defaults to the empty string
Constructs a Document by parsing XML text.
This function creates a complete DOM (Document Object Model) tree.
The input to this function MUST be valid XML.
This is enforced by DocumentParser's in contract.
string s | the complete XML text. |
Constructs a Document from a Tag.
const(Tag) tag | the start tag of the document. |
Compares two Documents for equality
Document d1,d2;
if (d1 == d2) { }
Compares two Documents
You should rarely need to call this function. It exists so that Documents can be used as associative array keys.
Document d1,d2;
if (d1 < d2) { }
Returns the hash of a Document
You should rarely need to call this function. It exists so that Documents can be used as associative array keys.
Returns the string representation of a Document. (That is, the complete XML of a document).
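As a rough illustration of building a Document from a Tag rather than from XML text, the sketch below assembles a one-book catalog and prints it. It assumes a toolchain on which std.xml is still available; buildCatalog and the sample values are hypothetical.

```d
import std.stdio;
import std.xml;

/// Hypothetical helper: builds a tiny catalog document from scratch.
Document buildCatalog()
{
    // The root element is created from its start tag
    auto doc = new Document(new Tag("catalog"));

    auto book = new Element("book");
    book.tag.attr["id"] = "bk101";
    book ~= new Element("title", "Serenity");

    doc ~= book;
    return doc;
}

void main()
{
    auto doc = buildCatalog();

    // The prolog is the text before the root element
    writeln(doc.prolog);

    // writeln uses toString(), which yields the complete XML
    writeln(doc);
}
```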
Class representing an XML element.
The start tag of the element
The element's items
The element's text items
The element's CData items
The element's comments
The element's processing instructions
The element's child elements
Constructs an Element given a name and a string to be used as a Text interior.
string name | the name of the element. |
string interior | (optional) the string interior. |
auto element = new Element("title","Serenity"); // constructs the element <title>Serenity</title>
Constructs an Element from a Tag.
const(Tag) tag_ | the start or empty tag of the element. |
Append a text item to the interior of this element
Text item | the item you wish to append. |
Element element; element ~= new Text("hello");
Append a CData item to the interior of this element
CData item | the item you wish to append. |
Element element; element ~= new CData("hello");
Append a comment to the interior of this element
Comment item | the item you wish to append. |
Element element; element ~= new Comment("hello");
Append a processing instruction to the interior of this element
ProcessingInstruction item | the item you wish to append. |
Element element; element ~= new ProcessingInstruction("hello");
Append a complete element to the interior of this element
Element item | the item you wish to append. |
Element element; Element other = new Element("br"); element ~= other; // appends element representing <br />
Compares two Elements for equality
Element e1,e2;
if (e1 == e2) { }
Compares two Elements
You should rarely need to call this function. It exists so that Elements can be used as associative array keys.
Element e1,e2;
if (e1 < e2) { }
Returns the hash of an Element
You should rarely need to call this function. It exists so that Elements can be used as associative array keys.
Returns the decoded interior of an element.
The element is assumed to contain text only. So, for example, given XML such as "<title>Good &amp; Bad</title>", this function will return "Good & Bad".
DecodeMode mode | (optional) Mode to use for decoding. (Defaults to LOOSE). |
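The round trip between stored and decoded text can be sketched as follows, assuming a toolchain on which std.xml is still available. The element is built with a plain string, which is encoded on insertion; text() then returns the decoded form.

```d
import std.stdio;
import std.xml;

void main()
{
    // The interior is encoded when the Text item is created
    auto element = new Element("title", "Good & Bad");

    // The serialized form carries the entity
    writeln(element.toString()); // <title>Good &amp; Bad</title>

    // text() decodes the interior again
    writeln(element.text());     // Good & Bad
}
```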
Returns an indented string representation of this item
uint indent | (optional) number of spaces by which to indent this element. Defaults to 2. |
Returns the string representation of an Element
auto element = new Element("br"); writefln(element.toString()); // writes "<br />"
Tag types.
Class representing an XML tag.
Type of tag
Tag name
Associative array of attributes
Constructs an instance of Tag with a specified name and type
The constructor does not initialize the attributes. To initialize the attributes, you access the attr member variable.
string name | the Tag's name |
TagType type | (optional) the Tag's type. If omitted, defaults to TagType.START. |
auto tag = new Tag("img",TagType.EMPTY); tag.attr["src"] = "http://example.com/example.jpg";
Compares two Tags for equality
You should rarely need to call this function. It exists so that Tags can be used as associative array keys.
Tag tag1,tag2;
if (tag1 == tag2) { }
Compares two Tags
Tag tag1,tag2;
if (tag1 < tag2) { }
Returns the hash of a Tag
You should rarely need to call this function. It exists so that Tags can be used as associative array keys.
Returns the string representation of a Tag
auto tag = new Tag("book",TagType.START); writefln(tag.toString()); // writes "<book>"
Returns true if the Tag is a start tag
if (tag.isStart) { }
Returns true if the Tag is an end tag
if (tag.isEnd) { }
Returns true if the Tag is an empty tag
if (tag.isEmpty) { }
Class representing a comment
Construct a comment
string content | the body of the comment |
auto item = new Comment("This is a comment"); // constructs <!--This is a comment-->
Compares two comments for equality
Comment item1,item2;
if (item1 == item2) { }
Compares two comments
You should rarely need to call this function. It exists so that Comments can be used as associative array keys.
Comment item1,item2;
if (item1 < item2) { }
Returns the hash of a Comment
You should rarely need to call this function. It exists so that Comments can be used as associative array keys.
Returns a string representation of this comment
Returns false always
Class representing a Character Data section
Construct a character data section
string content | the body of the character data segment |
auto item = new CData("<b>hello</b>"); // constructs <![CDATA[<b>hello</b>]]>
Compares two CDatas for equality
CData item1,item2;
if (item1 == item2) { }
Compares two CDatas
You should rarely need to call this function. It exists so that CDatas can be used as associative array keys.
CData item1,item2;
if (item1 < item2) { }
Returns the hash of a CData
You should rarely need to call this function. It exists so that CDatas can be used as associative array keys.
Returns a string representation of this CData section
Returns false always
Class representing a text (aka Parsed Character Data) section
Construct a text (aka PCData) section
string content | the text. This function encodes the text before insertion, so it is safe to insert any text |
auto item = new Text("a < b"); // constructs a &lt; b
Compares two text sections for equality
Text item1,item2;
if (item1 == item2) { }
Compares two text sections
You should rarely need to call this function. It exists so that Texts can be used as associative array keys.
Text item1,item2;
if (item1 < item2) { }
Returns the hash of a text section
You should rarely need to call this function. It exists so that Texts can be used as associative array keys.
Returns a string representation of this Text section
Returns true if the content is the empty string
Class representing an XML Instruction section
Construct an XML Instruction section
string content | the body of the instruction segment |
auto item = new XMLInstruction("ATTLIST"); // constructs <!ATTLIST>
Compares two XML instructions for equality
XMLInstruction item1,item2;
if (item1 == item2) { }
Compares two XML instructions
You should rarely need to call this function. It exists so that XMLInstructions can be used as associative array keys.
XMLInstruction item1,item2;
if (item1 < item2) { }
Returns the hash of an XMLInstruction
You should rarely need to call this function. It exists so that XMLInstructions can be used as associative array keys.
Returns a string representation of this XMLInstruction
Returns false always
Class representing a Processing Instruction section
Construct a Processing Instruction section
string content | the body of the instruction segment |
auto item = new ProcessingInstruction("php"); // constructs <?php?>
Compares two processing instructions for equality
ProcessingInstruction item1,item2;
if (item1 == item2) { }
Compares two processing instructions
You should rarely need to call this function. It exists so that ProcessingInstructions can be used as associative array keys.
ProcessingInstruction item1,item2;
if (item1 < item2) { }
Returns the hash of a ProcessingInstruction
You should rarely need to call this function. It exists so that ProcessingInstructions can be used as associative array keys.
Returns a string representation of this ProcessingInstruction
Returns false always
Abstract base class for XML items
Compares with another Item of same type for equality
Compares with another Item of same type
Returns the hash of this item
Returns a string representation of this item
Returns an indented string representation of this item
uint indent | number of spaces by which to indent child elements |
Returns true if the item represents empty XML text
Class for parsing an XML Document.
This is a subclass of ElementParser. Most of the useful functions are documented there.
Constructs a DocumentParser.
The input to this function MUST be valid XML. This is enforced by the function's in contract.
string xmlText_ | the entire XML document as text |
Class for parsing an XML element.
The Tag at the start of the element being parsed. You can read this to determine the tag's name and attributes.
Register a handler which will be called whenever a start tag is encountered which matches the specified name. You can also pass null as the name, in which case the handler will be called for any unmatched start tag.
// Call this function whenever a <podcast> start tag is encountered
onStartTag["podcast"] = (ElementParser xml)
{
    // Your code here
    //
    // This is a closure, so code here may reference
    // variables which are outside of this scope
};

// call myEpisodeStartHandler (defined elsewhere) whenever an <episode>
// start tag is encountered
onStartTag["episode"] = &myEpisodeStartHandler;

// call delegate dg for all other start tags
onStartTag[null] = dg;
Register a handler which will be called whenever an end tag is encountered which matches the specified name. You can also pass null as the name, in which case the handler will be called for any unmatched end tag.
// Call this function whenever a </podcast> end tag is encountered
onEndTag["podcast"] = (in Element e)
{
    // Your code here
    //
    // This is a closure, so code here may reference
    // variables which are outside of this scope
};

// call myEpisodeEndHandler (defined elsewhere) whenever an </episode>
// end tag is encountered
onEndTag["episode"] = &myEpisodeEndHandler;

// call delegate dg for all other end tags
onEndTag[null] = dg;
Register a handler which will be called whenever text is encountered.
// Call this function whenever text is encountered
onText = (string s)
{
    // Your code here

    // The passed parameter s will have been decoded by the time you see
    // it, and so may contain any character.
    //
    // This is a closure, so code here may reference
    // variables which are outside of this scope
};
Register an alternative handler which will be called whenever text is encountered. This differs from onText in that onText will decode the text, whereas onTextRaw will not. This is a trade-off: onText is more accurate but slower, while onTextRaw is faster but less accurate. You can still call decode() within your handler, but you would normally use onTextRaw only when you know that decoding is unnecessary.
// Call this function whenever text is encountered
onTextRaw = (string s)
{
    // Your code here

    // The passed parameter s will NOT have been decoded.
    //
    // This is a closure, so code here may reference
    // variables which are outside of this scope
};
Register a handler which will be called whenever a character data segment is encountered.
// Call this function whenever a CData section is encountered
onCData = (string s)
{
    // Your code here

    // The passed parameter s does not include the opening <![CDATA[
    // nor closing ]]>
    //
    // This is a closure, so code here may reference
    // variables which are outside of this scope
};
Register a handler which will be called whenever a comment is encountered.
// Call this function whenever a comment is encountered
onComment = (string s)
{
    // Your code here

    // The passed parameter s does not include the opening <!-- nor
    // closing -->
    //
    // This is a closure, so code here may reference
    // variables which are outside of this scope
};
Register a handler which will be called whenever a processing instruction is encountered.
// Call this function whenever a processing instruction is encountered
onPI = (string s)
{
    // Your code here

    // The passed parameter s does not include the opening <? nor
    // closing ?>
    //
    // This is a closure, so code here may reference
    // variables which are outside of this scope
};
Register a handler which will be called whenever an XML instruction is encountered.
// Call this function whenever an XML instruction is encountered
// (Note: XML instructions may only occur preceding the root tag of a
// document).
onXI = (string s)
{
    // Your code here

    // The passed parameter s does not include the opening <! nor
    // closing >
    //
    // This is a closure, so code here may reference
    // variables which are outside of this scope
};
Parse an XML element.
Parsing will continue until the end of the current element. Any items encountered for which a handler has been registered will invoke that handler.
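To make the handler-driven flow concrete, here is a minimal sketch that registers a single end-tag handler and then calls parse(). It assumes a toolchain on which std.xml is still available; episodeTitles and the sample document are hypothetical.

```d
import std.stdio;
import std.xml;

/// Hypothetical helper: collects the text of every <episode> element.
string[] episodeTitles(string xmlText)
{
    // DocumentParser requires well-formed input, so check first
    check(xmlText);

    string[] titles;
    auto xml = new DocumentParser(xmlText);

    // Invoked each time an </episode> end tag is reached
    xml.onEndTag["episode"] = (in Element e) { titles ~= e.text(); };

    // Runs until the end of the root element, firing handlers on the way
    xml.parse();
    return titles;
}

void main()
{
    string s = "<?xml version=\"1.0\"?>"
        ~ "<podcast><episode>One</episode><episode>Two</episode></podcast>";
    writeln(episodeTitles(s)); // ["One", "Two"]
}
```

Items without a registered handler, such as the podcast element itself here, are simply skipped over.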
Returns that part of the element which has already been parsed
Check an entire XML document for well-formedness
string s | the document to be checked, passed as a string |
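Since check() reports failure by throwing, a boolean wrapper is sometimes convenient. The sketch below assumes a toolchain on which std.xml is still available and that a malformed document surfaces as a CheckException; isWellFormed is a hypothetical helper.

```d
import std.stdio;
import std.xml;

/// Hypothetical helper: true if s passes the well-formedness check.
bool isWellFormed(string s)
{
    try
    {
        check(s);
        return true;
    }
    catch (CheckException e)
    {
        return false;
    }
}

void main()
{
    writeln(isWellFormed("<a><b>text</b></a>")); // true
    writeln(isWellFormed("<a><b>text</a>"));     // false: <b> is never closed
}
```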
The base class for exceptions thrown by this module
Thrown during Comment constructor
Thrown during CData constructor
Thrown during XMLInstruction constructor
Thrown during ProcessingInstruction constructor
Thrown during Text constructor
Thrown during decode()
Thrown if comparing with wrong type
Thrown when parsing for Tags