<!ELEMENT Bookstore (Book | Magazine)*>
<!ELEMENT Book (Title, Authors, Remark?)>
<!ATTLIST Book ISBN CDATA #REQUIRED
               Price CDATA #REQUIRED
               Edition CDATA #IMPLIED>
<!ELEMENT Magazine (Title)>
<!ATTLIST Magazine Month CDATA #REQUIRED
                   Year CDATA #REQUIRED>
<!ELEMENT Title (#PCDATA)>
<!ELEMENT Authors (Author+)>
<!ELEMENT Remark (#PCDATA)>
<!ELEMENT Author (First_Name, Last_Name)>
<!ELEMENT First_Name (#PCDATA)>
<!ELEMENT Last_Name (#PCDATA)>

<?xml version="1.0" standalone="no"?>
<!DOCTYPE Bookstore SYSTEM "bookstore.dtd">
<Bookstore>
   <Book ISBN="ISBN-0-13-035300-0" Price="$65" Edition="2nd">
      <Title>A First Course in Database Systems</Title>
      <Authors>
         <Author>
            <First_Name>Jeffrey</First_Name>
            <Last_Name>Ullman</Last_Name>
         </Author>
         <Author>
            <First_Name>Jennifer</First_Name>
            <Last_Name>Widom</Last_Name>
         </Author>
      </Authors>
   </Book>
   <Book ISBN="ISBN-0-13-031995-3" Price="$75">
      <Title>Database Systems: The Complete Book</Title>
      <Authors>
         <Author>
            <First_Name>Hector</First_Name>
            <Last_Name>Garcia-Molina</Last_Name>
         </Author>
         <Author>
            <First_Name>Jeffrey</First_Name>
            <Last_Name>Ullman</Last_Name>
         </Author>
         <Author>
            <First_Name>Jennifer</First_Name>
            <Last_Name>Widom</Last_Name>
         </Author>
      </Authors>
      <Remark>
         Amazon.com says: Buy this book bundled with "A First Course," it's a great deal!
      </Remark>
   </Book>
</Bookstore>
XPath specifies path expressions that match XML data by navigating down (and occasionally up or across) the tree.
Basic constructs (very incomplete list):
| Construct | Meaning |
|---|---|
| / | document root (at the start of a path), or separator between steps in a path |
| * | matches any one element name |
| @X | matches attribute X of the current element |
| // | matches any descendant of the current element (shorthand for /descendant-or-self::node()/) |
| [C] | evaluates condition C on the current element |
| [N] | picks the Nth matching element (counting from 1) |
| contains(s1,s2) | returns TRUE if string s1 contains string s2 |
| name() | returns the tag name of the current element |
| parent:: | matches the parent of the current element |
| following-sibling:: | matches all later siblings of the current element |
| descendant:: | matches any descendant of the current element |
| self:: | matches the current element |
(Example: all book titles)
(Example: all book or magazine titles)
(Example: all ISBN numbers)
(Example: all books costing < $70)
(Example: all ISBN numbers of books costing < $70)
(Example: all books containing a remark)
(Example: all titles of books costing < $70 where "Ullman" is an author)
(Example: same query using //)
(Example: all second authors anywhere)
(Example: all author last names anywhere)
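The examples so far (book titles through author last names) could be written as XPath 1.0 expressions roughly as follows. This is a sketch rather than the expressions used in the lecture: the class `BookstorePaths`, the `select` helper, and the constant names are invented for illustration, and the queries are run through the JDK's `javax.xml.xpath` API. One assumption matters: because the sample Price values carry a leading `$`, a plain `@Price < 70` would compare against NaN and match nothing, so the sketch strips the `$` with `substring-after` first.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class BookstorePaths {
    // The two books from the sample document, without the DOCTYPE line so that
    // the parser does not go looking for bookstore.dtd.
    static final String XML =
        "<Bookstore>"
      + "<Book ISBN='ISBN-0-13-035300-0' Price='$65' Edition='2nd'>"
      + "<Title>A First Course in Database Systems</Title><Authors>"
      + "<Author><First_Name>Jeffrey</First_Name><Last_Name>Ullman</Last_Name></Author>"
      + "<Author><First_Name>Jennifer</First_Name><Last_Name>Widom</Last_Name></Author>"
      + "</Authors></Book>"
      + "<Book ISBN='ISBN-0-13-031995-3' Price='$75'>"
      + "<Title>Database Systems: The Complete Book</Title><Authors>"
      + "<Author><First_Name>Hector</First_Name><Last_Name>Garcia-Molina</Last_Name></Author>"
      + "<Author><First_Name>Jeffrey</First_Name><Last_Name>Ullman</Last_Name></Author>"
      + "<Author><First_Name>Jennifer</First_Name><Last_Name>Widom</Last_Name></Author>"
      + "</Authors>"
      + "<Remark>Amazon.com says: Buy this book bundled with \"A First Course,\" it's a great deal!</Remark>"
      + "</Book></Bookstore>";

    // Condition "costs less than $70": strip the '$' so the rest converts to a number.
    static final String CHEAP = "substring-after(@Price, '$') < 70";

    static final String BOOK_TITLES    = "/Bookstore/Book/Title";
    static final String ALL_TITLES     = "/Bookstore/Book/Title | /Bookstore/Magazine/Title";
    static final String ISBNS          = "/Bookstore/Book/@ISBN";
    static final String CHEAP_BOOKS    = "/Bookstore/Book[" + CHEAP + "]";
    static final String CHEAP_ISBNS    = CHEAP_BOOKS + "/@ISBN";
    static final String WITH_REMARK    = "/Bookstore/Book[Remark]";
    static final String CHEAP_ULLMAN   =
        "/Bookstore/Book[" + CHEAP + " and Authors/Author/Last_Name = 'Ullman']/Title";
    static final String CHEAP_ULLMAN_2 =
        "//Book[" + CHEAP + " and .//Last_Name = 'Ullman']/Title";
    static final String SECOND_AUTHORS = "//Authors/Author[2]";
    static final String LAST_NAMES     = "//Last_Name";

    /** Evaluates an XPath 1.0 expression and returns each matching node's string value. */
    static List<String> select(String expr) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expr, doc, XPathConstants.NODESET);
            List<String> out = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                out.add(nodes.item(i).getTextContent());
            }
            return out;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(select(BOOK_TITLES));
        System.out.println(select(ALL_TITLES));
        System.out.println(select(ISBNS));
        // Queries that return whole Book or Author elements are shown by one child each.
        System.out.println(select(CHEAP_BOOKS + "/Title"));
        System.out.println(select(CHEAP_ISBNS));
        System.out.println(select(WITH_REMARK + "/Title"));
        System.out.println(select(CHEAP_ULLMAN));
        System.out.println(select(CHEAP_ULLMAN_2));
        System.out.println(select(SECOND_AUTHORS + "/Last_Name"));
        System.out.println(select(LAST_NAMES));
    }
}
```

Since the sample document contains no Magazine elements, the "book or magazine titles" query returns only the two book titles here. `/Bookstore/*/Title` would be an equivalent formulation under this DTD.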
(Example: all books whose title contains one of its author's last names)
(Example: all magazines where there is a book of the same title)
(Example: all books where there is a different book of the same title)
(Example: all elements whose parent tag is not "Book")
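The last four examples need conditions that relate one part of the tree to another. The sketch below shows one possible XPath 1.0 formulation of each. Because the sample document has no magazines and no duplicate titles, it runs against a small invented bookstore (ISBNs `b1` to `b3`, made-up titles and names) that conforms to the DTD. The class `BookstoreJoins` and its helpers are hypothetical names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.namespace.QName;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class BookstoreJoins {
    // Invented test data: b1's title contains its SECOND author's last name,
    // b2 and b3 share a title, and one magazine shares that title too.
    static final String XML =
        "<Bookstore>"
      + book("b1", "The Smith Reader", "Jones", "Smith")
      + book("b2", "Databases", "Lee")
      + book("b3", "Databases", "Kim")
      + "<Magazine Month='January' Year='2009'><Title>Databases</Title></Magazine>"
      + "<Magazine Month='February' Year='2009'><Title>Weekly News</Title></Magazine>"
      + "</Bookstore>";

    /** Builds one Book element with the given title and author last names. */
    static String book(String isbn, String title, String... lastNames) {
        StringBuilder sb = new StringBuilder(
            "<Book ISBN='" + isbn + "' Price='$10'><Title>" + title + "</Title><Authors>");
        for (String ln : lastNames) {
            sb.append("<Author><First_Name>X</First_Name><Last_Name>")
              .append(ln).append("</Last_Name></Author>");
        }
        return sb.append("</Authors></Book>").toString();
    }

    // Tests every Last_Name against the Title three levels up (Author, Authors, Book).
    static final String TITLE_HAS_AUTHOR =
        "//Book[Authors/Author/Last_Name[contains(../../../Title, .)]]";
    // Tempting but incomplete: contains() only uses the FIRST Last_Name in the node-set.
    static final String FIRST_AUTHOR_ONLY =
        "//Book[contains(Title, Authors/Author/Last_Name)]";
    // '=' between two node-sets is true if any pair of nodes has equal string values.
    static final String MAGAZINE_WITH_BOOK =
        "/Bookstore/Magazine[Title = /Bookstore/Book/Title]";
    // "A different book" is expressed as a sibling Book before or after this one.
    static final String DUPLICATE_TITLE =
        "/Bookstore/Book[Title = following-sibling::Book/Title"
      + " or Title = preceding-sibling::Book/Title]";
    static final String PARENT_NOT_BOOK = "//*[name(parent::*) != 'Book']";

    static Object eval(String expr, QName type) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            return XPathFactory.newInstance().newXPath().evaluate(expr, doc, type);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    /** Returns the string value of each node matched by the expression. */
    static List<String> select(String expr) {
        NodeList nodes = (NodeList) eval(expr, XPathConstants.NODESET);
        List<String> out = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            out.add(nodes.item(i).getTextContent());
        }
        return out;
    }

    /** Returns how many nodes the expression matches. */
    static int count(String expr) {
        return ((Double) eval("count(" + expr + ")", XPathConstants.NUMBER)).intValue();
    }

    public static void main(String[] args) {
        System.out.println(select(TITLE_HAS_AUTHOR + "/@ISBN"));
        System.out.println(select(FIRST_AUTHOR_ONLY + "/@ISBN"));
        System.out.println(select(MAGAZINE_WITH_BOOK + "/@Month"));
        System.out.println(select(DUPLICATE_TITLE + "/@ISBN"));
        System.out.println(count(PARENT_NOT_BOOK));
    }
}
```

Note that the last query also matches the Bookstore element itself: its parent is the document root rather than an element, so `name(parent::*)` yields the empty string.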
For the next two examples, modify the DTD so that Book allows Remark* (any number of remarks) instead of Remark? (at most one).
(Example: all books where a Remark includes "great")
(Example: all books where all Remarks include "great")
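XPath 1.0 has no explicit "for all" construct, so the second of these two examples is usually phrased as a double negation: no Remark fails the test. The sketch below illustrates both queries on invented data (books `r1` to `r3` with made-up remarks). The class `RemarkQuantifiers` and its helpers are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class RemarkQuantifiers {
    // Invented data under the modified DTD (Remark*):
    // r1: every remark mentions "great"; r2: only the second one does; r3: no remarks.
    static final String XML =
        "<Bookstore>"
      + book("r1", "a great deal", "great value")
      + book("r2", "too long", "a great read")
      + book("r3")
      + "</Bookstore>";

    /** Builds one Book element with placeholder title and author plus the given remarks. */
    static String book(String isbn, String... remarks) {
        StringBuilder sb = new StringBuilder(
            "<Book ISBN='" + isbn + "' Price='$10'><Title>T</Title><Authors><Author>"
          + "<First_Name>F</First_Name><Last_Name>L</Last_Name></Author></Authors>");
        for (String r : remarks) {
            sb.append("<Remark>").append(r).append("</Remark>");
        }
        return sb.append("</Book>").toString();
    }

    // Some Remark includes "great".
    static final String SOME = "//Book[Remark[contains(., 'great')]]";
    // All Remarks include "great": there is no Remark that lacks it.
    // This is vacuously true for a book without any Remark.
    static final String ALL = "//Book[not(Remark[not(contains(., 'great'))])]";
    // Same, but additionally requiring at least one Remark.
    static final String ALL_NONEMPTY = "//Book[Remark and not(Remark[not(contains(., 'great'))])]";
    // Pitfall: contains() converts the node-set to the string value of its FIRST node only.
    static final String FIRST_ONLY = "//Book[contains(Remark, 'great')]";

    /** Returns the ISBN of every Book matched by the expression. */
    static List<String> isbns(String bookExpr) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(bookExpr + "/@ISBN", doc, XPathConstants.NODESET);
            List<String> out = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                out.add(nodes.item(i).getNodeValue());
            }
            return out;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("some:         " + isbns(SOME));
        System.out.println("all:          " + isbns(ALL));
        System.out.println("all, nonempty " + isbns(ALL_NONEMPTY));
        System.out.println("first only:   " + isbns(FIRST_ONLY));
    }
}
```

Whether a book with no remarks should count as satisfying "all Remarks include great" is a modeling decision; the two variants above cover both readings.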
...
Document d = parser.getDocument();
int numWords = countWordsInNode(d);
...

// Recursively counts the words in all text nodes at or below the given node.
// countWordsInString(String) is not shown here.
public static int countWordsInNode(Node node) {
    int numWords = 0;

    // Add up the counts for every child subtree.
    if (node.hasChildNodes()) {
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            numWords += countWordsInNode(children.item(i));
        }
    }

    // A text node contributes the words in its own value.
    int type = node.getNodeType();
    if (type == Node.TEXT_NODE) {
        String s = node.getNodeValue();
        numWords += countWordsInString(s);
    }

    return numWords;
}
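To try the traversal end to end, the fragment can be wrapped in a small self-contained program. The sketch below makes two assumptions that are not in the original: it obtains the Document from the JDK's `DocumentBuilder` instead of the elided `parser`, and it supplies a hypothetical `countWordsInString` that simply counts whitespace-separated tokens. The class name `WordCount` and the `countWordsInXml` wrapper are likewise invented.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class WordCount {
    /** Hypothetical helper: counts whitespace-separated tokens in a string. */
    static int countWordsInString(String s) {
        String trimmed = s.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    /** Same recursion as in the fragment above: children first, then the node itself. */
    static int countWordsInNode(Node node) {
        int numWords = 0;
        if (node.hasChildNodes()) {
            NodeList children = node.getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                numWords += countWordsInNode(children.item(i));
            }
        }
        if (node.getNodeType() == Node.TEXT_NODE) {
            numWords += countWordsInString(node.getNodeValue());
        }
        return numWords;
    }

    /** Parses an XML string with the JDK's DOM parser and counts the words in its text. */
    static int countWordsInXml(String xml) {
        try {
            Document d = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return countWordsInNode(d);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<Book ISBN='ISBN-0-13-035300-0'>"
                   + "<Title>A First Course in Database Systems</Title>"
                   + "<Remark>a great deal</Remark></Book>";
        System.out.println(countWordsInXml(xml));
    }
}
```

Attributes are not child nodes in the DOM, so attribute values such as the ISBN are not counted by this traversal; only element text is.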