DOM Parser Introduction
The Document Object Model (DOM) API approach to XML parsing is memory intensive compared to the SAX parser. Refer to SAX Parser for an example implementation of a SAX parser.
If the XML content is large, the SAX parser approach is recommended. In the DOM parsing approach we load the entire contents of an XML file into an in-memory tree structure and then iterate through the tree to read the content. The DOM parser is typically advantageous when we need to modify the XML document, because the whole tree is available for editing.
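To illustrate why DOM suits modification, here is a minimal sketch that parses a small XML string, changes every PRICE value in the in-memory tree, and serializes the tree back to text. The class name DomModifySketch, the method updatePrice and the inline sample XML are hypothetical names chosen for this illustration; only standard JDK classes are used.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical example class, not part of the original program
public class DomModifySketch {

    // Parses the XML text, sets every PRICE element to newPrice,
    // and returns the modified document as a string
    public static String updatePrice(String xml, String newPrice) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));

            // The whole tree is in memory, so nodes can be edited in place
            NodeList prices = doc.getElementsByTagName("PRICE");
            for (int i = 0; i < prices.getLength(); i++) {
                prices.item(i).setTextContent(newPrice);
            }

            // Serialize the modified tree back to text
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not update the XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<CATALOG><CD id=\"1\"><TITLE>Empire Burlesque</TITLE>"
                + "<PRICE>10.90</PRICE></CD></CATALOG>";
        System.out.println(updatePrice(xml, "8.50"));
    }
}
```

A SAX parser only reports events as it streams through the input, so an in-place edit like this would need extra bookkeeping there. With DOM the cost is the memory needed to hold the full tree.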
A sample implementation of a DOM parser is listed below. Here we read the XML file and create a Document object in memory. Then we iterate through the tree and extract the required elements and attributes. It is typical practice to use a POJO to store the contents for application use.
Simple Java program implementation of DOM parser
This is the input XML file we are interested in parsing. We need the attribute id and the elements TITLE and ARTIST.

    <CATALOG>
      <CD id="1">
        <TITLE>Empire Burlesque</TITLE>
        <ARTIST>Bob Dylan</ARTIST>
        <COUNTRY>USA</COUNTRY>
        <COMPANY>Columbia</COMPANY>
        <PRICE>10.90</PRICE>
        <YEAR>1985</YEAR>
      </CD>
      <CD id="2">
        <TITLE>Hide your heart</TITLE>
        <ARTIST>Bonnie Tyler</ARTIST>
        <COUNTRY>UK</COUNTRY>
        <COMPANY>CBS Records</COMPANY>
        <PRICE>9.90</PRICE>
        <YEAR>1988</YEAR>
      </CD>
      <CD id="3">
        <TITLE>Greatest Hits</TITLE>
        <ARTIST>Dolly Parton</ARTIST>
        <COUNTRY>USA</COUNTRY>
        <COMPANY>RCA</COMPANY>
        <PRICE>9.90</PRICE>
        <YEAR>1982</YEAR>
      </CD>
      <CD id="4">
        <TITLE>Still got the blues</TITLE>
        <ARTIST>Gary Moore</ARTIST>
        <COUNTRY>UK</COUNTRY>
        <COMPANY>Virgin records</COMPANY>
        <PRICE>10.20</PRICE>
        <YEAR>1990</YEAR>
      </CD>
    </CATALOG>

We create a POJO that holds the required data for application use.
    package com.sourcetricks.MyDomParser;

    // Simple POJO holding the fields we extract from each CD element
    public class CD {

        private String id;
        private String title;
        private String artist;

        public String getId() {
            return id;
        }

        public void setId(String id) {
            this.id = id;
        }

        public String getTitle() {
            return title;
        }

        public void setTitle(String title) {
            this.title = title;
        }

        public String getArtist() {
            return artist;
        }

        public void setArtist(String artist) {
            this.artist = artist;
        }

        public void print() {
            System.out.println("ID = " + id);
            System.out.println("Title = " + title);
            System.out.println("Artist = " + artist);
        }
    }

This is the main application program. It reads the XML file, parses it, and iterates through the tree to read the required content. Finally we print the POJOs.
    package com.sourcetricks.MyDomParser;

    import java.io.File;
    import java.io.FileInputStream;
    import java.io.InputStream;
    import java.util.ArrayList;
    import java.util.List;

    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;

    import org.w3c.dom.Document;
    import org.w3c.dom.Element;
    import org.w3c.dom.Node;
    import org.w3c.dom.NodeList;

    public class MyDomParser {

        public static void main(String[] args) {
            List<CD> cdList = new ArrayList<CD>();

            // Read the XML file; try-with-resources closes the stream for us
            File inputFile = new File("resources/input-data.xml");
            try (InputStream inputStream = new FileInputStream(inputFile)) {
                // Setup the parser
                DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
                DocumentBuilder builder = builderFactory.newDocumentBuilder();

                // Parse the XML file into an in-memory tree
                Document doc = builder.parse(inputStream);

                // Get all CD elements
                NodeList cdElements = doc.getElementsByTagName("CD");
                for (int i = 0; i < cdElements.getLength(); i++) {
                    Node currentNode = cdElements.item(i);

                    // Seen the CD tag
                    if (currentNode instanceof Element) {
                        // Store in a POJO
                        CD cd = new CD();

                        // Read attribute of CD element
                        cd.setId(((Element) currentNode).getAttribute("id"));

                        // Child elements under CD
                        NodeList childNodes = currentNode.getChildNodes();
                        for (int j = 0; j < childNodes.getLength(); j++) {
                            Node childNode = childNodes.item(j);

                            // Skip non-element children such as whitespace text nodes
                            if (childNode instanceof Element) {
                                if (childNode.getNodeName().equalsIgnoreCase("title")) {
                                    cd.setTitle(childNode.getTextContent());
                                } else if (childNode.getNodeName().equalsIgnoreCase("artist")) {
                                    cd.setArtist(childNode.getTextContent());
                                }
                                // Include other elements as needed
                            }
                        }
                        cdList.add(cd);
                    }
                }
            } catch (Exception e) {
                e.printStackTrace();
            }

            // Print contents of CD list
            for (CD c : cdList) {
                c.print();
            }
        }
    }

This is the output.
    ID = 1
    Title = Empire Burlesque
    Artist = Bob Dylan
    ID = 2
    Title = Hide your heart
    Artist = Bonnie Tyler
    ID = 3
    Title = Greatest Hits
    Artist = Dolly Parton
    ID = 4
    Title = Still got the blues
    Artist = Gary Moore
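The parser program checks each child node with instanceof Element before reading it. The sketch below suggests why: with a default, non-validating DocumentBuilder, the whitespace between tags shows up as text nodes in the child list. The class name ChildNodeSketch, the method countChildren and the inline sample XML are hypothetical and serve only this illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical example class, not part of the original program
public class ChildNodeSketch {

    // Returns { total child nodes, child nodes that are elements }
    // for the root element of the given XML text
    public static int[] countChildren(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));

            NodeList children = doc.getDocumentElement().getChildNodes();
            int elementCount = 0;
            for (int i = 0; i < children.getLength(); i++) {
                if (children.item(i) instanceof Element) {
                    elementCount++;
                }
            }
            return new int[] { children.getLength(), elementCount };
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the XML", e);
        }
    }

    public static void main(String[] args) {
        // Line breaks and indentation between tags become text nodes
        String xml = "<CD id=\"1\">\n"
                + "  <TITLE>Empire Burlesque</TITLE>\n"
                + "  <ARTIST>Bob Dylan</ARTIST>\n"
                + "</CD>";
        int[] counts = countChildren(xml);
        System.out.println("All child nodes = " + counts[0]);
        System.out.println("Element children = " + counts[1]);
    }
}
```

For this input the root has five child nodes but only two of them are elements; the other three are whitespace text nodes. Without the instanceof check, the loop would also compare the node names of those text nodes against "title" and "artist".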