XML parsing using SaxParser with complete code

A SAX parser uses callback methods (defined in org.xml.sax.helpers.DefaultHandler) to inform clients about the structure of an XML document. To parse XML, extend DefaultHandler and override a few of its methods.
The methods to override are:
  • startDocument() and endDocument() – called at the start and end of the XML document.
  • startElement() and endElement() – called at the start and end of each document element.
  • characters() – called with the text content between the start and end tags of an element. The parser may deliver the text of a single element in more than one call.
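
To see the order in which these callbacks fire, here is a minimal sketch. The class name CallbackOrderDemo and the helper trace() are made up for illustration and are not part of the example that follows; the handler only records each callback in a list.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: records every SAX callback in the order it fires.
class CallbackOrderDemo extends DefaultHandler {
    final List<String> events = new ArrayList<>();

    @Override
    public void startDocument() { events.add("startDocument"); }

    @Override
    public void endDocument() { events.add("endDocument"); }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        events.add("start:" + qName);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        events.add("end:" + qName);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Skip whitespace-only chunks (indentation between tags)
        String text = new String(ch, start, length).trim();
        if (!text.isEmpty()) {
            events.add("text:" + text);
        }
    }

    // Parses an XML string and returns the recorded callbacks
    static List<String> trace(String xml) {
        CallbackOrderDemo handler = new CallbackOrderDemo();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
        return handler.events;
    }

    public static void main(String[] args) {
        for (String event : trace("<book id=\"001\"><title>Operating Systems</title></book>")) {
            System.out.println(event);
        }
    }
}
```

The events arrive in document order: startDocument first, then start:book, start:title, the title text, end:title, end:book and finally endDocument.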

The following example demonstrates the use of DefaultHandler to parse an XML document. It maps the XML to a model class and generates a list of objects.


Sample XML Document :

<?xml version="1.0" encoding="UTF-8"?>
<catalog>
    <book id="001" lang="ENG">
        <isbn>23-34-42-3</isbn>
        <regDate>1990-05-24</regDate>
        <title>Operating Systems</title>
        <publisher country="USA">Pearson</publisher>
        <price>400</price>
        <authors>
            <author>Ganesh Tiwari</author>
        </authors>
    </book>
    <book id="002">
        <isbn>24-300-042-3</isbn>
        <regDate>1995-05-12</regDate>
        <title>Distributed Systems</title>
        <publisher country="Nepal">Ekata</publisher>
        <price>500</price>
        <authors>
            <author>Mahesh Poudel</author>
            <author>Bikram Adhikari</author>
            <author>Ramesh Poudel</author>
        </authors>
    </book>
</catalog>



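Model class for mapping the XML to Book objects
import java.util.ArrayList;
import java.util.Date;
import java.util.List;

/**
 * Book class stores book information after the XML is parsed.
 * @author Ganesh Tiwari
 */
public class Book {
    String lang;
    String title;
    String id;
    String isbn;
    Date regDate;
    String publisher;
    int price;
    List<String> authors;

    public Book() {
        authors = new ArrayList<String>();
    }

    // getters and setters for each field go here

    // toString() produces the format shown in the parsing output
    @Override
    public String toString() {
        return "Book [lang=" + lang + ", title=" + title + ", id=" + id + ", isbn=" + isbn
                + ", regDate=" + regDate + ", publisher=" + publisher + ", price=" + price
                + ", authors=" + authors + "]";
    }
}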

Java Code for XML Parsing (Sax) :

import java.io.IOException;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;
public class MySaxParser extends DefaultHandler {
    List<Book> bookL;
    String bookXmlFileName;
    String tmpValue;
    Book bookTmp;
    // four-digit year pattern, matching regDate values such as 1990-05-24
    SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd");
    public MySaxParser(String bookXmlFileName) {
        this.bookXmlFileName = bookXmlFileName;
        bookL = new ArrayList<Book>();
        parseDocument();
        printDatas();
    }
    private void parseDocument() {
        // parse
        SAXParserFactory factory = SAXParserFactory.newInstance();
        try {
            SAXParser parser = factory.newSAXParser();
            parser.parse(bookXmlFileName, this);
        } catch (ParserConfigurationException e) {
            System.out.println("ParserConfig error : " + e.getMessage());
        } catch (SAXException e) {
            System.out.println("SAXException : xml not well formed : " + e.getMessage());
        } catch (IOException e) {
            // also raised when the xml file cannot be found
            System.out.println("IO error : " + e.getMessage());
        }
    }
    private void printDatas() {
       // System.out.println(bookL.size());
        for (Book tmpB : bookL) {
            System.out.println(tmpB.toString());
        }
    }
    @Override
    public void startElement(String uri, String localName, String elementName, Attributes attributes) throws SAXException {
        // clear tmpValue at the start of every element
        tmpValue = "";
        // if current element is book, create a new book
        if (elementName.equalsIgnoreCase("book")) {
            bookTmp = new Book();
            bookTmp.setId(attributes.getValue("id"));
            bookTmp.setLang(attributes.getValue("lang"));
        }
        // if current element is publisher, store its country attribute
        if (elementName.equalsIgnoreCase("publisher")) {
            bookTmp.setPublisher(attributes.getValue("country"));
        }
    }
    @Override
    public void endElement(String uri, String localName, String element) throws SAXException {
        // if end of book element, add the book to the list
        if (element.equalsIgnoreCase("book")) {
            bookL.add(bookTmp);
        }
        if (element.equalsIgnoreCase("isbn")) {
            bookTmp.setIsbn(tmpValue);
        }
        if (element.equalsIgnoreCase("title")) {
            bookTmp.setTitle(tmpValue);
        }
        if (element.equalsIgnoreCase("author")) {
            bookTmp.getAuthors().add(tmpValue);
        }
        if (element.equalsIgnoreCase("price")) {
            bookTmp.setPrice(Integer.parseInt(tmpValue.trim()));
        }
        if (element.equalsIgnoreCase("regDate")) {
            try {
                bookTmp.setRegDate(sdf.parse(tmpValue.trim()));
            } catch (ParseException e) {
                System.out.println("date parsing error");
            }
        }
    }
    @Override
    public void characters(char[] ac, int i, int j) throws SAXException {
        // the parser may deliver one element's text in several chunks, so append
        tmpValue += new String(ac, i, j);
    }
    public static void main(String[] args) {
        new MySaxParser("catalog.xml");
    }
}
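
The handler above stores the country attribute in the publisher field, so the element text (for example Pearson) is not kept. If the text is wanted as well, one possible approach is sketched below. The class PublisherDemo and the fields publisherName and publisherCountry are hypothetical names that do not exist in the Book class above; a StringBuilder collects the text because characters() can be called more than once per element.

```java
import java.io.StringReader;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: keeps both the text and the country attribute of <publisher>.
class PublisherDemo extends DefaultHandler {
    private final StringBuilder text = new StringBuilder();
    String publisherName;     // element text, e.g. Pearson (assumed field)
    String publisherCountry;  // country attribute, e.g. USA (assumed field)

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        text.setLength(0); // start collecting text for the new element
        if ("publisher".equals(qName)) {
            publisherCountry = attributes.getValue("country");
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length); // append, since text may arrive in chunks
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("publisher".equals(qName)) {
            publisherName = text.toString().trim();
        }
    }

    // Parses an XML string and returns the handler holding the captured values
    static PublisherDemo parse(String xml) {
        PublisherDemo handler = new PublisherDemo();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
        return handler;
    }

    public static void main(String[] args) {
        PublisherDemo d = parse("<publisher country=\"USA\">Pearson</publisher>");
        System.out.println(d.publisherName + " (" + d.publisherCountry + ")"); // prints Pearson (USA)
    }
}
```

In MySaxParser the same idea would mean adding a second field with its setter to Book and calling that setter from endElement() when the publisher element closes.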


Output of Parsing :

Book [lang=ENG, title=Operating Systems, id=001, isbn=23-34-42-3, regDate=Thu May 24 00:00:00 NPT 1990, publisher=USA, price=400, authors=[Ganesh Tiwari]]
Book [lang=null, title=Distributed Systems, id=002, isbn=24-300-042-3, regDate=Fri May 12 00:00:00 NPT 1995, publisher=Nepal, price=500, authors=[Mahesh Poudel, Bikram Adhikari, Ramesh Poudel]]
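
The regDate values print with a time and the NPT zone because java.util.Date formats itself in the JVM's default time zone, so the same program shows a different zone on another machine. If only the calendar date matters, one alternative (not used in the code above, and requiring Java 8 or later) is java.time.LocalDate. The class RegDateDemo and the method parseRegDate() are hypothetical names for this sketch.

```java
import java.time.LocalDate;

// Hypothetical demo class: parses regDate text without any time-of-day or zone.
class RegDateDemo {
    // LocalDate.parse expects the ISO format yyyy-MM-dd, which the sample XML uses
    static LocalDate parseRegDate(String text) {
        return LocalDate.parse(text.trim());
    }

    public static void main(String[] args) {
        System.out.println(parseRegDate("1990-05-24")); // prints 1990-05-24
    }
}
```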


6 comments:

  1. Thank you very much for this complete tutorial with code. Helped me a lot.

  2. I am not getting any output. I am using the Eclipse IDE; where should the XML file be placed for the program to run?

    1. According to code -
      --> new MySaxParser("catalog.xml");

      So, you can simply save this file in the project directory.

  3. No output; it crashed when executed. I really need a good and complete tutorial on parsing an XML or plist file in an Android app.
    Waiting for a response.

  4. Thank you for such a complete example that uses a realistic XML file for input!!

  5. I also want to store the value "Pearson" in publisher. How do I get the value Pearson?
