Package org.dom4j.io

Class SAXReader


  • public class SAXReader
    extends java.lang.Object
    SAXReader creates a DOM4J tree from SAX parsing events.

    The actual SAX parser that is used by this class is configurable so you can use your favourite SAX parser if you wish. DOM4J comes configured with its own SAX parser so you do not need to worry about configuring the SAX parser.

    To configure the SAX parser explicitly from Java code, use one of the constructors or the setXMLReader(XMLReader) or setXMLReaderClassName(String) methods.

    If the parser is not specified explicitly then the standard SAX policy of using the org.xml.sax.driver system property is used to determine the implementation class of XMLReader.

    If the org.xml.sax.driver system property is not defined then JAXP is used via reflection (so that DOM4J is not explicitly dependent on the JAXP classes) to load the JAXP configured SAXParser. If there is any error creating a JAXP SAXParser an informational message is output and then the default (Aelfred) SAX parser is used instead.
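    The lookup order described above can be sketched with the JDK alone. This is an illustration of the policy, not dom4j's own code: the class and method names are hypothetical, the reflection-based JAXP call is replaced by a direct one, and the final fallback to the default parser is omitted.

```java
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;

// Hypothetical helper: mirrors the documented lookup order only.
class ReaderLookupSketch {

    /** Resolves an XMLReader: system property first, then JAXP. */
    static XMLReader resolveXmlReader() throws SAXException {
        // 1. Standard SAX policy: the org.xml.sax.driver system property.
        String driver = System.getProperty("org.xml.sax.driver");
        if (driver != null && !driver.isEmpty()) {
            try {
                Object instance = Class.forName(driver).getDeclaredConstructor().newInstance();
                return (XMLReader) instance;
            } catch (ReflectiveOperationException | ClassCastException e) {
                throw new SAXException("Cannot load XMLReader " + driver, e);
            }
        }
        // 2. Otherwise ask JAXP for its configured parser.
        try {
            return SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        } catch (ParserConfigurationException e) {
            throw new SAXException("JAXP could not create a SAXParser", e);
        }
    }
}
```

    A reader obtained this way could also be handed to the SAXReader(XMLReader) constructor, which bypasses the lookup entirely.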

    If you are trying to use JAXP to explicitly set your SAX parser and are experiencing problems, you can turn on verbose error reporting by setting the system property org.dom4j.verbose to "true". DOM4J then outputs a more detailed description of why JAXP could not find a SAX parser.

    For more information on JAXP, please go to Sun's Java & XML site.

    • Constructor Summary

      Constructors 
      Constructor Description
      SAXReader()
      This constructor internally calls SAXParserFactory.newInstance().newSAXParser().getXMLReader() or XMLReaderFactory.createXMLReader().
      SAXReader​(boolean validating)
      This constructor internally calls SAXParserFactory.newInstance().newSAXParser().getXMLReader() or XMLReaderFactory.createXMLReader().
      SAXReader​(java.lang.String xmlReaderClassName)  
      SAXReader​(java.lang.String xmlReaderClassName, boolean validating)  
      SAXReader​(DocumentFactory factory)
      This constructor internally calls SAXParserFactory.newInstance().newSAXParser().getXMLReader() or XMLReaderFactory.createXMLReader().
      SAXReader​(DocumentFactory factory, boolean validating)
      This constructor internally calls SAXParserFactory.newInstance().newSAXParser().getXMLReader() or XMLReaderFactory.createXMLReader().
      SAXReader​(org.xml.sax.XMLReader xmlReader)  
      SAXReader​(org.xml.sax.XMLReader xmlReader, boolean validating)  
    • Method Summary

      Modifier and Type Method Description
      void addHandler​(java.lang.String path, ElementHandler handler)
      Adds the ElementHandler to be called when the specified path is encountered.
      protected void configureReader​(org.xml.sax.XMLReader reader, org.xml.sax.helpers.DefaultHandler handler)
      Configures the XMLReader before use
      protected SAXContentHandler createContentHandler​(org.xml.sax.XMLReader reader)
      Factory Method to allow user derived SAXContentHandler objects to be used
      static SAXReader createDefault()  
      protected org.xml.sax.EntityResolver createDefaultEntityResolver​(java.lang.String systemId)  
      protected org.xml.sax.XMLReader createXMLReader()
      Factory Method to allow alternate methods of creating and configuring XMLReader objects
      protected org.dom4j.io.DispatchHandler getDispatchHandler()  
      DocumentFactory getDocumentFactory()
      Returns the DocumentFactory used to create document objects.
      java.lang.String getEncoding()
      Returns encoding used for InputSource (null means system default encoding)
      org.xml.sax.EntityResolver getEntityResolver()
      Returns the current entity resolver used to resolve entities
      org.xml.sax.ErrorHandler getErrorHandler()
      Returns the ErrorHandler used by SAX.
      org.xml.sax.XMLFilter getXMLFilter()
      Returns the SAX filter being used to filter SAX events.
      org.xml.sax.XMLReader getXMLReader()
      Returns the XMLReader used to parse SAX events.
      protected org.xml.sax.XMLReader installXMLFilter​(org.xml.sax.XMLReader reader)
      Installs any XMLFilter objects required to allow the SAX event stream to be filtered and preprocessed before it gets to dom4j.
      boolean isIgnoreComments()
      Returns whether we should ignore comments or not.
      boolean isIncludeExternalDTDDeclarations()
      Returns whether external DTD declarations should be expanded into the DocumentType object or not.
      boolean isIncludeInternalDTDDeclarations()
      Returns whether internal DTD declarations should be expanded into the DocumentType object or not.
      boolean isMergeAdjacentText()
      Returns whether adjacent text nodes should be merged together.
      boolean isStringInternEnabled()
      Returns whether String interning is enabled for element and attribute names and namespace URIs.
      boolean isStripWhitespaceText()
      Returns whether whitespace between element start and end tags should be ignored.
      boolean isValidating()
      Returns the validation mode: true if validating will be done, otherwise false.
      Document read​(java.io.File file)
      Reads a Document from the given File
      Document read​(java.io.InputStream in)
      Reads a Document from the given stream using SAX
      Document read​(java.io.InputStream in, java.lang.String systemId)
      Reads a Document from the given stream using SAX
      Document read​(java.io.Reader reader)
      Reads a Document from the given Reader using SAX
      Document read​(java.io.Reader reader, java.lang.String systemId)
      Reads a Document from the given Reader using SAX
      Document read​(java.lang.String systemId)
      Reads a Document from the given URL or filename using SAX.
      Document read​(java.net.URL url)
      Reads a Document from the given URL using SAX
      Document read​(org.xml.sax.InputSource in)
      Reads a Document from the given InputSource using SAX
      void removeHandler​(java.lang.String path)
      Removes the ElementHandler from the event based processor, for the specified path.
      void resetHandlers()
      Clears all existing handlers and the default handler, setting things back as if no handler existed.
      void setDefaultHandler​(ElementHandler handler)
      When multiple ElementHandler instances have been registered, this will set a default ElementHandler to be called for any path which does NOT have a handler registered.
      protected void setDispatchHandler​(org.dom4j.io.DispatchHandler dispatchHandler)  
      void setDocumentFactory​(DocumentFactory documentFactory)
      This sets the DocumentFactory used to create new documents.
      void setEncoding​(java.lang.String encoding)
      Sets encoding used for InputSource (null means system default encoding)
      void setEntityResolver​(org.xml.sax.EntityResolver entityResolver)
      Sets the entity resolver used to resolve entities.
      void setErrorHandler​(org.xml.sax.ErrorHandler errorHandler)
      Sets the ErrorHandler used by the SAX XMLReader.
      void setFeature​(java.lang.String name, boolean value)
      Sets a SAX feature on the underlying SAX parser.
      void setIgnoreComments​(boolean ignoreComments)
      Sets whether we should ignore comments or not.
      void setIncludeExternalDTDDeclarations​(boolean include)
      Sets whether DTD external declarations should be expanded into the DocumentType object or not.
      void setIncludeInternalDTDDeclarations​(boolean include)
      Sets whether internal DTD declarations should be expanded into the DocumentType object or not.
      void setMergeAdjacentText​(boolean mergeAdjacentText)
      Sets whether or not adjacent text nodes should be merged together when parsing.
      void setProperty​(java.lang.String name, java.lang.Object value)
      Allows a SAX property to be set on the underlying SAX parser.
      void setStringInternEnabled​(boolean stringInternEnabled)
      Sets whether String interning is enabled or disabled for element & attribute names and namespace URIs
      void setStripWhitespaceText​(boolean stripWhitespaceText)
      Sets whether whitespace between element start and end tags should be ignored.
      void setValidation​(boolean validation)
      Sets the validation mode.
      void setXMLFilter​(org.xml.sax.XMLFilter filter)
      Sets the SAX filter to be used when filtering SAX events
      void setXMLReader​(org.xml.sax.XMLReader reader)
      Sets the XMLReader used to parse SAX events
      void setXMLReaderClassName​(java.lang.String xmlReaderClassName)
      Sets the class name of the XMLReader to be used to parse SAX events.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • SAXReader

        public SAXReader()
        This constructor internally calls SAXParserFactory.newInstance().newSAXParser().getXMLReader() or XMLReaderFactory.createXMLReader(). Be sure to configure the returned reader if the default configuration does not suit you. Consider setting the following features:
         reader.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
         reader.setFeature("http://xml.org/sax/features/external-general-entities", false);
         reader.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
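        As a minimal sketch of the advice above, the three features can be switched off on an XMLReader obtained through JAXP before it is given to dom4j. The class and method names are hypothetical, and the example assumes the parser in use recognises the Apache load-external-dtd feature (the JDK's built-in parser does); other parsers may throw SAXNotRecognizedException.

```java
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;

// Hypothetical helper that builds a reader with external DTDs and entities disabled.
class HardenedReaderSketch {

    static final String LOAD_EXTERNAL_DTD =
            "http://apache.org/xml/features/nonvalidating/load-external-dtd";
    static final String EXTERNAL_GENERAL_ENTITIES =
            "http://xml.org/sax/features/external-general-entities";
    static final String EXTERNAL_PARAMETER_ENTITIES =
            "http://xml.org/sax/features/external-parameter-entities";

    static XMLReader newHardenedReader() throws SAXException, ParserConfigurationException {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        // Do not fetch the external DTD subset when not validating.
        reader.setFeature(LOAD_EXTERNAL_DTD, false);
        // Do not expand external general or parameter entities.
        reader.setFeature(EXTERNAL_GENERAL_ENTITIES, false);
        reader.setFeature(EXTERNAL_PARAMETER_ENTITIES, false);
        return reader;
    }
}
```

        The resulting reader can then be passed to the SAXReader(XMLReader) constructor or to setXMLReader(XMLReader).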
         
      • SAXReader

        public SAXReader​(boolean validating)
        This constructor internally calls SAXParserFactory.newInstance().newSAXParser().getXMLReader() or XMLReaderFactory.createXMLReader(). Be sure to configure the returned reader if the default configuration does not suit you. Consider setting the following features:
         reader.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
         reader.setFeature("http://xml.org/sax/features/external-general-entities", false);
         reader.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
         
        Parameters:
        validating - indicates whether or not validation should occur.
      • SAXReader

        public SAXReader​(DocumentFactory factory)
        This constructor internally calls SAXParserFactory.newInstance().newSAXParser().getXMLReader() or XMLReaderFactory.createXMLReader(). Be sure to configure the returned reader if the default configuration does not suit you. Consider setting the following features:
         reader.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
         reader.setFeature("http://xml.org/sax/features/external-general-entities", false);
         reader.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
         
        Parameters:
        factory - the DocumentFactory used to create document objects.
      • SAXReader

        public SAXReader​(DocumentFactory factory,
                         boolean validating)
        This constructor internally calls SAXParserFactory.newInstance().newSAXParser().getXMLReader() or XMLReaderFactory.createXMLReader(). Be sure to configure the returned reader if the default configuration does not suit you. Consider setting the following features:
         reader.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
         reader.setFeature("http://xml.org/sax/features/external-general-entities", false);
         reader.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
         
        Parameters:
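        factory - the DocumentFactory used to create document objects.
        validating - indicates whether or not validation should occur.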
      • SAXReader

        public SAXReader​(org.xml.sax.XMLReader xmlReader)
      • SAXReader

        public SAXReader​(org.xml.sax.XMLReader xmlReader,
                         boolean validating)
      • SAXReader

        public SAXReader​(java.lang.String xmlReaderClassName)
                  throws org.xml.sax.SAXException
        Throws:
        org.xml.sax.SAXException
      • SAXReader

        public SAXReader​(java.lang.String xmlReaderClassName,
                         boolean validating)
                  throws org.xml.sax.SAXException
        Throws:
        org.xml.sax.SAXException
    • Method Detail

      • createDefault

        public static SAXReader createDefault()
      • setProperty

        public void setProperty​(java.lang.String name,
                                java.lang.Object value)
                         throws org.xml.sax.SAXException
        Allows a SAX property to be set on the underlying SAX parser. This can be useful to set parser-specific properties such as the location of schema or DTD resources. Use this method with caution, however, as it can break the standard behaviour. An alternative to calling this method is to correctly configure an XMLReader object instance and call the setXMLReader(XMLReader) method.
        Parameters:
        name - is the SAX property name
        value - is the value of the SAX property
        Throws:
        org.xml.sax.SAXException - if the XMLReader could not be created or the property could not be changed.
      • setFeature

        public void setFeature​(java.lang.String name,
                               boolean value)
                        throws org.xml.sax.SAXException
        Sets a SAX feature on the underlying SAX parser. This can be useful to set parser-specific features. Use this method with caution, however, as it can break the standard behaviour. An alternative to calling this method is to correctly configure an XMLReader object instance and call the setXMLReader(XMLReader) method.
        Parameters:
        name - is the SAX feature name
        value - is the value of the SAX feature
        Throws:
        org.xml.sax.SAXException - if the XMLReader could not be created or the feature could not be changed.
      • read

        public Document read​(java.io.File file)
                      throws DocumentException

        Reads a Document from the given File

        Parameters:
        file - is the File to read from.
        Returns:
        the newly created Document instance
        Throws:
        DocumentException - if an error occurs during parsing.
      • read

        public Document read​(java.net.URL url)
                      throws DocumentException

        Reads a Document from the given URL using SAX

        Parameters:
        url - URL to read from.
        Returns:
        the newly created Document instance
        Throws:
        DocumentException - if an error occurs during parsing.
      • read

        public Document read​(java.lang.String systemId)
                      throws DocumentException

        Reads a Document from the given URL or filename using SAX.

        If the systemId contains a ':' character then it is assumed to be a URL; otherwise it is assumed to be a file name. If you want finer-grained control over this mechanism, explicitly pass in either a URL or a File instance instead of a String to denote the source of the document.

        Parameters:
        systemId - is a URL for a document or a file name.
        Returns:
        the newly created Document instance
        Throws:
        DocumentException - if an error occurs during parsing.
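        Taken literally, the rule stated above amounts to a single character test. The sketch below only restates that documented rule with a hypothetical class and method name; it is not dom4j's implementation, which may differ in detail.

```java
// Hypothetical restatement of the documented ':' rule for systemId strings.
class SystemIdHeuristicSketch {

    /** Returns true when the documented rule would treat the string as a URL. */
    static boolean isTreatedAsUrl(String systemId) {
        return systemId.indexOf(':') >= 0;
    }
}
```

        Under this rule "http://example.com/a.xml" counts as a URL and "data/a.xml" as a file name, but so does a Windows path such as "C:\data\a.xml", because the drive letter introduces a colon. That ambiguity is one reason to pass a File or URL instance when the source is known.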
      • read

        public Document read​(java.io.InputStream in)
                      throws DocumentException

        Reads a Document from the given stream using SAX

        Parameters:
        in - InputStream to read from.
        Returns:
        the newly created Document instance
        Throws:
        DocumentException - if an error occurs during parsing.
      • read

        public Document read​(java.io.Reader reader)
                      throws DocumentException
        Reads a Document from the given Reader using SAX
        Parameters:
        reader - is the reader for the input
        Returns:
        the newly created Document instance
        Throws:
        DocumentException - if an error occurs during parsing.
      • read

        public Document read​(java.io.InputStream in,
                             java.lang.String systemId)
                      throws DocumentException

        Reads a Document from the given stream using SAX

        Parameters:
        in - InputStream to read from.
        systemId - is the URI for the input
        Returns:
        the newly created Document instance
        Throws:
        DocumentException - if an error occurs during parsing.
      • read

        public Document read​(java.io.Reader reader,
                             java.lang.String systemId)
                      throws DocumentException

        Reads a Document from the given Reader using SAX

        Parameters:
        reader - is the reader for the input
        systemId - is the URI for the input
        Returns:
        the newly created Document instance
        Throws:
        DocumentException - if an error occurs during parsing.
      • read

        public Document read​(org.xml.sax.InputSource in)
                      throws DocumentException

        Reads a Document from the given InputSource using SAX

        Parameters:
        in - InputSource to read from.
        Returns:
        the newly created Document instance
        Throws:
        DocumentException - if an error occurs during parsing.
      • isValidating

        public boolean isValidating()
        Returns the validation mode.
        Returns:
        the validation mode, true if validating will be done otherwise false.
      • setValidation

        public void setValidation​(boolean validation)
        Sets the validation mode.
        Parameters:
        validation - indicates whether or not validation should occur.
      • isIncludeInternalDTDDeclarations

        public boolean isIncludeInternalDTDDeclarations()
        Returns whether internal DTD declarations should be expanded into the DocumentType object or not.
        Returns:
        whether internal DTD declarations should be expanded into the DocumentType object or not.
      • setIncludeInternalDTDDeclarations

        public void setIncludeInternalDTDDeclarations​(boolean include)
        Sets whether internal DTD declarations should be expanded into the DocumentType object or not.
        Parameters:
        include - whether or not DTD declarations should be expanded and included into the DocumentType object.
      • isIncludeExternalDTDDeclarations

        public boolean isIncludeExternalDTDDeclarations()
        Returns whether external DTD declarations should be expanded into the DocumentType object or not.
        Returns:
        whether external DTD declarations should be expanded into the DocumentType object or not.
      • setIncludeExternalDTDDeclarations

        public void setIncludeExternalDTDDeclarations​(boolean include)
        Sets whether DTD external declarations should be expanded into the DocumentType object or not.
        Parameters:
        include - whether or not DTD declarations should be expanded and included into the DocumentType object.
      • isStringInternEnabled

        public boolean isStringInternEnabled()
        Returns whether String interning is enabled for element and attribute names and namespace URIs. This property is enabled by default.
        Returns:
        true if String interning is enabled, otherwise false
      • setStringInternEnabled

        public void setStringInternEnabled​(boolean stringInternEnabled)
        Sets whether String interning is enabled or disabled for element and attribute names and namespace URIs.
        Parameters:
        stringInternEnabled - true to enable String interning, false to disable it
      • isMergeAdjacentText

        public boolean isMergeAdjacentText()
        Returns whether adjacent text nodes should be merged together.
        Returns:
        Value of property mergeAdjacentText.
      • setMergeAdjacentText

        public void setMergeAdjacentText​(boolean mergeAdjacentText)
        Sets whether or not adjacent text nodes should be merged together when parsing.
        Parameters:
        mergeAdjacentText - New value of property mergeAdjacentText.
      • isStripWhitespaceText

        public boolean isStripWhitespaceText()
        Returns whether whitespace between element start and end tags should be ignored.
        Returns:
        Value of property stripWhitespaceText.
      • setStripWhitespaceText

        public void setStripWhitespaceText​(boolean stripWhitespaceText)
        Sets whether whitespace between element start and end tags should be ignored.
        Parameters:
        stripWhitespaceText - New value of property stripWhitespaceText.
      • isIgnoreComments

        public boolean isIgnoreComments()
        Returns whether we should ignore comments or not.
        Returns:
        true if comments are ignored, otherwise false
      • setIgnoreComments

        public void setIgnoreComments​(boolean ignoreComments)
        Sets whether we should ignore comments or not.
        Parameters:
        ignoreComments - whether we should ignore comments or not.
      • getDocumentFactory

        public DocumentFactory getDocumentFactory()
        Returns the DocumentFactory used to create document objects.
        Returns:
        the DocumentFactory used to create document objects
      • setDocumentFactory

        public void setDocumentFactory​(DocumentFactory documentFactory)

        Sets the DocumentFactory used to create new documents. This allows custom DOM4J tree objects to be built easily by supplying a custom subclass of DocumentFactory.

        Parameters:
        documentFactory - DocumentFactory used to create DOM4J objects
      • getErrorHandler

        public org.xml.sax.ErrorHandler getErrorHandler()
        Returns the ErrorHandler used by SAX.
        Returns:
        the ErrorHandler used by SAX
      • setErrorHandler

        public void setErrorHandler​(org.xml.sax.ErrorHandler errorHandler)
        Sets the ErrorHandler used by the SAX XMLReader.
        Parameters:
        errorHandler - is the ErrorHandler used by SAX
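        To illustrate what an ErrorHandler is for, the sketch below collects validation problems instead of letting them pass unnoticed. It uses only the JDK's SAX classes; the class name and the validate helper are hypothetical. An instance of such a handler is the kind of object setErrorHandler expects.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;

// Hypothetical handler that records warnings and recoverable errors.
class CollectingErrorHandler implements ErrorHandler {
    final List<String> warnings = new ArrayList<>();
    final List<String> errors = new ArrayList<>();

    @Override
    public void warning(SAXParseException e) {
        warnings.add(e.getMessage());
    }

    @Override
    public void error(SAXParseException e) {
        // Validation failures arrive here; parsing continues afterwards.
        errors.add(e.getMessage());
    }

    @Override
    public void fatalError(SAXParseException e) throws SAXParseException {
        // Well-formedness errors cannot be recovered from, so rethrow.
        throw e;
    }

    /** Parses the XML with DTD validation on and returns the handler with its findings. */
    static CollectingErrorHandler validate(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(true);
        XMLReader reader = factory.newSAXParser().getXMLReader();
        CollectingErrorHandler handler = new CollectingErrorHandler();
        reader.setErrorHandler(handler);
        reader.parse(new InputSource(new StringReader(xml)));
        return handler;
    }
}
```

        With an internal DTD that declares root as EMPTY, a document containing a child element inside root produces entries in errors, while an empty root element produces none.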
      • getEntityResolver

        public org.xml.sax.EntityResolver getEntityResolver()
        Returns the current entity resolver used to resolve entities
        Returns:
        the EntityResolver currently in use
      • setEntityResolver

        public void setEntityResolver​(org.xml.sax.EntityResolver entityResolver)
        Sets the entity resolver used to resolve entities.
        Parameters:
        entityResolver - the EntityResolver used to resolve entities
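        One common use of an EntityResolver is to stop the parser from fetching external resources. The sketch below, with hypothetical class and method names and only JDK classes, answers every request with an empty source and records what was asked for. A resolver like this is the kind of object setEntityResolver accepts.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

// Hypothetical resolver that never touches the network or the file system.
class EmptyEntityResolver implements EntityResolver {
    final List<String> requested = new ArrayList<>();

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        requested.add(systemId);
        // An empty source stands in for whatever the identifier points at.
        return new InputSource(new StringReader(""));
    }

    /** Parses the XML with the resolver installed and returns the requested system IDs. */
    static List<String> parseOffline(String xml) throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        EmptyEntityResolver resolver = new EmptyEntityResolver();
        reader.setEntityResolver(resolver);
        reader.parse(new InputSource(new StringReader(xml)));
        return resolver.requested;
    }
}
```

        Parsing a document whose DOCTYPE names an external DTD then succeeds without any download, and the DTD's system identifier shows up in the returned list.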
      • getXMLReader

        public org.xml.sax.XMLReader getXMLReader()
                                           throws org.xml.sax.SAXException
        Returns the XMLReader used to parse SAX events.
        Returns:
        the XMLReader used to parse SAX events
        Throws:
        org.xml.sax.SAXException - if the XMLReader could not be created.
      • setXMLReader

        public void setXMLReader​(org.xml.sax.XMLReader reader)
        Sets the XMLReader used to parse SAX events
        Parameters:
        reader - is the XMLReader to parse SAX events
      • getEncoding

        public java.lang.String getEncoding()
        Returns encoding used for InputSource (null means system default encoding)
        Returns:
        encoding used for InputSource
      • setEncoding

        public void setEncoding​(java.lang.String encoding)
        Sets encoding used for InputSource (null means system default encoding)
        Parameters:
        encoding - is encoding used for InputSource
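        The encoding setting matters mostly for byte streams that carry no XML declaration. The sketch below shows the underlying SAX mechanism, InputSource.setEncoding, using only the JDK; the class and method names are hypothetical, and how SAXReader forwards its encoding to the InputSource is assumed from the description above rather than shown.

```java
import java.io.ByteArrayInputStream;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: parses bytes with an explicitly declared encoding.
class InputSourceEncodingSketch {

    /** Returns the concatenated character data of the document. */
    static String parseText(byte[] xmlBytes, String encoding) throws Exception {
        InputSource source = new InputSource(new ByteArrayInputStream(xmlBytes));
        if (encoding != null) {
            // Tells the parser how to decode the byte stream.
            source.setEncoding(encoding);
        }
        StringBuilder text = new StringBuilder();
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setContentHandler(new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }
        });
        reader.parse(source);
        return text.toString();
    }
}
```

        For example, ISO-8859-1 bytes containing an accented character decode correctly when "ISO-8859-1" is passed as the encoding.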
      • setXMLReaderClassName

        public void setXMLReaderClassName​(java.lang.String xmlReaderClassName)
                                   throws org.xml.sax.SAXException
        Sets the class name of the XMLReader to be used to parse SAX events.
        Parameters:
        xmlReaderClassName - is the class name of the XMLReader to parse SAX events
        Throws:
        org.xml.sax.SAXException - if an XMLReader of the given class could not be created.
      • addHandler

        public void addHandler​(java.lang.String path,
                               ElementHandler handler)
        Adds the ElementHandler to be called when the specified path is encountered.
        Parameters:
        path - is the path to be handled
        handler - is the ElementHandler to be called by the event based processor.
      • removeHandler

        public void removeHandler​(java.lang.String path)
        Removes the ElementHandler from the event based processor, for the specified path.
        Parameters:
        path - is the path to remove the ElementHandler for.
      • setDefaultHandler

        public void setDefaultHandler​(ElementHandler handler)
        When multiple ElementHandler instances have been registered, this will set a default ElementHandler to be called for any path which does NOT have a handler registered.
        Parameters:
        handler - is the ElementHandler to be called by the event based processor.
      • resetHandlers

        public void resetHandlers()
        Clears all existing handlers and the default handler, setting things back as if no handler existed. Useful when reusing an object instance.
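        The idea behind addHandler, removeHandler, setDefaultHandler and resetHandlers is path-based dispatch during parsing. The sketch below is a deliberately simplified stand-in built on the JDK only. It is not dom4j's DispatchHandler: the class name is hypothetical, callbacks are plain Consumer<String> objects receiving the path rather than ElementHandler instances receiving an ElementPath, only element starts are dispatched, and the default handler fires for every unregistered path.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical, simplified path dispatcher; not the dom4j implementation.
class PathDispatchSketch extends DefaultHandler {
    private final Map<String, Consumer<String>> handlers = new HashMap<>();
    private final Deque<String> stack = new ArrayDeque<>();
    private Consumer<String> defaultHandler;

    void addHandler(String path, Consumer<String> handler) { handlers.put(path, handler); }
    void removeHandler(String path) { handlers.remove(path); }
    void setDefaultHandler(Consumer<String> handler) { defaultHandler = handler; }
    void resetHandlers() { handlers.clear(); defaultHandler = null; }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // Build the absolute path of the element that just started.
        String parent = stack.isEmpty() ? "" : stack.peek();
        String path = parent + "/" + qName;
        stack.push(path);
        Consumer<String> handler = handlers.getOrDefault(path, defaultHandler);
        if (handler != null) {
            handler.accept(path);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        stack.pop();
    }

    /** Parses the XML, sending SAX events to the given dispatcher. */
    static void parse(String xml, PathDispatchSketch dispatcher) throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setContentHandler(dispatcher);
        reader.parse(new InputSource(new StringReader(xml)));
    }
}
```

        Registering a callback for "/orders/order" and parsing a document with two order elements and one note element under orders invokes the callback twice and ignores the note.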
      • getXMLFilter

        public org.xml.sax.XMLFilter getXMLFilter()
        Returns the SAX filter being used to filter SAX events.
        Returns:
        the SAX filter being used or null if no SAX filter is installed
      • setXMLFilter

        public void setXMLFilter​(org.xml.sax.XMLFilter filter)
        Sets the SAX filter to be used when filtering SAX events
        Parameters:
        filter - is the SAX filter to use or null to disable filtering
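        A SAX filter sits between the parser and the consumer of its events and may rewrite them on the way through. The sketch below, using JDK classes only and hypothetical names, lower-cases element names. It drives the filter directly; how dom4j wires a filter passed to setXMLFilter to its underlying reader is not shown here.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical filter that normalises element names to lower case.
class LowerCaseNameFilter extends XMLFilterImpl {

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts)
            throws SAXException {
        super.startElement(uri, lower(localName), lower(qName), atts);
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        super.endElement(uri, lower(localName), lower(qName));
    }

    private static String lower(String name) {
        return name.toLowerCase(Locale.ROOT);
    }

    /** Parses the XML through the filter and returns the element names seen downstream. */
    static List<String> elementNames(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        LowerCaseNameFilter filter = new LowerCaseNameFilter();
        // The real parser becomes the filter's parent, that is, its event source.
        filter.setParent(factory.newSAXParser().getXMLReader());
        List<String> names = new ArrayList<>();
        filter.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                names.add(localName);
            }
        });
        filter.parse(new InputSource(new StringReader(xml)));
        return names;
    }
}
```

        Because the filter only overrides the element callbacks, all other events (text, comments handled elsewhere, errors) pass through XMLFilterImpl unchanged.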
      • installXMLFilter

        protected org.xml.sax.XMLReader installXMLFilter​(org.xml.sax.XMLReader reader)
        Installs any XMLFilter objects required to allow the SAX event stream to be filtered and preprocessed before it gets to dom4j.
        Parameters:
        reader - the XMLReader whose SAX events should be filtered
        Returns:
        the new XMLFilter if applicable or the original XMLReader if no filter is being used.
      • getDispatchHandler

        protected org.dom4j.io.DispatchHandler getDispatchHandler()
      • setDispatchHandler

        protected void setDispatchHandler​(org.dom4j.io.DispatchHandler dispatchHandler)
      • createXMLReader

        protected org.xml.sax.XMLReader createXMLReader()
                                                 throws org.xml.sax.SAXException
        Factory Method to allow alternate methods of creating and configuring XMLReader objects
        Returns:
        the newly created XMLReader
        Throws:
        org.xml.sax.SAXException - if the XMLReader could not be created
      • configureReader

        protected void configureReader​(org.xml.sax.XMLReader reader,
                                       org.xml.sax.helpers.DefaultHandler handler)
                                throws DocumentException
        Configures the XMLReader before use
        Parameters:
        reader - the XMLReader to configure
        handler - the handler that will receive the SAX events
        Throws:
        DocumentException - if the reader could not be configured
      • createContentHandler

        protected SAXContentHandler createContentHandler​(org.xml.sax.XMLReader reader)
        Factory Method to allow user derived SAXContentHandler objects to be used
        Parameters:
        reader - the XMLReader that will supply the SAX events
        Returns:
        the SAXContentHandler that will build the document
      • createDefaultEntityResolver

        protected org.xml.sax.EntityResolver createDefaultEntityResolver​(java.lang.String systemId)