XML learning notes


XML Introduction

What is XML?

XML (eXtensible Markup Language) is an extensible markup language.

What is XML used for?

  1. Storing data in a self-describing form
  2. Serving as the configuration file of a project or module
  3. Serving as a format for data sent over the network (today JSON is mainly used for this)

XML file example:

<?xml version="1.0" encoding="UTF-8"?>
<!--
    <?xml version="1.0" encoding="UTF-8"?>
    The line above is the declaration of the XML file.
    version="1.0"     version is the XML version
    encoding="UTF-8"  encoding is the encoding of the XML file itself
-->
<books> <!-- books holds the information for multiple books -->
    <book sn="sn123456"> <!-- book is one book; the sn attribute is its serial number -->
        <name>A Brief History of Time</name> <!-- name tag: the title -->
        <author>金</author> <!-- author tag: the author -->
        <price>75</price> <!-- price tag: the book's price -->
    </book>
    <book sn="sn123456"> <!-- book is one book; the sn attribute is its serial number -->
        <name>Java: From Getting Started to Giving Up</name> <!-- name tag: the title -->
        <author>Author</author> <!-- author tag: the author -->
        <price>30</price> <!-- price tag: the book's price -->
    </book>
</books>

Note: in <?xml version="1.0" encoding="UTF-8"?>, the opening <?xml must be written together with no space inside it; otherwise the parser reports an error.
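
The note above can be checked with the parser that ships with the JDK. This is only a minimal sketch: the class name XmlDeclarationCheck and the helper isWellFormed are made up for illustration, and the two sample strings are shortened versions of the declaration discussed here.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of the original notes.
class XmlDeclarationCheck {

    /** Returns true if the JDK's built-in parser accepts the text as well-formed XML. */
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the parser rejected the document
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String good = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><books/>";
        String bad = "<? xml version=\"1.0\" encoding=\"UTF-8\"?><books/>"; // space after <?
        System.out.println(isWellFormed(good)); // true
        System.out.println(isWellFormed(bad));  // false
    }
}
```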

XML syntax

XML comments

HTML and XML comments use the same syntax: <!-- comment -->

XML elements

An XML element is everything from the start tag through the end tag, inclusive.

Elements can contain other elements, text or mixtures of both. Elements can also have attributes.

For example, <title>Java Programming Ideology</title> is one element named title.

XML naming rules

XML element names must follow these rules:

  • Names can contain letters, numbers, and other characters, for example:

    <book id="SN213412341"> <!-- describes one book -->
        <author>class</author> <!-- the author of the book -->
        <name>Java programming</name> <!-- the title -->
        <price>9.9</price> <!-- the price -->
    </book>

  • Names cannot start with a number or a punctuation character

  • Names cannot contain spaces

  • Elements (tags) in XML also come in two forms: single (self-closing) tags such as <book/>, and paired tags such as <book></book>
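
One way to try these naming rules is to hand small documents to the JDK's built-in parser and see which ones it accepts. The sketch below is illustrative only: XmlNameCheck and isWellFormed are hypothetical names, and the sample element names are invented for the test.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of the original notes.
class XmlNameCheck {

    /** Returns true if the JDK's built-in parser accepts the text as well-formed XML. */
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.setErrorHandler(new DefaultHandler()); // keep parse errors off stderr
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String[] samples = {
            "<book1/>",          // letters and digits: accepted
            "<book_info-2.0/>",  // underscore, hyphen and dot inside the name: accepted
            "<book></book>",     // paired tag form: accepted
            "<1book/>",          // starts with a digit: rejected
            "<-book/>",          // starts with punctuation: rejected
            "<my book/>"         // contains a space: rejected
        };
        for (String sample : samples) {
            System.out.println(sample + " -> " + isWellFormed(sample));
        }
    }
}
```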


XML attributes

XML attributes are very similar to HTML attributes: they provide additional information about an element.

A tag can carry one or more attributes, and the value of every attribute must be enclosed in quotation marks.
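
As a rough illustration of attributes, the following sketch reads every attribute of a document's root element with the JDK's DOM API. The class XmlAttributeDemo, the method attributesOf, and the lang attribute in the sample are assumptions made for this example only.

```java
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original notes.
class XmlAttributeDemo {

    /** Returns the attributes of the root element as a name-to-value map, sorted by name. */
    static Map<String, String> attributesOf(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NamedNodeMap attrs = doc.getDocumentElement().getAttributes();
            Map<String, String> result = new TreeMap<>();
            for (int i = 0; i < attrs.getLength(); i++) {
                Node attr = attrs.item(i);
                result.put(attr.getNodeName(), attr.getNodeValue());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Both double and single quotes are accepted around attribute values.
        System.out.println(attributesOf("<book sn=\"SN123456\" lang='en'/>"));
        // prints {lang=en, sn=SN123456}
    }
}
```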


Syntax rules

  • All XML elements must have a closing tag (that is, every element must be closed)

  • XML tags are case-sensitive


  • XML elements must be nested correctly

  • An XML document must have a root element

    The root element is the top-level element: it has no parent tag, and there is exactly one of it in the document.

  • XML attribute values must be quoted

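  • Special characters in XML

    Characters such as < and & have a special meaning in XML, so inside text they are written as entity references: &lt; &gt; &amp; &quot; &apos;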

  • Text area (CDATA section)

    The CDATA syntax tells the XML parser that the content of the CDATA section is plain text and must not be parsed as XML markup.

    CDATA format:

    <![CDATA[ characters written here are kept exactly as entered and are not parsed as XML ]]>
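
The syntax rules in this list can be tried out against the JDK's built-in parser. The sketch below is only an illustration under stated assumptions: the class XmlSyntaxRules, its helpers isWellFormed and rootText, and all sample documents are invented here. It shows one rejected document per rule, then shows that entity references and a CDATA section give the same text.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of the original notes.
class XmlSyntaxRules {

    private static Document parse(String xml) throws SAXException {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.setErrorHandler(new DefaultHandler()); // keep parse errors off stderr
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns true if the parser accepts the text as well-formed XML. */
    static boolean isWellFormed(String xml) {
        try {
            parse(xml);
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    /** Returns the text content of the root element. */
    static String rootText(String xml) {
        try {
            return parse(xml).getDocumentElement().getTextContent();
        } catch (SAXException e) {
            throw new IllegalArgumentException("not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<book><name>Java</name>"));        // false: book is never closed
        System.out.println(isWellFormed("<Name>Java</name>"));              // false: Name and name differ in case
        System.out.println(isWellFormed("<book><name>Java</book></name>")); // false: wrong nesting
        System.out.println(isWellFormed("<book/><book/>"));                 // false: two root elements
        System.out.println(isWellFormed("<book sn=SN1/>"));                 // false: unquoted attribute value
        System.out.println(isWellFormed("<expr>1 < 2</expr>"));             // false: raw < inside text

        // Entity references and a CDATA section both yield the same plain text.
        System.out.println(rootText("<expr>1 &lt; 2 &amp;&amp; 3 &gt; 2</expr>"));
        System.out.println(rootText("<expr><![CDATA[1 < 2 && 3 > 2]]></expr>"));
    }
}
```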

XML Parsing Technologies

XML is an extensible markup language. HTML files and XML files are both markup documents, so both can be parsed with the DOM technology defined by the W3C.

The Document object represents the entire document (either an HTML document or an XML document).

Early JDK versions provided two XML parsing technologies, DOM and SAX (both are dated now, but they are still worth knowing).

DOM parsing is specified by the W3C, and each programming language implements it using its own language features. Java also implements DOM for parsing markup.
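
A minimal sketch of DOM parsing with the classes that ship with the JDK (javax.xml.parsers and org.w3c.dom) might look like this. The class XmlDomDemo, the method describeBooks, and the inline sample document are assumptions for illustration; the whole document is loaded into a Document object before anything is read from it.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original notes.
class XmlDomDemo {

    /** Returns one "sn: name" line per book element found in the document. */
    static List<String> describeBooks(String xml) {
        try {
            // The Document object represents the entire parsed document.
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Element root = doc.getDocumentElement();
            NodeList books = root.getElementsByTagName("book");
            List<String> result = new ArrayList<>();
            for (int i = 0; i < books.getLength(); i++) {
                Element book = (Element) books.item(i);
                String name = book.getElementsByTagName("name").item(0).getTextContent();
                result.add(book.getAttribute("sn") + ": " + name);
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<books>"
                + "<book sn=\"SN12341232\"><name>evil sword spectrum</name></book>"
                + "<book sn=\"SN12341231\"><name>Sunflower Collection</name></book>"
                + "</books>";
        for (String line : describeBooks(xml)) {
            System.out.println(line);
        }
    }
}
```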

In JDK 5, Sun upgraded on DOM parsing with SAX (Simple API for XML). SAX parsing works differently from the W3C's DOM: it uses an event-like mechanism and tells the user, through callbacks, what content is currently being parsed. It reads the XML file sequentially and does not create a large number of DOM objects, so both its memory use and its performance are better than DOM parsing.
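
The callback style described above can be sketched with the JDK's own SAX classes. This is an illustrative example rather than a reference implementation: XmlSaxDemo and its events method are hypothetical names, and the handler simply records each callback so the order of events becomes visible.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of the original notes.
class XmlSaxDemo {

    /** Parses the XML and returns the sequence of SAX events as readable strings. */
    static List<String> events(String xml) {
        final List<String> events = new ArrayList<>();
        final StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                text.setLength(0); // start collecting text for this element
                events.add("start " + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length); // may be called several times per text run
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                String collected = text.toString().trim();
                if (!collected.isEmpty()) {
                    events.add("text " + collected);
                }
                text.setLength(0);
                events.add("end " + qName);
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
        return events;
    }

    public static void main(String[] args) {
        // No Document object is built; the handler only sees one event at a time.
        for (String event : events("<books><book sn=\"SN1\"><name>A</name></book></books>")) {
            System.out.println(event);
        }
    }
}
```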

Third-party parsers:

  • JDOM, which wraps DOM

  • dom4j, which in turn wraps JDOM

  • Pull parsing, mainly used in Android development; it parses XML files in a way very similar to SAX

dom4j is a third-party parsing technology, so we need the class library its authors provide in order to parse XML files with it.

dom4j parsing

dom4j class library

Because dom4j is a third-party library rather than Sun technology, we need to download the dom4j JAR package before we can use it.

dom4j programming steps

  1. Load the XML file to create a Document object
  2. Get the root element object from the Document object
  3. Call elements(tagName) on the root element; it returns a list holding all child elements with the tag name you specify
  4. Find the child element you want and modify or delete it as needed
  5. Save the result to disk
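
The five steps above can be sketched end to end. To keep the example runnable without extra JARs, it uses the JDK's built-in DOM and Transformer classes as a stand-in for dom4j, so the calls shown are not dom4j's API. The class XmlEditSteps, the method deleteBook, and the sample document are hypothetical, and the result is written to a string rather than to disk.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class using JDK DOM (not dom4j) to mirror the five steps.
class XmlEditSteps {

    /** Removes every book whose sn attribute equals the given value and returns the new XML. */
    static String deleteBook(String xml, String sn) {
        try {
            // 1. Load the XML to create a Document object.
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            // 2. Get the root element.
            Element root = doc.getDocumentElement();
            // 3. Get the book elements under the root.
            NodeList books = root.getElementsByTagName("book");
            // 4. Find the element to delete. The NodeList is live, so iterate backwards.
            for (int i = books.getLength() - 1; i >= 0; i--) {
                Element book = (Element) books.item(i);
                if (sn.equals(book.getAttribute("sn"))) {
                    book.getParentNode().removeChild(book);
                }
            }
            // 5. Save. A StreamResult built from a java.io.File would write to disk instead.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not process XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<books>"
                + "<book sn=\"SN1\"><name>A</name></book>"
                + "<book sn=\"SN2\"><name>B</name></book>"
                + "</books>";
        System.out.println(deleteBook(xml, "SN2")); // the SN2 book is no longer in the output
    }
}
```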

The books.xml file to be parsed:

<?xml version="1.0" encoding="UTF-8"?>
<books>
    <book sn="SN12341232">
        <name>evil sword spectrum</name>
        <price>9.9</price>
        <author>class teacher</author>
    </book>
    <book sn="SN12341231">
        <name>Sunflower Collection</name>
        <price>99.99</price>
        <author>squad leader</author>
    </book>
</books>

Directory structure: books.xml sits in the module's src directory, which is why the code reads it as src/books.xml.

Parsing process

package cn.pojo;

import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.Element;
import org.dom4j.io.SAXReader;
import org.junit.Test;

import java.util.List;

public class Dom4jTest {

    /**
     * Reads books.xml and builds a Book object for each book element.
     */
    @Test
    public void test2() {
        // 1. Read books.xml: create a SAXReader to read the XML file and produce a Document object
        SAXReader saxReader = new SAXReader();
        Document document = null;
        // In a JUnit test, relative paths start from the module directory
        try {
            document = saxReader.read("src/books.xml");
        } catch (DocumentException e) {
            e.printStackTrace();
        }

        // 2. Get the root element
        assert document != null;
        Element rootElement = document.getRootElement();
        System.out.println(rootElement);

        // 3. Get the book elements from the root element
        // element() and elements() both look up child elements by tag name
        List<Element> books = rootElement.elements("book");

        // 4. Iterate, turning each book element into a Book object
        for (Element book : books) {
            // asXML() converts an element object into its tag string
            Element nameElement = book.element("name");
            // System.out.println(nameElement.asXML());
            // getText() returns the text content of the element
            String nameText = nameElement.getText();
            // elementText() returns the text content of the named child element directly
            String priceText = book.elementText("price");
            String authorText = book.elementText("author");
            // attributeValue() returns the value of the named attribute
            String snText = book.attributeValue("sn");
            System.out.println(new Book(snText, nameText, Double.parseDouble(priceText), authorText));
        }
    }
}

Output:

org.dom4j.tree.DefaultElement@754ba872 [Element: <books attributes: []/>]
Book{sn='SN12341232', name='evil sword spectrum', price=9.9, author='class teacher'}
Book{sn='SN12341231', name='Sunflower Collection', price=99.99, author='squad leader'}
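
The Book class used by the test is not shown in these notes. A minimal sketch that would fit the constructor call and the printed output could look like the following; the field names, the getters, and the exact toString format are assumptions inferred from that output, and the package declaration is left out to keep the example self-contained.

```java
// Hypothetical reconstruction of the Book class; the original is not shown in the notes.
class Book {
    private final String sn;     // serial number, taken from the sn attribute
    private final String name;   // title, taken from the name element
    private final double price;  // price, parsed from the price element
    private final String author; // author, taken from the author element

    Book(String sn, String name, double price, String author) {
        this.sn = sn;
        this.name = name;
        this.price = price;
        this.author = author;
    }

    String getSn() { return sn; }
    String getName() { return name; }
    double getPrice() { return price; }
    String getAuthor() { return author; }

    @Override
    public String toString() {
        return "Book{sn='" + sn + "', name='" + name + "', price=" + price
                + ", author='" + author + "'}";
    }

    public static void main(String[] args) {
        System.out.println(new Book("SN12341232", "evil sword spectrum", 9.9, "class teacher"));
    }
}
```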