What is XML and what is it used for?

Before going into web services, we must know what XML is and what it is used for. So we will start with the basic XML concepts.

XML stands for eXtensible Markup Language.

XML is designed to transport and store data.

HTML was designed to display data.

Main objective: I expect everyone has one question in mind: why do we need to transfer data through XML when we can send it directly from a front-end language to the database (e.g. from PHP to MySQL)?

You doubted this point too, right? 😉 Well, my seniors explained the reason for and purpose of using XML. According to them, XML is mainly used to transfer data between two languages (say, Flash to PHP) in a simple, structured form that both sides can understand. That is why XML came into existence. We can also use XML to generate dynamic data in a template-like manner. There are many alternatives to XML on the market; JSON (JavaScript Object Notation) is one of the more recent ones, with good scalability.

What is XML?

  • XML stands for EXtensible Markup Language
  • XML is a markup language much like HTML
  • XML was designed to carry data, not to display data
  • XML tags are not predefined. You must define your own tags
  • XML is designed to be self-descriptive
  • XML is a W3C Recommendation

The Difference Between XML and HTML

XML is not a replacement for HTML.

XML and HTML were designed with different goals:

  • XML was designed to transport and store data, with focus on what data is
  • HTML was designed to display data, with focus on how data looks

HTML is about displaying information, while XML is about carrying information.

XML Does Not DO Anything

Maybe it is a little hard to understand, but XML does not DO anything. XML was created to structure, store, and transport information.

The following example is a note to Tove, from Jani, stored as XML:

<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don’t forget me this weekend!</body>
</note>

The note above is quite self-descriptive. It has sender and receiver information, and it also has a heading and a message body.

But still, this XML document does not DO anything. It is just information wrapped in tags. Someone must write a piece of software to send, receive or display it.
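
As a minimal sketch of such a piece of software, the Python snippet below reads the note with the standard library module xml.etree.ElementTree and prints it. The variable names and the output layout are my own choices for illustration, not part of the original example.

```python
import xml.etree.ElementTree as ET

# The note from the example above, held in a string for simplicity.
NOTE_XML = """<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>"""

note = ET.fromstring(NOTE_XML)  # parse the text into an element tree

# findtext() returns the text of the first matching child element.
receiver = note.findtext("to")
sender = note.findtext("from")
heading = note.findtext("heading")
body = note.findtext("body")

print(f"{heading}: {body} (from {sender} to {receiver})")
```

The XML itself stays passive; it is the few lines of code around it that actually do something with the data.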

XML Separates Data from HTML

If you need to display dynamic data in your HTML document, it will take a lot of work to edit the HTML each time the data changes.

With XML, data can be stored in separate XML files. This way you can concentrate on using HTML/CSS for display and layout, and be sure that changes in the underlying data will not require any changes to the HTML.

With a few lines of JavaScript code, you can read an external XML file and update the data content of your web page.

XML Simplifies Data Sharing

In the real world, computer systems and databases contain data in incompatible formats.

XML data is stored in plain text format. This provides a software- and hardware-independent way of storing data.

This makes it much easier to create data that can be shared by different applications.

XML Simplifies Data Transport

One of the most time-consuming challenges for developers is to exchange data between incompatible systems over the Internet.

Exchanging data as XML greatly reduces this complexity, since the data can be read by different incompatible applications.
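
Here is a rough sketch of that exchange, assuming a made-up "order" record. One function plays the sending application and turns plain values into XML text; the other plays the receiving application and rebuilds the values from that text. The element names (order, id, customer, total) and the function names are hypothetical, and in real life the receiver could just as well be written in another language.

```python
import xml.etree.ElementTree as ET

def order_to_xml(order_id, customer, total):
    """Sending side: turn plain values into an XML string."""
    order = ET.Element("order")
    ET.SubElement(order, "id").text = str(order_id)
    ET.SubElement(order, "customer").text = customer
    ET.SubElement(order, "total").text = str(total)
    return ET.tostring(order, encoding="unicode")

def xml_to_order(xml_text):
    """Receiving side: rebuild a dictionary from the XML string."""
    order = ET.fromstring(xml_text)
    return {child.tag: child.text for child in order}

payload = order_to_xml(42, "Tove", "19.90")
print(payload)                 # plain text that any system can read
print(xml_to_order(payload))   # the same data, recovered on the other side
```

Note that everything comes back as text; XML on its own does not remember that 42 was a number.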

XML Documents Form a Tree Structure

XML documents must contain a root element. This element is “the parent” of all other elements.

The elements in an XML document form a document tree. The tree starts at the root and branches to the lowest level of the tree.

All elements can have sub elements (child elements):

<root>
<child>
<subchild>…..</subchild>
</child>
</root>

The terms parent, child, and sibling are used to describe the relationships between elements. Parent elements have children. Children on the same level are called siblings (brothers or sisters).

All elements can have text content and attributes (just like in HTML).

Eg:

<bookstore>
<book category="COOKING">
<title lang="en">Everyday Italian</title>
<author>Giada De Laurentiis</author>
<year>2005</year>
<price>30.00</price>
</book>
<book category="CHILDREN">
<title lang="en">Harry Potter</title>
<author>J K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>
</bookstore>

The root element in the example is <bookstore>. All <book> elements in the document are contained within <bookstore>.

The <book> element has 4 children: <title>, <author>, <year>, <price>.
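
To see the tree from a program's point of view, here is a small sketch in Python using the standard xml.etree.ElementTree module. It loads the bookstore document and walks from the root to the children. The variable names are just for illustration.

```python
import xml.etree.ElementTree as ET

BOOKSTORE_XML = """<bookstore>
<book category="COOKING">
<title lang="en">Everyday Italian</title>
<author>Giada De Laurentiis</author>
<year>2005</year>
<price>30.00</price>
</book>
<book category="CHILDREN">
<title lang="en">Harry Potter</title>
<author>J K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>
</bookstore>"""

root = ET.fromstring(BOOKSTORE_XML)

# Iterating over an element yields its direct children, in document order.
books = list(root)
child_tags = [child.tag for child in books[0]]

print(root.tag)      # name of the root element
print(child_tags)    # names of the children of the first <book>

for book in books:
    # Attributes are read with get(), text content with findtext().
    print(book.get("category"), "-", book.findtext("title"))
```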

XML Tags are Case Sensitive

XML tags are case sensitive. The tag <Letter> is different from the tag <letter>.
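
A quick way to convince yourself is to try it with a parser. This is only a sketch with Python's standard xml.etree.ElementTree; the <mail> wrapper element and the text inside the tags are invented for the demo.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring("<mail><Letter>first</Letter><letter>second</letter></mail>")

# The two tags differ only in case, so they are two distinct element names.
upper = doc.findtext("Letter")
lower = doc.findtext("letter")
print(upper, lower)

# An opening tag must be closed with exactly the same spelling.
try:
    ET.fromstring("<Letter>oops</letter>")
    mismatch_accepted = True
except ET.ParseError as error:
    mismatch_accepted = False
    print("rejected:", error)
```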

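XML Elements Must be Properly Nested

In XML, all elements must be properly nested: an element opened inside another element must also be closed inside it.

XML Documents Must Have a Root Element

As described above, every XML document must contain exactly one root element that is the parent of all other elements.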
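
Both rules (proper nesting and a single root element) can be checked by handing a few snippets to a parser. The sketch below uses Python's standard xml.etree.ElementTree; the helper name is_well_formed and the sample strings are made up for illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

properly_nested = "<b><i>This text is bold and italic</i></b>"
badly_nested = "<b><i>This text is bold and italic</b></i>"   # closed in the wrong order
two_roots = "<note>first</note><note>second</note>"           # no single root element

for sample in (properly_nested, badly_nested, two_roots):
    print(is_well_formed(sample), sample)
```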

XML Attribute Values Must be Quoted

XML elements can have attributes in name/value pairs just like in HTML.

In XML, the attribute values must always be quoted.

Study the two XML documents below. The first one is incorrect, the second is correct:

<note date=12/11/2007>
<to>Tove</to>
<from>Jani</from>
</note>

<note date="12/11/2007">
<to>Tove</to>
<from>Jani</from>
</note>
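
If you want to see the difference for yourself, the following sketch feeds both versions to Python's standard xml.etree.ElementTree parser. The helper read_date is a hypothetical name; it simply returns None when the document is rejected.

```python
import xml.etree.ElementTree as ET

unquoted = "<note date=12/11/2007><to>Tove</to><from>Jani</from></note>"
quoted = '<note date="12/11/2007"><to>Tove</to><from>Jani</from></note>'

def read_date(xml_text):
    """Return the date attribute, or None if the document cannot be parsed."""
    try:
        return ET.fromstring(xml_text).get("date")
    except ET.ParseError:
        return None

print(read_date(unquoted))  # the unquoted version is rejected by the parser
print(read_date(quoted))    # the quoted version gives back the attribute value
```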

Comments in XML

The syntax for writing comments in XML is similar to that of HTML.

<!-- This is a comment -->

White-space is Preserved in XML

HTML truncates multiple white-space characters to one single white-space:

HTML: Hello           Tove
Output: Hello Tove

With XML, the white-space in a document is not truncated.
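
Here is a small sketch of what that means in practice, using Python's standard xml.etree.ElementTree. The <greeting> element name is invented for the demo, and the eleven spaces are built in code so that the count is easy to verify.

```python
import xml.etree.ElementTree as ET

SPACES = " " * 11
greeting = ET.fromstring(f"<greeting>Hello{SPACES}Tove</greeting>")

# The parser hands back the text exactly as written, including all spaces.
text = greeting.text
print(repr(text))

# Collapsing the spaces the way a browser displays HTML is an explicit extra step.
collapsed = " ".join(text.split())
print(repr(collapsed))
```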

XML Attributes Must be Quoted

Attribute values must always be quoted. Either single or double quotes can be used. For a person’s sex, the person element can be written like this:

<person sex="female">

or like this:

<person sex='female'>

If the attribute value itself contains double quotes you can use single quotes, like in this example:

<gangster name='George "Shotgun" Ziegler'>

XML Elements vs. Attributes

Take a look at these examples:

<person sex="female">
<firstname>Anna</firstname>
<lastname>Smith</lastname>
</person>

<person>
<sex>female</sex>
<firstname>Anna</firstname>
<lastname>Smith</lastname>
</person>

In the first example sex is an attribute. In the last, sex is an element. Both examples provide the same information.
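
From the reading side, the only difference is which call you use to fetch the value. A minimal sketch with Python's standard xml.etree.ElementTree (the variable names are mine):

```python
import xml.etree.ElementTree as ET

as_attribute = ET.fromstring(
    '<person sex="female"><firstname>Anna</firstname><lastname>Smith</lastname></person>'
)
as_element = ET.fromstring(
    "<person><sex>female</sex><firstname>Anna</firstname><lastname>Smith</lastname></person>"
)

sex_from_attribute = as_attribute.get("sex")     # attribute lookup
sex_from_element = as_element.findtext("sex")    # child element lookup

print(sex_from_attribute, sex_from_element)
```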

There are no rules about when to use attributes or when to use elements. Attributes are handy in HTML. In XML my advice is to avoid them. Use elements instead.

Avoid XML Attributes?

Some of the problems with using attributes are:

  • attributes cannot contain multiple values (elements can)
  • attributes cannot contain tree structures (elements can)
  • attributes are not easily expandable (for future changes)

Attributes are difficult to read and maintain. Use elements for data. Use attributes for information that is not relevant to the data.

Don’t end up like this:

<note day="10" month="01" year="2008"
to="Tove" from="Jani" heading="Reminder"
body="Don't forget me this weekend!">
</note>
############################################################################

Uniform Resource Identifier (URI)

Uniform Resource Identifier (URI) is a string of characters which identifies an Internet Resource.

The most common URI is the Uniform Resource Locator (URL) which identifies an Internet domain address. Another, not so common type of URI is the Uniform Resource Name (URN).

In our examples we will only use URLs.

Default Namespaces

Defining a default namespace for an element saves us from using prefixes in all the child elements. It has the following syntax:

xmlns="namespaceURI"

This XML carries HTML table information:

<table xmlns="http://www.w3.org/TR/html4/">
<tr>
<td>Apples</td>
<td>Bananas</td>
</tr>
</table>

This XML carries information about a piece of furniture:

<table xmlns="http://www.w3schools.com/furniture">
<name>African Coffee Table</name>
<width>80</width>
<length>120</length>
</table>
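
To show how a program tells the two <table> elements apart, here is a sketch with Python's standard xml.etree.ElementTree. That module reports every element name together with its namespace, written as {namespaceURI}name. The prefix "f" in the lookup is an arbitrary label I picked; it does not appear in the documents.

```python
import xml.etree.ElementTree as ET

html_table = ET.fromstring(
    '<table xmlns="http://www.w3.org/TR/html4/"><tr><td>Apples</td><td>Bananas</td></tr></table>'
)
furniture = ET.fromstring(
    '<table xmlns="http://www.w3schools.com/furniture">'
    "<name>African Coffee Table</name><width>80</width><length>120</length></table>"
)

# ElementTree stores names as "{namespaceURI}localname".
print(html_table.tag)
print(furniture.tag)

# Children inherit the default namespace, so lookups must include it.
ns = {"f": "http://www.w3schools.com/furniture"}
name = furniture.findtext("f:name", namespaces=ns)
print(name)
```

Both elements are called table, yet their full names differ, so there is no conflict.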

Namespaces in Real Use

XSLT is an XML language that can be used to transform XML documents into other formats, like HTML.

In the XSLT document below, you can see that most of the tags are HTML tags.

The tags that are not HTML tags have the prefix xsl, identified by the namespace xmlns:xsl="http://www.w3.org/1999/XSL/Transform":

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<html>
<body>
<h2>My CD Collection</h2>
<table border="1">
<tr>
<th style="text-align:left">Title</th>
<th style="text-align:left">Artist</th>
</tr>
<xsl:for-each select="catalog/cd">
<tr>
<td><xsl:value-of select="title"/></td>
<td><xsl:value-of select="artist"/></td>
</tr>
</xsl:for-each>
</table>
</body>
</html>
</xsl:template>
</xsl:stylesheet>
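
Python's standard library has no XSLT processor, so the sketch below is not XSLT. It only mimics what the xsl:for-each loop in the stylesheet does, written as a plain Python loop with xml.etree.ElementTree, to make the idea of "one table row per <cd>" concrete. The catalog document is sample data I assumed, because the post does not show the input XML.

```python
import xml.etree.ElementTree as ET

# Hypothetical input; the catalog document itself is not shown in this post.
CATALOG_XML = """<catalog>
<cd><title>Empire Burlesque</title><artist>Bob Dylan</artist></cd>
<cd><title>Hide your heart</title><artist>Bonnie Tyler</artist></cd>
</catalog>"""

catalog = ET.fromstring(CATALOG_XML)

# Build the fixed part of the table: the header row.
table = ET.Element("table", border="1")
header = ET.SubElement(table, "tr")
for label in ("Title", "Artist"):
    ET.SubElement(header, "th", style="text-align:left").text = label

# Counterpart of <xsl:for-each select="catalog/cd">: one row per <cd>.
for cd in catalog.findall("cd"):
    row = ET.SubElement(table, "tr")
    # Counterpart of <xsl:value-of select="title"/> and select="artist".
    ET.SubElement(row, "td").text = cd.findtext("title")
    ET.SubElement(row, "td").text = cd.findtext("artist")

html = ET.tostring(table, encoding="unicode")
print(html)
```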

Valid XML Documents

A valid XML document is not the same as a well formed XML document.

The first rule, for a valid XML document, is that it must be well formed (see previous paragraph).

The second rule is that a valid XML document must conform to a document type.

Rules that define the legal elements and attributes for XML documents are often called document definitions, or document schemas.

When to Use a Document Definition?

A document definition is the easiest way to provide a reference to the legal elements and attributes of a document.

A document definition also provides a common reference that many users (developers) can share.

A document definition provides a standardization that makes life easier.

When NOT to Use a Document Definition?

XML does not require a document definition.

When you are experimenting with XML, or when you are working with small XML files, creating document definitions may be a waste of time.

If you develop applications, wait until the specification is stable before you add a document definition. Otherwise your software might stop working because of validation errors.

Document Definitions

There are different types of document definitions that can be used with XML:

  • The original Document Type Definition (DTD)
  • The newer, and XML based, XML Schema

XML DTD

An XML document with correct syntax is called “Well Formed”.

An XML document validated against a DTD is “Well Formed” and “Valid”.

Valid XML Documents

A “Valid” XML document is a “Well Formed” XML document, which also conforms to the rules of a DTD:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE note SYSTEM "Note.dtd">
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don’t forget me this weekend!</body>
</note>

The DOCTYPE declaration, in the example above, is a reference to an external DTD file. The content of the file is shown in the paragraph below.

XML DTD

The purpose of a DTD is to define the structure of an XML document. It defines the structure with a list of legal elements:

<!DOCTYPE note
[
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
]>

The DTD above is interpreted like this:

  • !DOCTYPE note defines that the root element of the document is note
  • !ELEMENT note defines that the note element contains four elements: “to, from, heading, body”
  • !ELEMENT to defines the to element to be of type “#PCDATA”
  • !ELEMENT from defines the from element to be of type “#PCDATA”
  • !ELEMENT heading defines the heading element to be of type “#PCDATA”
  • !ELEMENT body defines the body element to be of type “#PCDATA”

Note: #PCDATA means parsed character data, i.e. text that the parser will read.
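
Python's standard xml.etree.ElementTree does not validate against a DTD, so the following is only a hand-written sketch of the main rules the note DTD expresses: the root element, the order of its children, and text-only content. The function name matches_note_dtd and the two sample documents are made up; a real project would use a validating parser instead.

```python
import xml.etree.ElementTree as ET

# Child order required by <!ELEMENT note (to,from,heading,body)>.
EXPECTED_CHILDREN = ["to", "from", "heading", "body"]

def matches_note_dtd(xml_text):
    """Hand-written check of the main note DTD rules; not a real DTD validator."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return False                      # not even well formed
    if root.tag != "note":
        return False                      # wrong root element
    if [child.tag for child in root] != EXPECTED_CHILDREN:
        return False                      # missing, extra, or misordered children
    # (#PCDATA) means text only, so no child may contain elements of its own.
    return all(len(child) == 0 for child in root)

good = "<note><to>Tove</to><from>Jani</from><heading>Reminder</heading><body>Hi</body></note>"
bad = "<note><from>Jani</from><to>Tove</to></note>"

print(matches_note_dtd(good), matches_note_dtd(bad))
```

The second document is well formed, but it would not be valid against the DTD because the children are in the wrong order and two of them are missing.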

Why Use a DTD?

With a DTD, each of your XML files can carry a description of its own format.

With a DTD, independent groups of people can agree on a standard for interchanging data.

With a DTD, you can verify that the data you receive from the outside world is valid.

XML Schema

XML Schema is an XML-based alternative to DTD.

In general terms, it is used to validate XML. For example, we may need to validate some data in XML format before we store it in our DB. Validation might cover syntax, tags, data types, etc. So how do you validate it? This is where XML Schema comes in:

<xs:element name="note">
<xs:complexType>
<xs:sequence>
<xs:element name="to" type="xs:string"/>
<xs:element name="from" type="xs:string"/>
<xs:element name="heading" type="xs:string"/>
<xs:element name="body" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>

The Schema above is interpreted like this:

  • <xs:element name="note"> defines the element called “note”
  • <xs:complexType> the “note” element is a complex type
  • <xs:sequence> the complex type is a sequence of elements
  • <xs:element name="to" type="xs:string"> the element “to” is of type string (text)
  • <xs:element name="from" type="xs:string"> the element “from” is of type string
  • <xs:element name="heading" type="xs:string"> the element “heading” is of type string
  • <xs:element name="body" type="xs:string"> the element “body” is of type string

Everything is wrapped in “Well Formed” XML.

XML Schemas are More Powerful than DTD

  • XML Schemas are written in XML
  • XML Schemas are extensible to additions
  • XML Schemas support data types
  • XML Schemas support namespaces

Why Use an XML Schema?

With XML Schema, each of your XML files can carry a description of its own format.

With XML Schema, independent groups of people can agree on a standard for interchanging data.

With XML Schema, you can verify data.

XML Schemas Support Data Types

One of the greatest strengths of XML Schemas is the support for data types:

  • It is easier to describe document content
  • It is easier to define restrictions on data
  • It is easier to validate the correctness of data
  • It is easier to convert data between different data types
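
To give a feel for why declared data types help, here is a hand-rolled sketch in Python. It is not an XML Schema validator. It only maps a few schema type names to Python converters and applies them to the text of each element. The mapping, the <published> element, and its date are assumptions I added for the example.

```python
from datetime import date
from decimal import Decimal
import xml.etree.ElementTree as ET

# Hypothetical mapping from a few XML Schema type names to Python converters.
CONVERTERS = {
    "xs:string": str,
    "xs:integer": int,
    "xs:decimal": Decimal,
    "xs:date": date.fromisoformat,
}

# Hypothetical declaration of which type each element should have.
BOOK_TYPES = {"title": "xs:string", "year": "xs:integer",
              "price": "xs:decimal", "published": "xs:date"}

def typed_values(xml_text, types):
    """Convert each child's text to the Python type declared for it.

    Raises ValueError (or decimal.InvalidOperation) if the text does not fit.
    """
    root = ET.fromstring(xml_text)
    return {child.tag: CONVERTERS[types[child.tag]](child.text) for child in root}

book = typed_values(
    "<book><title>Everyday Italian</title><year>2005</year>"
    "<price>30.00</price><published>2005-03-01</published></book>",
    BOOK_TYPES,
)
print(book)
```

With the types written down in one place, both checking the data and converting it become a simple lookup, which is the convenience a schema gives you.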

Reference: w3schools
