XML Schemas

XML Schema

XML Schema is an XML-based alternative to DTD.

In general terms, it is used to validate XML. For example, suppose we need to validate some data in XML format before we store it in our database. Validation might cover the document structure, the allowed tags, data types and so on. How would you validate it? This is where XML Schema comes in.

<xs:element name="note">
    <xs:complexType>
        <xs:sequence>
            <xs:element name="to" type="xs:string"/>
            <xs:element name="from" type="xs:string"/>
            <xs:element name="heading" type="xs:string"/>
            <xs:element name="body" type="xs:string"/>
        </xs:sequence>
    </xs:complexType>
</xs:element>

The Schema above is interpreted like this:

  • <xs:element name="note"> defines the element called “note”
  • <xs:complexType> the “note” element is a complex type
  • <xs:sequence> the complex type is a sequence of elements
  • <xs:element name="to" type="xs:string"> the element “to” is of type string (text)
  • <xs:element name="from" type="xs:string"> the element “from” is of type string
  • <xs:element name="heading" type="xs:string"> the element “heading” is of type string
  • <xs:element name="body" type="xs:string"> the element “body” is of type string

Everything is wrapped in well-formed XML.
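
As a minimal sketch, the fragment above could be made into a complete schema document by wrapping it in an xs:schema root element that binds the xs prefix to the XML Schema namespace. The file name note.xsd is an assumption made for illustration, and the schema is assumed to have no targetNamespace:

note.xsd (hypothetical file name)

<?xml version="1.0"?>
<!-- The root element binds the xs prefix used by every declaration below -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xs:element name="note">
        <xs:complexType>
            <xs:sequence>
                <xs:element name="to" type="xs:string"/>
                <xs:element name="from" type="xs:string"/>
                <xs:element name="heading" type="xs:string"/>
                <xs:element name="body" type="xs:string"/>
            </xs:sequence>
        </xs:complexType>
    </xs:element>
</xs:schema>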

XML Schemas are More Powerful than DTD

  • XML Schemas are written in XML
  • XML Schemas are extensible to future additions
  • XML Schemas support data types
  • XML Schemas support namespaces

Why Use an XML Schema?

With XML Schema, your XML files can carry a description of their own format.

With XML Schema, independent groups of people can agree on a standard for interchanging data.

With XML Schema, you can verify data.
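
One way an XML file can point to the description of its own format is the xsi:noNamespaceSchemaLocation attribute, which gives a validator a hint about where to find a schema that has no targetNamespace. The sketch below assumes the “note” schema shown earlier has been saved as note.xsd (a hypothetical file name) next to the instance file; the element values are made up:

note.xml (hypothetical file name)

<?xml version="1.0"?>
<!-- xsi:noNamespaceSchemaLocation is only a hint; a validator may ignore it -->
<note xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:noNamespaceSchemaLocation="note.xsd">
    <to>Alice</to>
    <from>Bob</from>
    <heading>Reminder</heading>
    <body>Please review the schema before Friday.</body>
</note>

A validator checking this file against note.xsd would reject it if, for example, the heading element were missing or the elements appeared in a different order, because the schema declares them as a sequence.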

XML Schemas Support Data Types

One of the greatest strengths of XML Schemas is the support for data types:

  • It is easier to describe document content
  • It is easier to define restrictions on data
  • It is easier to validate the correctness of data
  • It is easier to convert data between different data types
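
To make the points above concrete, here is a small sketch of two declarations that rely on built-in data types. The element names orderDate and age are hypothetical, and the fragment assumes the xs prefix is bound to http://www.w3.org/2001/XMLSchema in the enclosing schema:

<!-- The content must be a valid date such as 2024-01-31 -->
<xs:element name="orderDate" type="xs:date"/>

<!-- An integer restricted to the range 0 to 120 -->
<xs:element name="age">
    <xs:simpleType>
        <xs:restriction base="xs:integer">
            <xs:minInclusive value="0"/>
            <xs:maxInclusive value="120"/>
        </xs:restriction>
    </xs:simpleType>
</xs:element>

With these declarations, a validator would reject <age>abc</age> and <age>150</age>, something a DTD cannot express because it treats element content as plain text.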

 

More about XML Schemas:

In a typical project many schemas will be created. The schema designer is then confronted with this issue: “shall I define one targetNamespace for all the schemas, or shall I create a different targetNamespace for each schema, or shall I have some schemas with no targetNamespace?” What are the tradeoffs? What guidance would you give someone starting on a project that will create multiple schemas?

Here are the three design approaches for dealing with this issue:

     [1] Heterogeneous Namespace Design: 
         give each schema a different targetNamespace
     [2] Homogeneous Namespace Design:
         give all schemas the same targetNamespace
     [3] Chameleon Namespace Design: 
         give the "main" schema a targetNamespace and give no 
         targetNamespace to the "supporting" schemas (the no-namespace supporting 
         schemas will take on the targetNamespace of the main schema, just
         like a chameleon)

To describe and judge the merits of the three design approaches it will be useful to take an example and see each approach “in action”.

Example: XML Data Model of a Company

Imagine a project which involves creating a model of a company using XML Schemas. One very simple model is to divide the schema functionality along these lines:

Company schema
   Person schema
   Product schema

“A company is comprised of people and products.”

Here are the company, person, and product schemas using the three design approaches.

[1] Heterogeneous Namespace Design

This design approach says to give each schema a different targetNamespace. Below are the three schemas designed using this design approach. Observe that each schema has a different targetNamespace.

Product.xsd

<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.product.org"
            xmlns="http://www.product.org"
            elementFormDefault="unqualified">
    <xsd:complexType name="ProductType">
        <xsd:sequence>
           <xsd:element name="Type" type="xsd:string"/>
        </xsd:sequence>
    </xsd:complexType>
</xsd:schema>

Person.xsd

<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.person.org"
            xmlns="http://www.person.org"
            elementFormDefault="unqualified">
    <xsd:complexType name="PersonType">
        <xsd:sequence>
           <xsd:element name="Name" type="xsd:string"/>
           <xsd:element name="SSN" type="xsd:string"/>
        </xsd:sequence>
    </xsd:complexType>
</xsd:schema>

Company.xsd

<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.company.org"
            xmlns="http://www.company.org"
            elementFormDefault="unqualified"
            xmlns:per="http://www.person.org"
            xmlns:pro="http://www.product.org">
    <xsd:import namespace="http://www.person.org"
                schemaLocation="Person.xsd"/>
    <xsd:import namespace="http://www.product.org"
                schemaLocation="Product.xsd"/>
    <xsd:element name="Company">
        <xsd:complexType>
            <xsd:sequence>
                <xsd:element name="Person" type="per:PersonType" 
                             maxOccurs="unbounded"/>
                <xsd:element name="Product" type="pro:ProductType" 
                             maxOccurs="unbounded"/>
            </xsd:sequence>
        </xsd:complexType>
    </xsd:element>
</xsd:schema>

Note the three namespaces that were created by the schemas:

   http://www.product.org
   http://www.person.org
   http://www.company.org
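
As a sketch of what an instance document for this design might look like (the values are made up): because all three schemas set elementFormDefault="unqualified", only the globally declared Company element is in a namespace, while the locally declared Person, Product, Name, SSN and Type elements are in no namespace. A prefix is used on the root instead of a default namespace so that the child elements stay unqualified:

Company.xml (hypothetical instance document)

<?xml version="1.0"?>
<c:Company xmlns:c="http://www.company.org">
    <!-- Local elements are unqualified, so they carry no prefix -->
    <Person>
        <Name>John Doe</Name>
        <SSN>123-45-6789</SSN>
    </Person>
    <Product>
        <Type>Widget</Type>
    </Product>
</c:Company>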

[2] Homogeneous Namespace Design

This design approach says to create a single, umbrella targetNamespace for all the schemas. Below are the three schemas designed using this approach. Observe that all schemas have the same targetNamespace.

Product.xsd

<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.company.org"
            xmlns="http://www.company.org"
            elementFormDefault="qualified">
    <xsd:complexType name="ProductType">
        <xsd:sequence>
           <xsd:element name="Type" type="xsd:string"/>
        </xsd:sequence>
    </xsd:complexType>
</xsd:schema>

Person.xsd

<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.company.org"
            xmlns="http://www.company.org"
            elementFormDefault="qualified">
    <xsd:complexType name="PersonType">
        <xsd:sequence>
           <xsd:element name="Name" type="xsd:string"/>
           <xsd:element name="SSN" type="xsd:string"/>
        </xsd:sequence>
    </xsd:complexType>
</xsd:schema>

Company.xsd

<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.company.org"
            xmlns="http://www.company.org"
            elementFormDefault="qualified">
    <xsd:include schemaLocation="Person.xsd"/>
    <xsd:include schemaLocation="Product.xsd"/>
    <xsd:element name="Company">
        <xsd:complexType>
            <xsd:sequence>
                <xsd:element name="Person" type="PersonType"
                             maxOccurs="unbounded"/>
                <xsd:element name="Product" type="ProductType"
                             maxOccurs="unbounded"/>
            </xsd:sequence>
        </xsd:complexType>
    </xsd:element>
</xsd:schema>

Note that all three schemas have the same targetNamespace:

   http://www.company.org

Also note the mechanism used for accessing components in other schemas which have the same targetNamespace: <include>. When accessing components in a schema with a different namespace the <import> element is used, as we saw above in the Heterogeneous Design.
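
As a sketch of a matching instance document (the values are made up): all three schemas share one targetNamespace and set elementFormDefault="qualified", so every element, including the locally declared ones, is in http://www.company.org. A single default namespace declaration on the root is therefore enough:

Company.xml (hypothetical instance document)

<?xml version="1.0"?>
<!-- The default namespace applies to Company and to all of its descendants -->
<Company xmlns="http://www.company.org">
    <Person>
        <Name>John Doe</Name>
        <SSN>123-45-6789</SSN>
    </Person>
    <Product>
        <Type>Widget</Type>
    </Product>
</Company>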

[3] Chameleon Namespace Design

This design approach says to give the “main” schema a targetNamespace and to give the “supporting” schemas no targetNamespace. In our example, the company schema is the main schema. The person and product schemas are supporting schemas. Below are the three schemas using this design approach:

Product.xsd (no targetNamespace)

<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            elementFormDefault="qualified">
    <xsd:complexType name="ProductType">
        <xsd:sequence>
           <xsd:element name="Type" type="xsd:string"/>
        </xsd:sequence>
    </xsd:complexType>
</xsd:schema>

Person.xsd (no targetNamespace)

<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            elementFormDefault="qualified">
    <xsd:complexType name="PersonType">
        <xsd:sequence>
           <xsd:element name="Name" type="xsd:string"/>
           <xsd:element name="SSN" type="xsd:string"/>
        </xsd:sequence>
    </xsd:complexType>
</xsd:schema>

Company.xsd (main schema, uses the no-namespace-schemas)

<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.company.org"
            xmlns="http://www.company.org"
            elementFormDefault="qualified">
    <xsd:include schemaLocation="Person.xsd"/>
    <xsd:include schemaLocation="Product.xsd"/>
    <xsd:element name="Company">
        <xsd:complexType>
            <xsd:sequence>
                <xsd:element name="Person" type="PersonType" 
                             maxOccurs="unbounded"/>
                <xsd:element name="Product" type="ProductType"
                             maxOccurs="unbounded"/>
            </xsd:sequence>
        </xsd:complexType>
    </xsd:element>
</xsd:schema>

There are two things to note about this design approach:

First, as shown above, a schema is able to access components in schemas that have no targetNamespace, using <include>. In our example, the company schema uses the components in Product.xsd and Person.xsd (and they have no targetNamespace).

Second, note the chameleon-like characteristics of schemas with no targetNamespace:

  • The components in the schemas with no targetNamespace get namespace-coerced. That is, the components “take on” the targetNamespace of the schema that is doing the <include>.
    • For example, ProductType in Product.xsd gets implicitly coerced into the company targetNamespace. This is the reason that the Product element was able to reference ProductType in the default namespace using type="ProductType". Ditto for PersonType in Person.xsd.

“Chameleon effect” is a term coined by Henry Thompson to describe the ability of components in a schema with no targetNamespace to take on the namespace of the schema that includes them. This is powerful!
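
To illustrate why this is useful, here is a sketch of a second main schema that reuses the same no-namespace Product.xsd under a different targetNamespace. The Catalog schema and the http://www.catalog.org namespace are hypothetical and do not appear in the example project; they are assumptions made only to show the reuse:

Catalog.xsd (hypothetical second main schema)

<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.catalog.org"
            xmlns="http://www.catalog.org"
            elementFormDefault="qualified">
    <!-- Product.xsd has no targetNamespace, so here its components
         are coerced into http://www.catalog.org -->
    <xsd:include schemaLocation="Product.xsd"/>
    <xsd:element name="Catalog">
        <xsd:complexType>
            <xsd:sequence>
                <xsd:element name="Product" type="ProductType"
                             maxOccurs="unbounded"/>
            </xsd:sequence>
        </xsd:complexType>
    </xsd:element>
</xsd:schema>

The same ProductType definition ends up in the company namespace when Company.xsd includes it and in the catalog namespace when Catalog.xsd includes it, without any change to Product.xsd.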
