COP 4814
Florida International University
Kip Irvine
XML Schema Basics, and Defining
Simple Types
Updated: 2/23/2016
Based on Goldberg, Chapters 9 & 10
Irvine COP 4814
XML Schema Overview
• Also known as XML Schema Definition (XSD)
• Specifies the structure of valid XML documents
– defines a set of elements, their relationships to each
other, and the attributes that they can contain.
• Designed to address shortcomings of DTDs
– has a system of data types
– lets you define global and local elements
– likely to replace DTDs in the future as the standard
schema language
Latest info: http://w3.org/XML/Schema
Data Type Categories
• Atomic type
– XML element that only contains text
• List type
– collection of items
• Complex type
– XML element that contains child elements and/or
attributes
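The three categories can be sketched side by side. This is only an illustration: the fragments are assumed to sit inside an xs:schema element, and the visit_dates element and the id attribute are hypothetical names, not part of the course example.

```xml
<!-- Atomic: a single, indivisible text value -->
<xs:element name="height" type="xs:integer"/>

<!-- List: whitespace-separated items of one atomic type -->
<xs:element name="visit_dates">
  <xs:simpleType>
    <xs:list itemType="xs:date"/>
  </xs:simpleType>
</xs:element>

<!-- Complex: child elements and/or attributes -->
<xs:element name="wonder">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="name" type="xs:string"/>
    </xs:sequence>
    <xs:attribute name="id" type="xs:integer"/>
  </xs:complexType>
</xs:element>
```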
Sample XML File & Schema
<?xml version="1.0"?>
<wonder>
<name>Colossus of Rhodes</name>
<location>Greece</location>
<height>107</height>
</wonder>
Incomplete schema file (the xs:schema root element is still missing):
<?xml version="1.0"?>
<xs:element name="wonder">
<xs:complexType>
<xs:sequence>
<xs:element name="name" type="xs:string"/>
<xs:element name="location" type="xs:string"/>
<xs:element name="height" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
Complete Schema File
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="wonder">
<xs:complexType>
<xs:sequence>
<xs:element name="name" type="xs:string"/>
<xs:element name="location" type="xs:string"/>
<xs:element name="height" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
Linking the XML Document
To the XML Schema file:
<?xml version="1.0"?>
<wonder
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="09-06.xsd">
<name>Colossus of Rhodes</name>
<location>Greece</location>
<height>107</height>
</wonder>
• The schema location can be a web location, network path, or local file.
• Visual Studio doesn’t need these extra attributes; you can just assign
a value to the Schemas property of the XML file.
XML Annotations
Structured comments that can be processed by XML
parsers. Can appear multiple times at the top level of the
schema, and once at the start of most other schema elements.
<xs:annotation>
<xs:documentation>
This XML Schema will be used to validate
documents from the student registration system.
</xs:documentation>
</xs:annotation>
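A minimal sketch of where annotations typically go: one at the top of the schema and one inside an element declaration. The student_name element and both documentation texts are hypothetical; xml:lang is an optional attribute of xs:documentation.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Describes the schema as a whole -->
  <xs:annotation>
    <xs:documentation xml:lang="en">
      Schema for the student registration system.
    </xs:documentation>
  </xs:annotation>

  <!-- Describes a single element; the annotation comes first -->
  <xs:element name="student_name" type="xs:string">
    <xs:annotation>
      <xs:documentation>Full name of the student.</xs:documentation>
    </xs:annotation>
  </xs:element>
</xs:schema>
```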
Atomic Datatypes
• Contain a single value (cannot be divided).
• Based on one of the built-in types.
– Example: type="xs:string"
• Restrictions can be included on their ranges and
character patterns, allowing you to create subtypes.
• Type categories:
– string, integer, boolean, date, decimal, etc.
Simple Types
Some of the more common types:
<xs:element name="height"
type="xs:string"/>
<xs:element name="year_built"
type="xs:integer"/>
<xs:element name="cost"
type="xs:decimal"/>
<xs:element name="is_standing"
type="xs:boolean"/>
<xs:element name="image"
type="xs:anyURI"/>
xs:anyURI holds a web location, network path, or local file.
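For illustration, an XML document might supply values like these for the elements declared above. Apart from the height taken from the earlier sample file, the values are made up, and the image URL is a placeholder.

```xml
<height>107</height>                    <!-- xs:string: any text -->
<year_built>1965</year_built>           <!-- xs:integer: whole number, sign allowed -->
<cost>1250000.50</cost>                 <!-- xs:decimal: digits with optional fraction -->
<is_standing>false</is_standing>        <!-- xs:boolean: true, false, 1, or 0 -->
<image>http://example.com/colossus.jpg</image>   <!-- xs:anyURI -->
```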
Standard Date/Time Formats
• xs:date:
yyyy-mm-dd "2005-04-26"
• xs:time:
hh:mm:ss "16:21:00"
• xs:dateTime:
yyyy-mm-ddThh:mm:ss
"2005-04-26T16:21:00"
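A short sketch pairing each date/time type with a matching value. The element names (enroll_date, class_start, last_update) are hypothetical. All three types also accept an optional time zone suffix such as Z or -05:00.

```xml
<!-- Schema declarations -->
<xs:element name="enroll_date" type="xs:date"/>
<xs:element name="class_start" type="xs:time"/>
<xs:element name="last_update" type="xs:dateTime"/>

<!-- Matching values in the XML document -->
<enroll_date>2005-04-26</enroll_date>
<class_start>16:21:00</class_start>
<last_update>2005-04-26T16:21:00</last_update>
```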
xs:duration
• xs:duration:
– PnYnMnDTnHnMnS
• 3 months, 4 days, 6 hours, 17 minutes:
– "P3M4DT6H17M"
• 90 days:
– "P90D"
• 4 days and 6 hours:
– "P4DT6H"
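As a sketch, a duration element could be declared and used like this. The element names time_to_degree and exam_length are hypothetical. Note that the T separator is required before any hour, minute, or second part, which is how P1M (one month) is told apart from PT1M (one minute).

```xml
<!-- Schema declarations -->
<xs:element name="time_to_degree" type="xs:duration"/>
<xs:element name="exam_length" type="xs:duration"/>

<!-- Matching values in the XML document -->
<time_to_degree>P4Y</time_to_degree>    <!-- 4 years -->
<exam_length>PT90M</exam_length>        <!-- 90 minutes; "P90M" would mean 90 months -->
```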
Other Date Types
Type             Example      Meaning
xs:gYear         "1965"       the year 1965
xs:gMonth        "--04"       April
xs:gYearMonth    "1965-04"    April 1965
xs:gMonthDay     "--04-26"    April 26
xs:gDay          "---26"      26th day
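A brief sketch of two of these partial-date types in use. The element names graduation_year and birthday are hypothetical.

```xml
<!-- Schema declarations -->
<xs:element name="graduation_year" type="xs:gYear"/>
<xs:element name="birthday" type="xs:gMonthDay"/>

<!-- Matching values in the XML document -->
<graduation_year>1965</graduation_year>
<birthday>--04-26</birthday>
```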
Custom Type
• Identify the XML element you wish to define
• The "base" attribute identifies an existing type.
• Example: string of <= 1024 characters (this custom type is anonymous):
<xs:element name="story">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:maxLength value="1024"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
Named Custom Type
• Alternatively, you can add a name attribute to the
xs:simpleType element. A named type must be declared at the
top level of the schema, not nested inside an xs:element:
<xs:simpleType name="story_type">
<xs:restriction base="xs:string">
<xs:maxLength value="1024"/>
</xs:restriction>
</xs:simpleType>
Applying a Custom Type
• The same named type can be used multiple times:
<xs:element name="story" type="story_type"/>
<xs:element name="summary" type="story_type"/>
<xs:element name="another_story" type="story_type"/>
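Putting the pieces together, a complete schema that defines the named type once and reuses it might look like the sketch below. It assumes the 1024-character limit is expressed with xs:maxLength, and that story and summary are children of the wonder element; that arrangement is an assumption made for illustration.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Named type: declared once at the top level -->
  <xs:simpleType name="story_type">
    <xs:restriction base="xs:string">
      <xs:maxLength value="1024"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Two elements share the same named type -->
  <xs:element name="wonder">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="story" type="story_type"/>
        <xs:element name="summary" type="story_type"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```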
Limiting Values to a Range
• Use xs:minInclusive and xs:maxInclusive:
<xs:element name="student_age">
<xs:simpleType>
<xs:restriction base="xs:integer">
<xs:minInclusive value="10"/>
<xs:maxInclusive value="120"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
Also possible:
xs:minExclusive, xs:maxExclusive
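The exclusive facets work the same way but leave out the boundary values. A small sketch, using a hypothetical credit_hours element that accepts 1 through 21:

```xml
<xs:element name="credit_hours">
  <xs:simpleType>
    <xs:restriction base="xs:integer">
      <!-- greater than 0 and less than 22, so 1..21 -->
      <xs:minExclusive value="0"/>
      <xs:maxExclusive value="22"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>
```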
Set of Possible Values
Use xs:enumeration
<xs:element name="student_level">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:enumeration value="undergraduate"/>
<xs:enumeration value="graduate"/>
<xs:enumeration value="unclassified"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
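To see the effect of the enumeration, consider how a validator would treat a few sample values. Matching is exact and case-sensitive; the rejected values below are illustrative.

```xml
<student_level>graduate</student_level>    <!-- valid -->
<student_level>Graduate</student_level>    <!-- invalid: wrong case -->
<student_level>doctoral</student_level>    <!-- invalid: not in the list -->
```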
Specifying an Exact Length
Example:
<xs:element name="panther_id">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:length value="7"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
Also possible:
xs:minLength, xs:maxLength
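A sketch of the two length bounds used together, on a hypothetical user_name element that must be between 3 and 20 characters long:

```xml
<xs:element name="user_name">
  <xs:simpleType>
    <xs:restriction base="xs:string">
      <xs:minLength value="3"/>
      <xs:maxLength value="20"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>
```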
Specifying a Matching Pattern
• Uses regular expression syntax
• Suppose the account_id element must contain
AB, followed by one or more digits:
<xs:element name="account_id">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:pattern value="AB\d+"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
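Patterns can also fix the number of characters. The sketch below uses a hypothetical course_code element that must look like "COP 4814". A schema pattern always has to match the entire value, so no ^ or $ anchors are needed.

```xml
<xs:element name="course_code">
  <xs:simpleType>
    <xs:restriction base="xs:string">
      <!-- three uppercase letters, one space, four digits -->
      <xs:pattern value="[A-Z]{3} \d{4}"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>
```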
Deriving a List Type
• When the XML file contains a list of values
– Example: dates when a student declared or
changed majors (this custom type is anonymous):
<xs:element name="catalog_dates">
<xs:simpleType>
<xs:list itemType="xs:date"/>
</xs:simpleType>
</xs:element>
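In the XML document, the items of a list are separated by whitespace inside a single element. The dates below are made-up sample values.

```xml
<!-- three xs:date items in one element -->
<catalog_dates>2013-08-26 2014-01-06 2015-05-11</catalog_dates>
```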
Deriving a Named List Type
Use a named type if you plan to apply it more than
once.
<xs:simpleType name="dateList">
<xs:list itemType="xs:date"/>
</xs:simpleType>
.
<!-- create instances: -->
<xs:element name="catalog_dates"
type="dateList"/>
<xs:element name="enrollment_dates"
type="dateList"/>
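A named list type can itself be restricted. In the sketch below, shortDateList is a hypothetical type that limits dateList to at most three items; on a list type, the length facets count items rather than characters.

```xml
<xs:simpleType name="shortDateList">
  <xs:restriction base="dateList">
    <!-- no more than three dates in the list -->
    <xs:maxLength value="3"/>
  </xs:restriction>
</xs:simpleType>
```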
Summary
• Use built-in types when you have no particular
restrictions on the values
• Use simple (atomic) derived types to control ranges
and lengths
• Use list types for whitespace-separated lists of values in a single element
• Chapter 11 explains how to create complex types.