Tech Area - XML Layout
New sample DTD for discussion
Posted 6th September 2000
This is the latest draft specification. The layout below describes the key objects we
have identified and the properties they must have, may have and cannot have. In doing so
we define the relationships between these objects and how they behave.
To see sample data in this layout click here
The DTD, or Document Type Definition, sets the rules to which the XML data file
must conform. To see the DTD for the current XML file specification, click here
Design Notes
The design of the XML file has been altered significantly from previous
versions to reduce overall file size and to avoid duplication of information
within the XML file. A graphic schematic of the XML tree structure is shown
below.

xml
This node is the document root and can contain zero or one inuit_data nodes.

inuit_data
This node is the root for the data in the file.
- Required attributes
  - authority - the person responsible for the classification and
    descriptions in this file e.g. Mel Austen
  - master - this has a value restricted by the DTD to either "Y"
    or "N" and is a flag to tell us whether this file is the master XML data file or
    a slave file from another source/author
- Optional attributes
  - master_url - this attribute tells us the location of the master XML
    document if this file is a slave e.g.
    "http://www.pml-nematode.org.uk/tech/inuitData.xml"
- Contents
  - 0 or more character, character_state and taxa nodes
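
The rules for inuit_data can be checked with a few lines of Python. What follows is a minimal sketch, assuming the file is read with the standard library's xml.etree.ElementTree. The function name check_inuit_data and the sample document are illustrative only; the attribute values are taken from the examples in the text.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the layout follows the description above and the
# attribute values reuse the examples given in the text.
SAMPLE = """\
<xml>
  <inuit_data authority="Mel Austen" master="N"
              master_url="http://www.pml-nematode.org.uk/tech/inuitData.xml"/>
</xml>
"""

def check_inuit_data(root):
    """Return a list of problems found under the xml root (empty if none)."""
    problems = []
    nodes = root.findall("inuit_data")
    if len(nodes) > 1:
        problems.append("more than one inuit_data node")
    for node in nodes:
        # authority and master are required; master_url is optional.
        for attr in ("authority", "master"):
            if attr not in node.attrib:
                problems.append(f"missing required attribute: {attr}")
        if node.get("master") not in (None, "Y", "N"):
            problems.append("master must be Y or N")
        # Only character, character_state and taxa nodes are allowed here.
        for child in node:
            if child.tag not in ("character", "character_state", "taxa"):
                problems.append(f"unexpected child: {child.tag}")
    return problems

print(check_inuit_data(ET.fromstring(SAMPLE)))  # prints []
```

A validating parser working from the DTD would report the same problems; this check is only a convenience for tools that do not validate.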

taxa
A taxa node (or object) represents a lifeform and contains all the
details that pertain to it, such as description(s), classification, image(s) etc.
- Required attributes
  - name - this is a unique identifier for this taxon e.g.
    Acantholaimus. It is an ID attribute.
- Optional attributes: none
- Contents
  - 1 or more description nodes

character_state
A character state describes a type of character within a character group. For example,
a character group could be tail and a character state could be long tail. A character
state can be considered a recognisable characteristic or feature of a given taxon.
- Required attributes
  - chst_id - this is a unique identifier for this character state e.g.
    amphid44. It is an ID attribute.
  - difficulty - this attribute has a value restricted by the DTD to easy, medium
    or difficult and is a measure of the ease of identifying this character state
- Optional attributes: none
- Contents
  - 1 or more char and text nodes
  - 0 or more image nodes
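
As an illustration, a character_state node and a check of its rules might look like the sketch below. The chst_id and ch_ref values come from the examples in the text; the description wording, the lang value, the image file name and the function name check_character_state are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample node; only chst_id and ch_ref reuse values from the text.
SAMPLE = """\
<character_state chst_id="amphid44" difficulty="medium">
  <char ch_ref="amphid"/>
  <text lang="en">Placeholder wording for this character state.</text>
  <image src="amphid44.gif"/>
</character_state>
"""

DIFFICULTIES = {"easy", "medium", "difficult"}

def check_character_state(node):
    """Return a list of problems with one character_state node."""
    problems = []
    if not node.get("chst_id"):
        problems.append("missing chst_id")
    if node.get("difficulty") not in DIFFICULTIES:
        problems.append("difficulty must be easy, medium or difficult")
    # char and text must each appear at least once; image is optional.
    for tag in ("char", "text"):
        if node.find(tag) is None:
            problems.append(f"needs at least one {tag} node")
    return problems

print(check_character_state(ET.fromstring(SAMPLE)))  # prints []
```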

character
Each character state must contain a reference to a character object, which describes more
generally what part of the animal it is. The reference is made using the ch_ref
attribute of a char node, an IDREF attribute intended to match the ch_id attribute
in this node.
- Required attributes
  - ch_id - this is a unique identifier for this character e.g. amphid. It
    is an ID attribute.
- Optional attributes: none
- Contents: none

description
A description node holds information describing a taxon. This DTD permits
alternative/multiple descriptions for a given taxon.
- Required attributes
  - desc_id - this is a unique identifier for this description and is an ID
    attribute.
- Optional attributes: none
- Contents
  - 1 or more text and ch_st nodes
  - 0 or more image nodes

char
The char node tells us which character group a character state belongs
to e.g. is it a tail or a mouth.
- Required attributes
  - ch_ref - this attribute tells us what character group this character
    state belongs to. It is an IDREF attribute which refers to the ch_id
    attribute in a character node.
- Optional attributes: none
- Contents: none
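
A DTD validator only guarantees that an IDREF value matches some ID in the document, not that it matches a ch_id in particular. The sketch below applies the stricter rule described here. It is an assumption-laden example: the function name unresolved_ch_refs, the tail01 identifier and the description wording are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: "tail" is referenced but never declared as a character.
SAMPLE = """\
<inuit_data authority="Mel Austen" master="Y">
  <character ch_id="amphid"/>
  <character_state chst_id="amphid44" difficulty="medium">
    <char ch_ref="amphid"/>
    <text>Placeholder wording.</text>
  </character_state>
  <character_state chst_id="tail01" difficulty="easy">
    <char ch_ref="tail"/>
    <text>Placeholder wording.</text>
  </character_state>
</inuit_data>
"""

def unresolved_ch_refs(root):
    """Return ch_ref values that match no ch_id on a character node."""
    known = {c.get("ch_id") for c in root.iter("character")}
    return [c.get("ch_ref") for c in root.iter("char")
            if c.get("ch_ref") not in known]

print(unresolved_ch_refs(ET.fromstring(SAMPLE)))  # prints ['tail']
```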

ch_st
The ch_st node tells us which character states make up the current
description.
- Required attributes
  - chst_ref - this attribute tells us which character state is being
    referenced. It is an IDREF attribute which references the chst_id
    attribute of a character_state node.
- Optional attributes: none
- Contents: none
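
The chain of references (description to ch_st, ch_st to character_state, char to character) can be followed in code. The sketch below is one possible way to do it, assuming every reference resolves. The function name characters_for_taxon, the desc1 identifier and the placeholder wording are invented for the example; the other values reuse the examples in the text.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document built from the examples in the text.
SAMPLE = """\
<xml>
  <inuit_data authority="Mel Austen" master="Y">
    <character ch_id="amphid"/>
    <character_state chst_id="amphid44" difficulty="medium">
      <char ch_ref="amphid"/>
      <text>Placeholder wording for the character state.</text>
    </character_state>
    <taxa name="Acantholaimus">
      <description desc_id="desc1">
        <text>Placeholder wording for the taxon.</text>
        <ch_st chst_ref="amphid44"/>
      </description>
    </taxa>
  </inuit_data>
</xml>
"""

def characters_for_taxon(root, taxon_name):
    """Map each character state used by a taxon to its character groups."""
    # Index character states by chst_id so that chst_ref can be looked up.
    states = {s.get("chst_id"): s for s in root.iter("character_state")}
    result = {}
    for taxon in root.iter("taxa"):
        if taxon.get("name") != taxon_name:
            continue
        for ref in taxon.iter("ch_st"):
            # Raises KeyError if chst_ref does not resolve.
            state = states[ref.get("chst_ref")]
            result[state.get("chst_id")] = [
                c.get("ch_ref") for c in state.findall("char")
            ]
    return result

root = ET.fromstring(SAMPLE)
print(characters_for_taxon(root, "Acantholaimus"))  # {'amphid44': ['amphid']}
```

Because a description only stores references, the wording and images for a character state are held once in the file, however many taxa use it.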

text
The text node holds a text description for a taxon or character state.
- Required attributes: none
- Optional attributes
  - lang - the language the description is written in
- Contents
  - character data (the description itself)

image
The image node holds the source URL for an image of a taxon or
character state.
- Required attributes
  - src - a source URL for the image (text)
- Optional attributes: none
- Contents: none

Below is a summary in tabular form of the information above...
Element Types Table
| Element Type | Extends | Text | Elem. | Content Model | Attributes |
|---|---|---|---|---|---|
| ch_st | | | | | chst_ref |
| char | | | | | ch_ref |
| character | | | | | ch_id |
| character_state | | | X | (char+ , text+ , image* ) | chst_id, difficulty |
| description | | | X | (text+ , ch_st+ , image* ) | desc_id |
| image | | | | | src |
| inuit_data | | | X | (character* , character_state* , taxa* ) | authority, master, master_url |
| taxa | | | X | (description+ ) | name |
| text | | X | | | lang |
| xml | | | X | (inuit_data? ) | |
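
The content models in this table are sequences, so the order of child nodes matters. One rough way to test a document against them without a validating parser is sketched below: each content model is rewritten as a regular expression over the list of child element names. The dictionary CONTENT_MODELS, the function name content_model_errors and the sample document are illustrative assumptions, and elements with no content model are assumed to have no child elements.

```python
import re
import xml.etree.ElementTree as ET

# Content models from the table, as regular expressions over a
# space-terminated list of child tag names.
CONTENT_MODELS = {
    "xml": r"(inuit_data )?",
    "inuit_data": r"(character )*(character_state )*(taxa )*",
    "taxa": r"(description )+",
    "description": r"(text )+(ch_st )+(image )*",
    "character_state": r"(char )+(text )+(image )*",
}

def content_model_errors(root):
    """Return the tags of elements whose children break the content model."""
    errors = []
    for node in root.iter():
        # Elements missing from the dictionary must have no child elements.
        pattern = CONTENT_MODELS.get(node.tag, r"")
        children = "".join(child.tag + " " for child in node)
        if not re.fullmatch(pattern, children):
            errors.append(node.tag)
    return errors

# Hypothetical document with two ordering mistakes: taxa appears before
# character, and ch_st appears before text.
SAMPLE = """\
<xml>
  <inuit_data authority="Mel Austen" master="Y">
    <taxa name="Acantholaimus">
      <description desc_id="desc1">
        <ch_st chst_ref="amphid44"/>
        <text>Placeholder wording.</text>
      </description>
    </taxa>
    <character ch_id="amphid"/>
  </inuit_data>
</xml>
"""

print(content_model_errors(ET.fromstring(SAMPLE)))
# prints ['inuit_data', 'description']
```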
Attribute Types Table
| Attribute Name | Element | Data Type | Constraints | Default | Required |
|---|---|---|---|---|---|
| authority | inuit_data | string | | | X |
| ch_id | character | id | | | X |
| ch_ref | char | idref | | | X |
| chst_id | character_state | id | | | X |
| chst_ref | ch_st | idref | | | X |
| desc_id | description | id | | | X |
| difficulty | character_state | enumeration | difficult, medium, easy | | X |
| lang | text | string | | | |
| master | inuit_data | enumeration | N, Y | | X |
| master_url | inuit_data | string | | | |
| name | taxa | id | | | X |
| src | image | string | | | X |
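
Taken together, the two tables could be written out as DTD declarations along the following lines. This is a sketch reconstructed from the tables, not the project's actual DTD file, and it rests on some assumptions: "string" is mapped to CDATA, attributes without an X in the Required column are mapped to #IMPLIED, and elements with neither Text nor Elem. marked are declared EMPTY. The Python wrapper uses the standard library's expat parser, which reads the declarations but does not validate the document against them; the function name declared_attributes is illustrative.

```python
import xml.parsers.expat

# Sketch of a DTD derived from the tables, embedded as an internal subset.
DOCUMENT = """\
<!DOCTYPE xml [
  <!ELEMENT xml (inuit_data?)>
  <!ELEMENT inuit_data (character*, character_state*, taxa*)>
  <!ELEMENT taxa (description+)>
  <!ELEMENT description (text+, ch_st+, image*)>
  <!ELEMENT character_state (char+, text+, image*)>
  <!ELEMENT character EMPTY>
  <!ELEMENT char EMPTY>
  <!ELEMENT ch_st EMPTY>
  <!ELEMENT image EMPTY>
  <!ELEMENT text (#PCDATA)>
  <!ATTLIST inuit_data authority  CDATA #REQUIRED
                       master     (Y|N) #REQUIRED
                       master_url CDATA #IMPLIED>
  <!ATTLIST taxa name ID #REQUIRED>
  <!ATTLIST description desc_id ID #REQUIRED>
  <!ATTLIST character_state chst_id    ID #REQUIRED
                            difficulty (difficult|medium|easy) #REQUIRED>
  <!ATTLIST character ch_id ID #REQUIRED>
  <!ATTLIST char ch_ref IDREF #REQUIRED>
  <!ATTLIST ch_st chst_ref IDREF #REQUIRED>
  <!ATTLIST text lang CDATA #IMPLIED>
  <!ATTLIST image src CDATA #REQUIRED>
]>
<xml/>
"""

def declared_attributes(document):
    """Collect {element: {attribute: (type, required)}} from the DTD."""
    declared = {}

    def on_attlist(elname, attname, att_type, default, required):
        declared.setdefault(elname, {})[attname] = (att_type, bool(required))

    parser = xml.parsers.expat.ParserCreate()
    parser.AttlistDeclHandler = on_attlist
    parser.Parse(document, True)
    return declared

attrs = declared_attributes(DOCUMENT)
for element in sorted(attrs):
    print(element, sorted(attrs[element]))
```

Note that all ID attributes share a single namespace within a document, so a taxa name may not repeat a ch_id, chst_id or desc_id value.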