Tech Area - XML Layout

 

New sample DTD for discussion

Posted 6th September 2000

 

This is the latest draft specification. The layout below describes the key objects we have identified and the properties they must have, may have and cannot have. In doing so we define the relationships between these objects and how they behave.

 

To see sample data in this layout click here

 

The DTD (Document Type Definition) sets the rules to which the XML data file must conform. To see the DTD for the current XML file specification, click here

 

Design Notes

The design of the XML file has been altered significantly from previous versions to reduce overall file size and to avoid duplicating information within the file. A schematic of the XML tree structure is shown below.

 

[Figure: schema.gif, schematic of the XML tree structure]

 

xml

This node is the document root and can contain zero or one inuit_data nodes.

inuit_data

This node is the root for the data in the file.

  • Required attributes
    • authority - the person responsible for the classification and descriptions in this file e.g. Mel Austen
    • master - this has a value restricted by the DTD to either "Y" or "N" and is a flag to tell us whether this file is the master XML data file or a slave file from another source/author
  • Optional attributes
    • master_url - this attribute tells us the location of the master XML document if this file is a slave e.g. "http://www.pml-nematode.org.uk/tech/inuitData.xml"
  • Contents
    • 0 or more character, character_state and taxa nodes
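
As an illustration, the outer shell of a slave file might look like the sketch below. The authority and master_url values are the examples quoted above; the choice of master="N" and the ordering comment (taken from the content model in the summary table) are our reading of the specification rather than a copy of the published sample data.

```xml
<?xml version="1.0"?>
<!-- Slave file: master="N", so master_url gives the location of the master document. -->
<xml>
  <inuit_data authority="Mel Austen"
              master="N"
              master_url="http://www.pml-nematode.org.uk/tech/inuitData.xml">
    <!-- Zero or more character nodes, then character_state nodes, then taxa nodes. -->
  </inuit_data>
</xml>
```

A master file would set master="Y" and could leave master_url out, since that attribute is optional.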

taxa

A taxa node (or object) represents a lifeform and contains all the details that pertain to it such as description(s), classification, image(s) etc.

  • Required attributes
    • name - this is a unique identifier for this taxa e.g. Acantholaimus. It is an ID attribute.
  • Optional attributes
    • None
  • Contents
    • 1 or more description nodes

character_state

A character state describes a type of character within a character group. An example of a character group could be tail and a character state could be long tail. A character state can be considered a recognisable characteristic or feature of a given taxa.

  • Required attributes
    • chst_id - this is a unique identifier for this character state e.g. amphid44. It is an ID attribute.
    • difficulty - this attribute has a value restricted to easy, difficult or medium in the DTD and is a measure of the ease of identifying this character state
  • Optional attributes
    • None
  • Contents
    • 1 or more char and text nodes
    • 0 or more image nodes

character

Each character state must contain a reference to a character object, which describes more generally what part of the animal it is. The reference is made using the ch_ref attribute of a char node, which is linked via the DTD to the ch_id attribute in this node.

  • Required attributes
    • ch_id - this is a unique identifier for this character e.g. amphid. It is an ID attribute.
  • Optional attributes
    • None
  • Contents
    • None

description

A description node holds information describing a taxa. In this DTD it is permitted to have alternative/multiple descriptions for a given taxa.

  • Required attributes
    • desc_id - this is a unique identifier for this description and is an ID attribute.
  • Optional attributes
    • None
  • Contents
    • 1 or more text and ch_st nodes
    • 0 or more image nodes

char

The char node tells us which character group a character state belongs to, e.g. tail or mouth.

  • Required attributes
    • ch_ref - this attribute tells us what character group this character state belongs to. It is an IDREF attribute which refers to the ch_id attribute in a character node.
  • Optional attributes
    • None
  • Contents
    • None
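
The link between character, character_state and char can be sketched as follows. The identifiers amphid and amphid44 are the examples used above. The difficulty value, the description text and the image URL (on example.org) are placeholders chosen for illustration only. The text and image nodes are described further down.

```xml
<?xml version="1.0"?>
<xml>
  <inuit_data authority="Mel Austen" master="Y">
    <!-- The character group. Its ch_id is the ID that char nodes point at. -->
    <character ch_id="amphid"/>

    <!-- One state within that group. -->
    <character_state chst_id="amphid44" difficulty="medium">
      <!-- ch_ref must match the ch_id of a character node in this file. -->
      <char ch_ref="amphid"/>
      <text>Placeholder description of this character state.</text>
      <!-- Hypothetical image location, not a real resource. -->
      <image src="http://example.org/images/amphid44.gif"/>
    </character_state>
  </inuit_data>
</xml>
```

Because ch_id is an ID and ch_ref is an IDREF, a validating parser should reject the file if ch_ref names a value that no ID attribute in the document carries.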

ch_st

The ch_st node tells us which character states make up the current description.

  • Required attributes
    • chst_ref - this attribute tells us which character state is being referenced. It is an IDREF attribute which references the chst_id attribute of a character_state node.
  • Optional attributes
    • None
  • Contents
    • None
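
To show how a description is assembled from character states, here is a small sketch of a taxa node. Acantholaimus, amphid and amphid44 are the examples used above. The desc_id value, the difficulty value, the lang value "en" and all text content are assumptions made for illustration, since the specification only says that lang is a string.

```xml
<?xml version="1.0"?>
<xml>
  <inuit_data authority="Mel Austen" master="Y">
    <character ch_id="amphid"/>

    <character_state chst_id="amphid44" difficulty="medium">
      <char ch_ref="amphid"/>
      <text>Placeholder description of this character state.</text>
    </character_state>

    <taxa name="Acantholaimus">
      <!-- desc_id is a hypothetical value. It must be unique among all IDs in the file. -->
      <description desc_id="Acantholaimus_desc1">
        <text lang="en">Placeholder description of this taxa.</text>
        <!-- chst_ref points back at the character_state declared above. -->
        <ch_st chst_ref="amphid44"/>
      </description>
    </taxa>
  </inuit_data>
</xml>
```

The character state is written once and only referenced from the description, which is how this layout avoids repeating the same information for every taxa that shares a feature.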

text

The text node holds a text description for a taxa or character_state.

  • Required attributes
    • None
  • Optional attributes
    • lang - The language the description is written in
  • Contents
    • A text description

image

The image node holds the source URL for an image of a taxa or character state.

 

  • Required attributes
    • src - A source URL for the image (text)
  • Optional attributes
    • None
  • Contents
    • None

Below is a summary in tabular form of the information above.

Element Types Table

Element Type Extends Text Elem. Content Model Attributes
ch_st chst_ref
char ch_ref
character ch_id
character_state X (char+ , text+ , image* ) chst_id, difficulty
description X (text+ , ch_st+ , image* ) desc_id
image src
inuit_data X (character* , character_state* , taxa* ) authority, master, master_url
taxa X (description+ ) name
text X lang
xml X (inuit_data? )


Attribute Types Table

Attribute Name   Element           Data Type     Constraints                 Default   Required
authority        inuit_data        string                                              X
ch_id            character         id                                                  X
ch_ref           char              idref                                               X
chst_id          character_state   id                                                  X
chst_ref         ch_st             idref                                               X
desc_id          description       id                                                  X
difficulty       character_state   enumeration   difficult | medium | easy             X
lang             text              string
master           inuit_data        enumeration   N | Y                                 X
master_url       inuit_data        string
name             taxa              id                                                  X
src              image             string                                              X
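
The two tables can be read back into DTD declarations. The document below is only a sketch reconstructed from them, not the published DTD. It assumes that "string" maps to CDATA, that elements with no content model are declared EMPTY, and that text holds character data (#PCDATA). The sample data after the declarations uses the example names from this page, with a hypothetical desc_id and placeholder text. XML 1.0 reserves names beginning with "xml", so some tools may warn about the root element name.

```xml
<?xml version="1.0"?>
<!DOCTYPE xml [
  <!-- Document root: zero or one inuit_data node. -->
  <!ELEMENT xml (inuit_data?)>

  <!ELEMENT inuit_data (character*, character_state*, taxa*)>
  <!ATTLIST inuit_data
    authority  CDATA   #REQUIRED
    master     (N | Y) #REQUIRED
    master_url CDATA   #IMPLIED>

  <!-- Assumed EMPTY: the tables list no content for this element. -->
  <!ELEMENT character EMPTY>
  <!ATTLIST character ch_id ID #REQUIRED>

  <!ELEMENT character_state (char+, text+, image*)>
  <!ATTLIST character_state
    chst_id    ID                          #REQUIRED
    difficulty (difficult | medium | easy) #REQUIRED>

  <!ELEMENT char EMPTY>
  <!ATTLIST char ch_ref IDREF #REQUIRED>

  <!ELEMENT taxa (description+)>
  <!ATTLIST taxa name ID #REQUIRED>

  <!ELEMENT description (text+, ch_st+, image*)>
  <!ATTLIST description desc_id ID #REQUIRED>

  <!ELEMENT ch_st EMPTY>
  <!ATTLIST ch_st chst_ref IDREF #REQUIRED>

  <!-- Assumed to hold plain character data. -->
  <!ELEMENT text (#PCDATA)>
  <!ATTLIST text lang CDATA #IMPLIED>

  <!ELEMENT image EMPTY>
  <!ATTLIST image src CDATA #REQUIRED>
]>
<xml>
  <inuit_data authority="Mel Austen" master="Y">
    <character ch_id="amphid"/>
    <character_state chst_id="amphid44" difficulty="medium">
      <char ch_ref="amphid"/>
      <text>Placeholder description of this character state.</text>
    </character_state>
    <taxa name="Acantholaimus">
      <description desc_id="Acantholaimus_desc1">
        <text>Placeholder description of this taxa.</text>
        <ch_st chst_ref="amphid44"/>
      </description>
    </taxa>
  </inuit_data>
</xml>
```

Note that an IDREF only has to match some ID in the document. The DTD cannot itself require that ch_ref points at a ch_id rather than, say, a desc_id, so that convention has to be kept by whoever writes the data.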