Title: Construction and Pedagogical Use of Digital Archives
1. Construction and Pedagogical Use of Digital Archives
One: Introduction to XML
<?xml version="1.0"?>
<?xml-stylesheet href="jonson.xsl" type="text/xsl"?>
<!DOCTYPE TEI.2 PUBLIC "-//TEI P4//DTD Main DTD Driver File//EN"
  "http://www.tei-c.org/Guidelines/DTD/tei2.dtd" [
<!ENTITY % TEI.XML 'INCLUDE'>
<!ENTITY % TEI.mixed 'INCLUDE'>
<!ENTITY % TEI.drama 'INCLUDE'>
<!ENTITY % TEI.verse 'INCLUDE'>
<!ENTITY % TEI.prose 'INCLUDE'>
<!ENTITY % TEI.figures 'INCLUDE'>
<!ENTITY % TEI.linking 'INCLUDE'>
<!ENTITY % TEI.transcr 'INCLUDE'>
<!ENTITY % TEI.analysis 'INCLUDE'>
<!ENTITY % TEI.textcrit 'INCLUDE'>
<!ENTITY % ISOlat1 SYSTEM 'http://www.tei-c.org/Entity_Sets/Unicode/iso-lat1.ent'>
%ISOlat1;
<!ENTITY % ISOlat2 SYSTEM 'http://www.tei-c.org/Entity_Sets/Unicode/iso-lat2.ent'>
%ISOlat2;
<!ENTITY % ISOnum SYSTEM 'http://www.tei-c.org/Entity_Sets/Unicode/iso-num.ent'>
%ISOnum;
<!ENTITY % ISOpub SYSTEM 'http://www.tei-c.org/Entity_Sets/Unicode/iso-pub.ent'>
%ISOpub;
<!NOTATION jpg SYSTEM "IMAGES/JPEG">
<!ENTITY fig130-001-1 SYSTEM "fig130-001-1.jpg" NDATA jpg>
]>
- Washington University
- 25 May 2006
2. What is XML?
- A meta-language, i.e. a language for creating languages
- A platform- and application-independent protocol for marking up structured documents
- A descriptive mark-up system, i.e. one that describes and categorizes parts of a document
- An extensible strategy that allows designers to scale the scope of mark-up to match the needs of a project
3. Basic XML Components
- Document, or instance
- DTD, defining the grammar and syntax of the tagging scheme
- XSLT files for transforming documents
- Associated external resources such as entities or CSS files
- Associated tools such as an XML parser
- Additional protocols such as XSL-FO, XLink, XPointer, and Schemas
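As a minimal sketch of how two of these components fit together, the Python snippet below hands a small document instance to an XML parser and walks the resulting tree. The element names (archive, play, title, act) and the content are invented for illustration, and Python's standard-library xml.etree.ElementTree stands in for whichever parser a project actually uses. It checks well-formedness only and does not validate against a DTD.

```python
import xml.etree.ElementTree as ET

# A hypothetical document instance; the tag names are illustrative only.
INSTANCE = """<?xml version="1.0"?>
<archive>
  <play id="p1">
    <title>Volpone</title>
    <act n="1"/>
  </play>
</archive>"""


def outline(element, depth=0):
    """Return (depth, tag) pairs for an element and all of its descendants."""
    rows = [(depth, element.tag)]
    for child in element:
        rows.extend(outline(child, depth + 1))
    return rows


root = ET.fromstring(INSTANCE)
for depth, tag in outline(root):
    # Indent each tag name by its depth to show the hierarchy.
    print("  " * depth + tag)
```

The parser is what turns the flat text of the instance into a hierarchy that other tools, such as an XSLT processor, can then work on.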
4. Basic Rules of XML
- The XML declaration must begin the document
- Every opening <tag> must have an accompanying closing </tag>
- All elements must be nested hierarchically
- Empty tags must end with />, for example, <tag/>
- The document must contain exactly one root element that completely contains all other elements
- All attribute values must be within quotes
- The characters "<" and "&" are reserved and must be used only to begin tags and entity references respectively
- The only native XML entity references are &amp;, &lt;, &gt;, &apos;, and &quot;
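One way to see these rules in action is to feed deliberately broken snippets to a parser. The sketch below assumes Python's standard-library xml.etree.ElementTree as the parser; the helper name is_well_formed and the snippets themselves are invented for illustration. Only the first snippet obeys every rule.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical snippets, one per rule; only the first obeys all of them.
SNIPPETS = {
    "well-formed": '<doc><p n="1">Tom &amp; Jerry</p><br/></doc>',
    "declaration not first": ' <?xml version="1.0"?><doc/>',
    "missing closing tag": "<doc><p>text</doc>",
    "improper nesting": "<doc><a><b></a></b></doc>",
    "two root elements": "<doc/><doc/>",
    "unquoted attribute": "<doc n=1/>",
    "bare ampersand": "<doc>Tom & Jerry</doc>",
    "undeclared entity": "<doc>&nbsp;</doc>",
}

for label, text in SNIPPETS.items():
    print(f"{label}: {is_well_formed(text)}")
```

The last snippet fails because &nbsp; is an HTML entity, not one of the five that XML provides natively; it would have to be declared in a DTD before use.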
5. XML Good Practice
- Do not include whitespace in tag names
- Do not include reserved XML characters or characters that have special meaning in processing languages like Perl
- Do not start a tag name with a number or a punctuation character
- Tag names are case-sensitive: <tag> is different from <TAG>
- Take care when using characters beyond the core 7-bit ASCII set, for example ë or ß
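Two of these points can be checked directly with a parser. The sketch below, again assuming Python's standard-library xml.etree.ElementTree, shows that a closing tag in a different case is rejected, and illustrates one possible way of handling non-ASCII characters: serializing to plain ASCII so that they are written as numeric character references. The element name word and its content are invented for the example.

```python
import xml.etree.ElementTree as ET

# Case sensitivity: <tag> cannot be closed by </TAG>.
try:
    ET.fromstring("<tag>text</TAG>")
    case_mismatch_accepted = True
except ET.ParseError:
    case_mismatch_accepted = False

# Non-ASCII characters: serializing to plain ASCII replaces them with
# numeric character references such as &#223; for the letter ß.
word = ET.Element("word")  # hypothetical element name
word.text = "Straße"
ascii_bytes = ET.tostring(word, encoding="us-ascii")

# Parsing the ASCII form restores the original character.
round_trip = ET.fromstring(ascii_bytes).text

print(case_mismatch_accepted)
print(ascii_bytes)
print(round_trip == word.text)
```

Numeric character references are only one option; declaring an encoding such as UTF-8 in the XML declaration and saving the file in that encoding is another.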
6. Sample XML Document
<?xml version="1.0"?>
<?xml-stylesheet href="potato.xsl" type="text/xsl"?>
<!DOCTYPE russett SYSTEM "Idaho.dtd" [
<!ENTITY % spud.xml 'INCLUDE'>
<!ENTITY chips "Ruffles">
<!NOTATION jpg SYSTEM "images/spuds">
<!ENTITY crispy SYSTEM "smashed.jpg" NDATA jpg>
]>
<russett>
  <title>My Favorite Tuber Kings</title>
  <snack>
    <basic n="1" type="breakfast">Hash Browns</basic>
    <basic n="2" type="lunch">&chips;</basic>
    <basic n="3" type="dinner">Mashed (see
      <pic ref="smashed.jpg"/>)
      <!-- Oops, I just started the Atkins Diet...forget all this -->
    </basic>
  </snack>
</russett>
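To show what a parser makes of the sample, the sketch below loads a trimmed copy of it with Python's standard-library xml.etree.ElementTree and lists the three basic elements. Several assumptions are involved: the reference to Idaho.dtd and the image declarations are left out because those files are not available here, the word "chips" in the lunch line is read as a reference to the chips entity, and pic is treated as an empty element.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample: the reference to Idaho.dtd and the image
# declarations are omitted because those files are not available here.
# Assumption: the lunch entry refers to the "chips" entity.
SAMPLE = """<?xml version="1.0"?>
<!DOCTYPE russett [
<!ENTITY chips "Ruffles">
]>
<russett>
  <title>My Favorite Tuber Kings</title>
  <snack>
    <basic n="1" type="breakfast">Hash Browns</basic>
    <basic n="2" type="lunch">&chips;</basic>
    <basic n="3" type="dinner">Mashed (see
      <pic ref="smashed.jpg"/>)
      <!-- Oops, I just started the Atkins Diet...forget all this -->
    </basic>
  </snack>
</russett>"""

root = ET.fromstring(SAMPLE)

meals = []
for basic in root.iter("basic"):
    # Join all text inside the element and collapse runs of whitespace.
    text = " ".join("".join(basic.itertext()).split())
    meals.append((basic.get("n"), basic.get("type"), text))

for n, kind, text in meals:
    print(n, kind, text)
```

The parser expands the entity reference to its declared value and, with its default settings, discards the comment, so neither appears as such in the extracted text.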