Motivating the Semantic Web - PowerPoint PPT Presentation

1
MARCXTM: Topic Maps Modeling of MARC
Bibliographic Information
2005.10.07
Hyun-Sil Lee, Yang-Seung Jeon, Sung-Kook Han
Semantic Web Services Research Group
Won Kwang University, Korea
2
Agenda
3
Overview MARC
  • MARC: Machine-Readable Cataloging
  • standards used for the representation of
    bibliographic and related information for books
    and other library materials in machine-readable
    form and their communication to and from other
    computers.
  • All MARC standards conform to ISO 2709:1996,
    Information and documentation - Format for
    Information Exchange.
  • MARC was originally designed in the late 1960s
    to aid in the transfer of bibliographic data onto
    magnetic tape, and also to replace the printed
    catalog cards with electronic forms.
  • There are a number of implementations of MARC,
    including USMARC used in the US, CAN/MARC used in
    Canada, and UKMARC used in Britain.
  • After discussions and minor changes to USMARC and
    CAN/MARC, MARC21 emerged to harmonize both
    formats and to cover diverse types of resources,
    including digital materials and Internet
    resources.
  • MARC accommodates extensive data elements
    describing all forms of materials susceptible to
    bibliographic description, as well as related
    information.

4
Family of MARC Formats
  • Bibliographic
  • a carrier for bibliographic information about
    printed and manuscript textual materials,
    computer files, maps, music, serials, visual
    materials and mixed materials.
  • Authorities
  • a carrier for information concerning the
    authorized forms of names, titles, subjects, and
    subject subdivisions to be used in constructing
    access points in MARC records, the forms of these
    names, subjects and subdivisions that should be
    used as references to the authorized form, and
    the relationships among these forms
  • Holdings
  • a carrier for holdings information for three
    types of bibliographic items: single-part,
    multipart, and serial. It may include
    copy-specific information, information peculiar
    to the holding institution, information needed
    for local processing, maintenance, or
    preservation, and version information.
  • Classification
  • a carrier for information about classification
    numbers and the captions associated with them
    that are formulated according to a specified
    authoritative classification scheme
  • Community Information
  • a carrier for descriptions of non-bibliographic
    resources that fulfil the information needs of a
    community.

5
Supporting Documentation of MARC
  • MARC 21 Specification for Record Structure,
    Character Sets, and Exchange Media
  • Character sets
  • MARC-8 (8-bit encoding)
  • UCS/UNICODE UTF-8 (8/16 bit encoding)
  • Repertoire of 15,000 characters
  • Latin, Cyrillic, Hebrew, Arabic, CJK
  • Code lists
  • Countries, Geographic Areas, Languages, Sources,
    Relators

6
MARC Record Format
  • Leader
  • the first 24 characters of the record, defining
    parameters for processing the record
  • data elements that contain coded values and are
    identified by relative character position
  • Directory
  • directory entries that contain the tag used in
    variable fields, the starting location, and the
    length of each field within the record
  • constructed by computer from the bibliographic
    record, and can be reconstructed in the same way
    if any of the cataloging information is altered
  • Variable Field: Variable Control Field
  • 00X fields in the MARC 21 formats are variable
    control fields
  • either a single data element or a series of
    fixed-length data elements identified by relative
    character position
  • Variable Field: Variable Data Field
  • Indicators: the first two characters, which
    interpret or supplement the data found in the
    field
  • Subfield codes: two characters that precede each
    data element within a field that requires
    separate manipulation
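
The leader, directory, and variable-field layout described on this slide can be sketched in Python. This is a minimal illustration, not a full MARC 21 parser: it assumes ASCII data (so character counts equal byte counts), the standard ISO 2709 control characters, and a made-up sample record. The helper names build_record and parse_record are hypothetical.

```python
FT = "\x1e"  # field terminator
RT = "\x1d"  # record terminator
SF = "\x1f"  # subfield delimiter (first of the two subfield-code characters)


def build_record(fields):
    """Assemble a minimal ISO 2709 record from (tag, data) pairs."""
    directory, body = "", ""
    for tag, data in fields:
        data += FT
        # Directory entry: 3-char tag, 4-digit length, 5-digit start position.
        directory += f"{tag}{len(data):04d}{len(body):05d}"
        body += data
    directory += FT
    base = 24 + len(directory)       # base address of data
    total = base + len(body) + 1     # +1 for the record terminator
    leader = f"{total:05d}nam  22{base:05d} a 4500"
    return leader + directory + body + RT


def parse_record(raw):
    """Split a record into its leader and a list of variable fields."""
    leader = raw[:24]
    base = int(leader[12:17])
    directory = raw[24:base - 1]     # drop the terminator that ends the directory
    fields = []
    for i in range(0, len(directory), 12):
        entry = directory[i:i + 12]
        tag, length, start = entry[:3], int(entry[3:7]), int(entry[7:12])
        data = raw[base + start:base + start + length - 1]  # strip terminator
        if tag.startswith("00"):     # 00X: variable control field
            fields.append((tag, data))
        else:                        # variable data field
            ind1, ind2 = data[0], data[1]
            subfields = [(s[0], s[1:]) for s in data[2:].split(SF)[1:]]
            fields.append((tag, ind1, ind2, subfields))
    return leader, fields


# Made-up sample data, for illustration only.
sample = build_record([
    ("001", "12345"),
    ("245", "10" + SF + "aExample title /" + SF + "cA. Author."),
])
leader, fields = parse_record(sample)
```

Because the directory is derived entirely from the field data, build_record recomputes it on every call, which mirrors the slide's point that the directory can be reconstructed whenever the cataloging information changes.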
7
MARC Record Format Example
8
MARC Record Format Example
Sign Post
9
Formalization of MARC
<MARC21Record> ::= <Leader> <Directory> <VariableField>
<Directory> ::= <DirectoryElement>
<DirectoryElement> ::= <Tag> <Length> <Position>
<VariableField> ::= <ControlField> <DataField>
<ControlField> ::= <ControlNumber> <ControlFieldElement>
<DataField> ::= <Tag> <Indicator> <SubField>
<Indicator> ::= <FirstIndicator> <SecondIndicator>
<SubField> ::= <SubFieldCode> <SubFieldValue>
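
One way to read this formalization is as a set of record types. The sketch below mirrors the nonterminals with Python dataclasses; the class and attribute names are illustrative choices, the control field is simplified to a tag and a value, and sizes are counted in characters on the assumption of single-byte data.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SubField:
    code: str
    value: str


@dataclass
class ControlField:
    tag: str
    value: str

    def size(self) -> int:
        return len(self.value) + 1  # data plus field terminator


@dataclass
class DataField:
    tag: str
    first_indicator: str
    second_indicator: str
    subfields: List[SubField] = field(default_factory=list)

    def size(self) -> int:
        # Two indicators, then (delimiter + code + value) per subfield,
        # then the field terminator.
        return 2 + sum(2 + len(s.value) for s in self.subfields) + 1


@dataclass
class DirectoryElement:
    tag: str
    length: int
    position: int


@dataclass
class MARC21Record:
    leader: str
    variable_fields: list = field(default_factory=list)

    def directory(self) -> List[DirectoryElement]:
        """Derive the directory from the variable fields."""
        entries, position = [], 0
        for f in self.variable_fields:
            entries.append(DirectoryElement(f.tag, f.size(), position))
            position += f.size()
        return entries


# Made-up record, for illustration only.
rec = MARC21Record(
    leader=" " * 24,
    variable_fields=[
        ControlField("001", "12345"),
        DataField("245", "1", "0", [SubField("a", "Title")]),
    ],
)
entries = rec.directory()
```

Keeping the directory as a computed method rather than stored state avoids the two getting out of sync when a field is edited.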
10
Problems with MARC
  • Lack of expandability due to rigid record
    formats, since it was originally intended for the
    production of printed catalogue cards in the 1960s
  • Difficulties in representing bibliographic
    relationships
  • Ambiguities in describing MARC records
  • Incompatibilities among MARC formats, since the
    various library systems have invented their own
    non-standard peculiarities in order to handle
    local bibliographic materials
  • Weaknesses in describing bibliographic attributes
    of digitized resources

11
MARCXML
MARC21 (2709) Records
MARC21 (XML) Records
Tagging Transformations
Character Set Conversion
Dublin Core Records
MODS Records
Other XML Formats
HTML Output
MARC Validation
12
MARCXML
  • MARCXML: a framework for working with MARC data
    in an XML environment
  • Design Considerations and Features
  • Simple and Flexible MARC XML Schema for
    representing a complete MARC record in XML
  • Supports all MARC encoded data regardless of
    format
  • Lossless Conversion of MARC to XML
  • Roundtrip ability from XML back to MARC
  • Data Presentation and Data Conversion
  • Extensibility
  • A component-oriented, extensible architecture
    allowing users to plug and play different
    software pieces to build custom solutions
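
The lossless conversion and roundtrip properties can be sketched with Python's standard xml.etree.ElementTree. The element and attribute names (record, leader, controlfield, datafield, subfield, tag, ind1, ind2, code) and the namespace follow the MARC21 slim schema; the function names and the sample values are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

NS = "http://www.loc.gov/MARC21/slim"


def q(name):
    """Qualify an element name with the MARCXML namespace."""
    return f"{{{NS}}}{name}"


def to_marcxml(leader, control_fields, data_fields):
    record = ET.Element(q("record"))
    ET.SubElement(record, q("leader")).text = leader
    for tag, value in control_fields:
        ET.SubElement(record, q("controlfield"), tag=tag).text = value
    for tag, ind1, ind2, subfields in data_fields:
        df = ET.SubElement(record, q("datafield"), tag=tag, ind1=ind1, ind2=ind2)
        for code, value in subfields:
            ET.SubElement(df, q("subfield"), code=code).text = value
    return record


def from_marcxml(record):
    leader = record.findtext(q("leader"))
    control = [(e.get("tag"), e.text) for e in record.findall(q("controlfield"))]
    data = [
        (e.get("tag"), e.get("ind1"), e.get("ind2"),
         [(s.get("code"), s.text) for s in e.findall(q("subfield"))])
        for e in record.findall(q("datafield"))
    ]
    return leader, control, data


# Made-up sample values, for illustration only.
leader = "00000nam  2200000 a 4500"
control_fields = [("001", "12345")]
data_fields = [("245", "1", "0", [("a", "Example title /"), ("c", "A. Author.")])]

xml_record = to_marcxml(leader, control_fields, data_fields)
serialized = ET.tostring(xml_record)                 # MARC -> XML
roundtrip = from_marcxml(ET.fromstring(serialized))  # XML -> MARC
```

Note that the numeric tags, indicators, and subfield codes survive only as attribute values; the XML structure itself carries no bibliographic semantics, which is the point the later modeling slides pick up on.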

13
MARCXML Example
14
MODS
  • MODS: Metadata Object Description Schema
  • XML-based descriptive metadata standard that
    includes a subset of data elements derived from
    MARC21
  • Features
  • MODS is intended to complement other metadata
    formats. MODS provides a richer bibliographic
    element set than Dublin Core.
  • MODS has a high level of compatibility with MARC
    records because it inherits the semantics of the
    equivalent data elements in the MARC21
    bibliographic format.
  • In MODS, some elements that appear in various
    fields in MARC have been repackaged into one. As
    a result, MODS defines 19 top-level metadata
    elements.
  • MODS takes advantage of the XML environment. It
    uses language-based tags rather than the numeric
    tags traditional to MARC.
  • MODS also has flexible linking mechanisms by
    providing for all the top-level elements with
    attributes such as xlink and ID.
  • MODS accommodates special requirements for
    digital resources.
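
The move from numeric tags to language-based tags can be illustrated with a small mapping table. The three mappings below (245 $a to titleInfo/title, 100 $a to name/namePart, 260 $b to originInfo/publisher) are a tiny hand-picked subset chosen for this sketch; the MODS namespace and attributes are omitted for brevity, and the function name to_mods is hypothetical.

```python
import xml.etree.ElementTree as ET

# (MARC tag, subfield code) -> (MODS top-level element, child element)
MODS_PATHS = {
    ("245", "a"): ("titleInfo", "title"),
    ("100", "a"): ("name", "namePart"),
    ("260", "b"): ("originInfo", "publisher"),
}


def to_mods(data_fields):
    """Map the subfields we know about to MODS-style elements."""
    mods = ET.Element("mods")
    for tag, subfields in data_fields:
        for code, value in subfields:
            path = MODS_PATHS.get((tag, code))
            if path is None:
                continue  # subfields outside the mapped subset are dropped
            wrapper = ET.SubElement(mods, path[0])
            ET.SubElement(wrapper, path[1]).text = value
    return mods


# Made-up sample values, for illustration only.
mods = to_mods([
    ("245", [("a", "Example title"), ("c", "A. Author")]),
    ("260", [("b", "Example Press")]),
])
```

Because only a subset of MARC data elements has a target, the conversion in this sketch is lossy: the unmapped 245 $c value does not appear in the output.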

15
MODS Example
16
Topic Maps Modeling of MARC 21
  • Requirements for MARC Modeling
  • A model should be able to support the full set of
    data elements in MARC21 to achieve seamless
    compatibility with MARC formats.
  • This is a practical requirement in order to
    embrace the current circumstances even though it
    is awkward.
  • It should have the same expressive power as
    metadata.
  • This implies that the model should be realized
    with semantic descriptors to be used in an XML
    environment instead of obsolete alphanumeric
    codes.
  • The use of attributes should be minimized to
    maintain consistency and increase readability.
  • It should be able to maintain the structure of
    the MARC record format.
  • The model is not intended to develop a
    bibliographic metadata system based on MARC.
  • A model can be handled without expertise in MARC
    to achieve the usability of the model.
  • A model should be simple and lightweight for
    system implementation and harmonization with
    other models.

17
UML diagram of MARC Modeling
18
MARCXTM Implementation
19
XTM Realization of MARC Specification
  • DataField: <association> of data item,
    indicators, and subfield codes

20
XTM Realization of MARC Specification
  • Hiding the real data value by topic abstraction

<topic id="TypeOfPersonalNameEntryElement">
  <baseName>
    <baseNameString>Type of personal name entry element</baseNameString>
  </baseName>
  <occurrence>
    <instanceOf><topicRef xlink:href="Forename"/></instanceOf>
    <resourceData>0</resourceData>
  </occurrence>
  <occurrence>
    <instanceOf><topicRef xlink:href="Surname"/></instanceOf>
    <resourceData>1</resourceData>
  </occurrence>
  <occurrence>
    <instanceOf><topicRef xlink:href="FamilyName"/></instanceOf>
    <resourceData>3</resourceData>
  </occurrence>
</topic>
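
To show how such a topic hides the coded value behind a named concept, the sketch below reads the topic with Python's standard xml.etree.ElementTree and builds a lookup from indicator value to topic name. It assumes the xlink prefix is bound to the standard XLink namespace and omits the XTM namespace for brevity; the function name indicator_meanings is hypothetical.

```python
import xml.etree.ElementTree as ET

XLINK = "http://www.w3.org/1999/xlink"

# The topic from the slide, with the xlink prefix declared so it parses alone.
XTM_TOPIC = f"""
<topic id="TypeOfPersonalNameEntryElement" xmlns:xlink="{XLINK}">
  <baseName>
    <baseNameString>Type of personal name entry element</baseNameString>
  </baseName>
  <occurrence>
    <instanceOf><topicRef xlink:href="Forename"/></instanceOf>
    <resourceData>0</resourceData>
  </occurrence>
  <occurrence>
    <instanceOf><topicRef xlink:href="Surname"/></instanceOf>
    <resourceData>1</resourceData>
  </occurrence>
  <occurrence>
    <instanceOf><topicRef xlink:href="FamilyName"/></instanceOf>
    <resourceData>3</resourceData>
  </occurrence>
</topic>
"""


def indicator_meanings(topic_xml):
    """Map each coded value to the topic that names its meaning."""
    topic = ET.fromstring(topic_xml.strip())
    table = {}
    for occ in topic.findall("occurrence"):
        ref = occ.find("instanceOf/topicRef").get(f"{{{XLINK}}}href")
        value = occ.findtext("resourceData").strip()
        table[value] = ref
    return table


meanings = indicator_meanings(XTM_TOPIC)
label = meanings["1"]  # the meaning of first-indicator value 1
```

A consumer of the topic map can then work with names such as Surname without knowing the underlying MARC indicator codes.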
21
MARCXTM for MARC Specification
22
XTM Realization of MARC Records
  • MARC Records
  • Complex to maintain the MARC structure due to its
    idiosyncratic dependency between indicators and
    subfield codes
  • Difficult to achieve seamless compatibility with
    MARC records
  • Repeatability of subfield elements is
    individually defined in the MARC specification.
  • XTM support for MARC modeling
  • XTM does not provide multiple instances for
    <occurrence>.
  • Difficult to define a record schema with
    <association>.

23
XTM Realization of MARC Records
24
MARCXTM for MARC Records
25
Conclusions
  • MARCXTM: Topic Maps-based implementation of MARC
    21
  • MARCXTM for MARC Specification
  • MARCXTM for MARC Records
  • Application of Topic Maps paradigm to
    bibliographic information system
  • Seamless compatibility with MARC 21
  • Expressive power as metadata
  • XTM is inappropriate to represent MARC format due
    to its idiosyncratic structure and dependency
    between data elements.
  • Metadata models similar to Dublin Core or MODS
    are necessary for XTM modeling of MARC.
  • FRBR (Functional Requirements for Bibliographic
    Records) framework is an attractive model for XTM
    modeling of bibliographic information system.

26
Thank you!!!