Chapter 9 Designing Metadata - PowerPoint PPT Presentation

Slides: 53
Provided by: ccNct

1
Chapter 9 Designing Metadata
2
Metadata Definition
  • Data about data, information about information
  • Metadata is a structured description of a data
    object
  • Metadata encodes all physical data (contained in
    software and other media) and knowledge-containing
    information (contained in employees and various
    media) from inside and outside an organization,
    including information about physical data,
    business and technical processes, rules and
    constraints of the data, and structures of the
    data, used by a corporation
  • Metadata can be used to describe the data's
    behavior, processes, rules, and structure

3
Metadata and UCS
  • In UCS, metadata can facilitate content search
    and retrieval, reuse, and dynamic content
    delivery, because you can determine not only what
    content is, but who uses it, how it will be used,
    how it will be delivered and when
  • In UCS, metadata enables:
  • Effective retrieval
  • Systematic reuse
  • Automatic routing based on workflow status
  • Tracking of status
  • Reporting

4
Benefits of Metadata to UCS
  • Reduction of redundant content
  • Authors can easily retrieve existing reusable
    content
  • CM can use metadata to identify multiple versions
    of the same content
  • Systematic reuse
  • Improved workflow
  • Metadata for status (ready for review, ready
    for publication)
  • Reduced costs

5
Metadata Examples
6
Card Catalog and Metadata
  • Card catalog identifies what books are in the
    library and where they are physically located
  • Can be searched by subject area, author, or title
    (resource discovery)
  • By showing the author, number of pages,
    publication date, and revision history of each
    book, card catalog helps you determine which
    books will satisfy your needs (resource
    evaluation)
  • Metadata does not need to be digital

7
Data Dictionary and Metadata
  • Metadata repository
  • A centralized repository of information about
    data, such as definitions, relationships, origin,
    domain, usage, and format
  • RDBMS schema

8
Bibliographic Metadata
  • MARC
  • A comprehensive, well-developed, carefully
    controlled scheme intended to be generated by
    professional catalogers for use in libraries
  • Dublin Core
  • An intentionally minimalist standard intended to
    be applied to a wide range of DL materials by
    people who are not trained in library cataloging
  • Refer

9
Metadata in Traditional Library
  • Metadata refers to cataloging or indexing
    information that libraries create to arrange,
    describe, and enhance access to an information
    object
  • Example
  • MARC (MAchine-Readable Cataloging format)
  • LCSH (Library of Congress Subject Headings)
  • Descriptive metadata describe the properties of
    an information object

10
MARC
  • Developed in the late 1960s at the Library of
    Congress
  • Promotes the sharing of catalog entries among
    libraries
  • A comprehensive and detailed standard whose use
    is carefully controlled
  • Information includes author, type of material,
    information about the physical material itself,
    publishers, some notes, and identifiers
  • Cataloging is governed by a detailed set of rules
    and guidelines called AACR2(R)
  • Internally MARC records are stored as a
    collection of tagged fields in a fairly complex
    format

11
Library Catalog Record
12
MARC Fields in the Record of the Previous Slide
13
Meaning of Some MARC Fields
14
Refer
  • Originally designed by computer scientists for
    use by mainly scientific and technical
    researchers
  • Used in some bibliographic tools like EndNote
  • Format
  • Formatted line by line, and records are separated
    with a blank line
  • Each line starts with a key character, introduced
    by a percent symbol, that signals the kind of
    information the line contains
  • The rest of the line contains the data itself
  • No provision for type of bibliographic record
    (journal article) in the original Refer
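
The line-by-line layout described above is simple enough to parse in a few lines. The following is a minimal sketch, not the behavior of any particular tool: the function name `parse_refer` is hypothetical, the sample records are made up for illustration, and it assumes each field fits on one line. The key characters used (%A author, %T title, %D date) are standard Refer keys.

```python
def parse_refer(text):
    """Parse Refer-format text into a list of records.

    Each record is a dict mapping a key character (e.g. "A") to the list
    of values found for that key, since keys such as %A may repeat.
    """
    records = []
    current = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # A blank line separates one record from the next.
            if current:
                records.append(current)
                current = {}
            continue
        if line.startswith("%") and len(line) >= 2:
            key, value = line[1], line[2:].strip()
            current.setdefault(key, []).append(value)
    if current:
        records.append(current)
    return records


# Hypothetical sample data, invented for this sketch.
SAMPLE = """\
%A A. Author
%A B. Coauthor
%T An Example Title
%D 2001

%A C. Writer
%T Another Example
%D 1999
"""

records = parse_refer(SAMPLE)
print(len(records), records[0]["A"])
```

Storing each key's values as a list keeps repeated fields (several %A lines for several authors) without special cases.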

15
Bibliography Item in Refer Format
16
Basic Keywords Used by Refer
17
Types of Metadata
  • UCS requires two types of metadata:
  • Categorization metadata
  • Element metadata

18
Categorization Metadata
  • Information needs to be organized in a logical
    structure, categorized for effective retrieval
  • Use the categories to add metadata to the
    information
  • Metadata is like the old card catalog, presenting
    information to users in context and enabling them
    to quickly find relevant information
  • Metadata hierarchies or metadata taxonomies are
    used to organize the content

19
Metadata Hierarchies
  • Represented as tree structures
  • A hierarchy provides the content user with an
    understanding of how content is organized
  • Content may be organized under multiple
    categories (multiple access points)
  • Content users use hierarchies to retrieve content
    because hierarchies give them multiple paths to
    the same information

20
Metadata Taxonomies
  • Represented as tree structures
  • Content may be categorized in only one place
  • Taxonomies are used by authors to ensure that
    content is categorized in only one way
  • Categorizing content in multiple ways makes it
    difficult to retrieve

21
Categorization Metadata Creation
  • Industry-specific taxonomies (vertical taxonomies)
  • Industries may also create standards for the
    format, structure, and syntax of metadata to
    enable different organizations and even different
    departments within an organization to share
    metadata
  • Create the categorization metadata yourself
  • Include corporate librarians or information
    architects
  • You need to understand your users; ask the
    following questions:
  • Who is going to retrieve the content?
  • What tasks are they trying to accomplish with the
    content?
  • What terms will they use when retrieving the
    content?

22
Create the Categorization Metadata Yourself
  • Grouping or clustering related content
  • Company benefits
  • Benefit policies
  • Benefit forms
  • Benefit frequently asked questions
  • Company benefits (refined and simplified)
  • Policies
  • Forms
  • Frequently asked questions

23
Create the Categorization Metadata Yourself (Cont.)
  • Developing your taxonomy
  • As you group content, categorize it, and define
    the terms to be used to identify your content,
    you are automatically creating your taxonomy
  • Each term in your taxonomy becomes metadata
  • Testing your taxonomy to ensure that it is
    appropriate and comprehensive
  • Categorize some sample content and ask users
    (audiences) to perform a usability test

24
Categorization Metadata Standards
  • Dublin Core
  • RDF
  • XMP
  • Crosswalks

25
Element Metadata
  • Element metadata identifies your content at the
    element level, based on the elements defined in
    the information model
  • Authors use element metadata to help them manage
    content throughout the authoring process
  • Three main types of element metadata
  • Reuse metadata
  • Retrieval metadata
  • Tracking metadata

26
Metadata for Reuse
  • Identifies the components of content that can be
    reused in multiple areas
  • Example: content type = overview, product = ABC
  • Authors can search CM by metadata for reusable
    content
  • CM can automatically search for appropriate
    reusable content (based on models and metadata)
    and deliver it (systematic reuse) to authors

27
Design Metadata for Reuse
  • Need to determine the business result you are
    trying to achieve and build metadata backward to
    achieve the result. Think about the following:
  • Where is content going to be reused?
  • Across products? Across information products?
  • You need to create metadata to identify each
    "yes" (reuse)
  • Product: ABC, EFG, HIJ
  • Information product: Brochure, Web, Help, User
    guide
  • Metadata such as information product can be
    derived from the template type

28
Design Metadata for Reuse (Cont.)
  • What type of content is it?
  • The element content type for which the content is
    valid
  • Overview, caution, warning, troubleshooting,
    example
  • Metadata such as content type can be derived from
    your model or semantic tags
  • What else do you need to know about the content
    to ensure that the correct piece of content is
    reused?
  • Version: 1, 2, 2.5
  • Region: United States, Taiwan, Canada
  • Audience: consumer, decision maker, technical
    support
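
To make the reuse questions above concrete, here is a minimal sketch of searching elements by reuse metadata. The field names (`content_type`, `product`, and so on), the function names, and the three sample elements are assumptions for illustration; a real CM system would expose its own query interface.

```python
def matches(metadata, criteria):
    """Return True if the metadata satisfies every criterion.

    Multi-valued metadata (a list of products, for example) matches when
    it contains the wanted value; single values must be equal.
    """
    for key, wanted in criteria.items():
        value = metadata.get(key)
        if isinstance(value, (list, tuple, set, frozenset)):
            if wanted not in value:
                return False
        elif value != wanted:
            return False
    return True


def find_reusable(elements, **criteria):
    """Return the ids of all elements whose metadata matches the criteria."""
    return [element["id"] for element in elements if matches(element, criteria)]


# Hypothetical elements, tagged with the metadata values listed on the slides.
ELEMENTS = [
    {"id": "e1", "content_type": "overview", "product": ["ABC", "EFG"],
     "information_product": ["Brochure", "Web"], "version": "2",
     "region": "Canada", "audience": "consumer"},
    {"id": "e2", "content_type": "warning", "product": ["ABC"],
     "information_product": ["User guide"], "version": "2.5",
     "region": "Taiwan", "audience": "technical support"},
    {"id": "e3", "content_type": "overview", "product": ["HIJ"],
     "information_product": ["Help"], "version": "1",
     "region": "United States", "audience": "decision maker"},
]

hits = find_reusable(ELEMENTS, content_type="overview", product="ABC")
print(hits)
```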

29
Metadata for Retrieval
  • Help authors retrieve content
  • May include much or all of the metadata used for
    reuse
  • More extensive than metadata for reuse, providing
    additional information about an element that
    facilitates retrieval
  • Metadata for retrieval examples
  • Title/Subject
  • Author
  • Date (creation, completion, modification)
  • Keywords
  • Security level (who can view the content)

30
Design Metadata for Retrieval
  • Need to determine the business result you are
    trying to achieve and build metadata backward to
    achieve the result. Think about the following:
  • Who is going to retrieve your content?
  • Authors: fine granularity
  • Users: whole documents
  • In what form do they want to retrieve the
    content?
  • Authors: metadata that defines the source format
    and the desired format
  • Users: metadata that defines the appropriate
    format for the content

31
Design Metadata for Retrieval (Cont.)
  • What permissions should users have for retrieving
    content?
  • Each element, container, and information product
    needs to have appropriate security permissions
    expressed through metadata
  • How are they going to specifically identify the
    desired content?
  • Analyze the terms your authors and users will
    use, then determine what metadata the content
    should carry to enable a match between the search
    and the content
  • Consider adding keywords to metadata to
    facilitate retrieval

32
Metadata for Tracking (status)
  • Useful when you are implementing workflow in UCS
  • Determine which elements are active
  • Control what can be done to an element and who
    can do it
  • Automatically or manually change status metadata
  • Example
  • Content status: indicates the status of the content
  • Draft, Ready for review, In review, Final, In
    approval, Approved, Published
  • Review status: indicates the status of the
    reviewed content
  • Accept, Reject
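
Status metadata that changes automatically can be sketched as a small state machine. The slides list the status values but not the transitions, so the rules below are assumptions: statuses advance in the order listed, and a "Reject" review outcome returns content to "Draft". The function name `next_status` is hypothetical.

```python
# Content status values in the order given on the slide.
CONTENT_STATUSES = [
    "Draft", "Ready for review", "In review", "Final",
    "In approval", "Approved", "Published",
]


def next_status(current, review_outcome=None):
    """Return the status that follows `current`.

    Assumed rules (not stated in the source): a "Reject" outcome sends the
    content back to "Draft"; otherwise the status advances one step, and
    "Published" is terminal.
    """
    if current not in CONTENT_STATUSES:
        raise ValueError(f"unknown content status: {current!r}")
    if review_outcome not in (None, "Accept", "Reject"):
        raise ValueError(f"unknown review status: {review_outcome!r}")
    if review_outcome == "Reject":
        return "Draft"
    position = CONTENT_STATUSES.index(current)
    if position == len(CONTENT_STATUSES) - 1:
        return current
    return CONTENT_STATUSES[position + 1]


print(next_status("In review", "Accept"))
```

Design the real transitions after the workflow itself is designed; the list order here is only a stand-in.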

33
Design Metadata for Tracking
  • Need to determine the business result you are
    trying to achieve and build metadata backward to
    achieve the result
  • Design your metadata for tracking after you have
    designed your workflow
  • Identify other metadata that can help you to
    track content
  • Who created the content (author)?
  • When was it created/modified (date)?
  • Who modified the content (editor)?
  • Who reviewed/approved the content
    (reviewer/approver)?
  • How long does it take to create/modify/review
    (time)?
  • Where has it been reused (information product,
    product)?
  • Has it been translated (content status)?

34
Creating a Controlled Vocabulary
  • Metadata needs to be consistent to facilitate
    reuse, retrieval, and tracking, which requires a
    controlled vocabulary
  • A controlled vocabulary reconciles all the
    various possible words that can be used to
    identify content and to differentiate among all
    the possible meanings that can be attached to
    certain words
  • Using an unlimited or uncontrolled set of
    metadata terms leads to additional work for
    authors and reduces the percentage of content that
    can be effectively retrieved

35
To Create A Controlled Vocabulary
  • Identify your metadata categories (Content type,
    Product)
  • Identify the terms that make up that metadata
    category
  • Content status (metadata category)
  • Draft, Ready for Review, In review, Final, In
    approval, Approved, Published (controlled terms)
  • If possible, do not provide any metadata that can
    be defined by the author. If that is not
    possible, monitor the uncontrolled metadata terms
    to see whether patterns are emerging that could
    then be used to create a controlled vocabulary
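
The two ideas above, enforcing controlled terms and watching uncontrolled ones for emerging patterns, can be sketched together. The class name `Vocabulary`, its methods, and the category key `content_status` are hypothetical; the status terms are the ones listed on the slide.

```python
from collections import Counter


class Vocabulary:
    """Controlled terms per metadata category, plus a log of uncontrolled terms."""

    def __init__(self, controlled):
        self.controlled = {category: set(terms) for category, terms in controlled.items()}
        self.uncontrolled = Counter()

    def check(self, category, term):
        """Return True for a controlled term; otherwise record the term and return False."""
        if term in self.controlled.get(category, set()):
            return True
        self.uncontrolled[(category, term)] += 1
        return False

    def emerging(self, min_uses):
        """Return uncontrolled (category, term) pairs seen at least `min_uses` times."""
        return sorted(pair for pair, count in self.uncontrolled.items() if count >= min_uses)


vocabulary = Vocabulary({
    "content_status": ["Draft", "Ready for Review", "In review", "Final",
                       "In approval", "Approved", "Published"],
})
print(vocabulary.check("content_status", "Approved"))
```

Terms returned by `emerging` are candidates for promotion into the controlled vocabulary after review.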

36
Ensuring Metadata Gets Used
  • Whenever possible, automate the application of
    metadata (reduce author burden and inconsistency)
  • Categorization metadata based on content
  • Metadata based on the template and model
  • Inheritance of metadata based on the parent
  • Metadata based on position in the workflow
  • If it is necessary for authors to add metadata,
    make it possible for them to add the metadata as
    they are authoring so that they don't have to
    wait until the content is checked into CM
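
Inheritance of metadata from the parent is one of the easier automations to picture. The sketch below assumes the content forms a tree and that a child's own value overrides an inherited one; neither rule is stated in the source, and the names `effective_metadata`, `PARENTS`, and `OWN` are hypothetical.

```python
def effective_metadata(node_id, parents, own_metadata):
    """Merge metadata from the root down to `node_id`.

    `parents` maps each node id to its parent id (None for the root);
    `own_metadata` maps node ids to the metadata set directly on them.
    Values set lower in the tree override inherited ones (an assumption).
    """
    chain = []
    current = node_id
    while current is not None:
        chain.append(current)
        current = parents.get(current)
    merged = {}
    for ancestor in reversed(chain):
        merged.update(own_metadata.get(ancestor, {}))
    return merged


# Hypothetical content tree: a user guide, one chapter, one element.
PARENTS = {"guide": None, "chapter1": "guide", "intro": "chapter1"}
OWN = {
    "guide": {"product": "ABC", "information_product": "User guide"},
    "chapter1": {"audience": "consumer"},
    "intro": {"content_type": "overview"},
}

print(effective_metadata("intro", PARENTS, OWN))
```

The author only tags the element with its content type; product, information product, and audience arrive from the containers.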

37
Metadata Sharing
  • Industry-specific standards
  • Consider using RDF to design your metadata
  • Consider using a crosswalk (a table to map
    metadata from one structure to another) to
    provide a metadata interchange
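
A crosswalk is just a mapping table, so a dictionary is enough to sketch one. The target names below (`title`, `creator`, `subject`, `date`, `identifier`) are Dublin Core elements; the internal field names and the pairing between the two sides are assumptions made for illustration, as are the function name `convert` and the sample record.

```python
# Hypothetical crosswalk from internal field names to Dublin Core elements.
CROSSWALK = {
    "ID": "identifier",
    "Title": "title",
    "Author": "creator",
    "Keywords": "subject",
    "Create date": "date",
}


def convert(record, crosswalk):
    """Map a record's fields through the crosswalk.

    Returns (converted, unmapped): fields with no entry in the table are
    returned separately rather than silently dropped.
    """
    converted, unmapped = {}, {}
    for field, value in record.items():
        target = crosswalk.get(field)
        if target is None:
            unmapped[field] = value
        else:
            converted[target] = value
    return converted, unmapped


# Hypothetical record using values from the management metadata example.
record = {"ID": "C123", "Title": "Intro", "Author": "O234", "Status": "Status1"}
converted, unmapped = convert(record, CROSSWALK)
print(converted, unmapped)
```

Reporting unmapped fields makes gaps in the crosswalk visible instead of losing metadata during interchange.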

38
Another View of Types of Metadata
39
Types of Metadata
  • Structure metadata: The ruling monarch of
    metadata. It precedes most other kinds of
    metadata by creating structural divisions in your
    content.
  • Format metadata: Applies to any level of
    structure that you define and marks how you
    intend to render that structure.
  • Access metadata: Organizes the structures that
    you create into hierarchies and other access
    structures.
  • Management metadata: The data that you attach to
    structures to administer and track them.
  • Inclusion metadata: Stands in for external
    content. It marks the place where the external
    content is to go.

40
Structure Metadata
  • Structure metadata says, "You call this stuff . .
    ."
  • Before you can say anything more about something,
    you must name that something
  • Basic structure metadata: characters, words,
    paragraphs
  • Elements
  • Collections of characters, words, or paragraphs
    that you intend the reader to take as a unit
    (such as a title).
  • It's the smallest structure that you intend to
    access separately in your system
  • Components
  • Collections of elements that you intend the user
    to take as a whole (such as a white paper).
  • Components are the structures that you intend to
    manage.
  • They're the structures to which you apply
    management and access metadata.

41
Structure Metadata (Cont.)
  • Nodes
  • Collections of components that, after
    publication, you intend the reader to take as a
    unit.
  • On Web sites, nodes are pages. In print
    materials, nodes are sections (headings,
    chapters, parts, and so on)
  • Publications
  • Collections of nodes that you intend readers to
    take as a unit (a single department's intranet
    site, for example).
  • On the Web, you set off publications from each
    other mostly by using graphic conventions and the
    internal navigation conventions of the site. (A
    site may have one or more publications on it).
  • In print, you most often delineate publications
    by using a file boundary.
  • Publication groups
  • Collections of publications that you intend the
    reader to take as a unit (the volumes in an
    encyclopedia, for example).
  • Set off both on the Web and in print by the
    formatting conventions and navigational
    structures that you provide for moving between
    publications in the group.

42
Structure Metadata Example
    <COLLECTION>
      <PUB>
        <SECTION>
          <NODE>
            <HEADER>...</HEADER>
            <COMPONENTS>
              <COMPONENT>
                <ELEMENT>
                  <PARA>
                    <COMPONENT>...</COMPONENT>
                  </PARA>
                </ELEMENT>
              </COMPONENT>
            </COMPONENTS>
            <FOOTER>...</FOOTER>
          </NODE>
        </SECTION>
      </PUB>
    </COLLECTION>

    <ELEMENT>
      <PARA>
        This is my body text, and in it I'm embedding an image.
        <MEDIA ID="m1" URL="dabw.jpg">
          <SIZE>100,300</SIZE>
          <CAPTION>This is a separate component</CAPTION>
        </MEDIA>
      </PARA>
      Normal things seem strange if you really think about them!
    </ELEMENT>

43
Format Metadata
  • Format metadata says, "Here's how to render the
    stuff that I surround."
  • Format metadata can apply to any level of
    structure in your system.
  • In many cases, the structural tags themselves are
    what you interpret and turn into
    platform-specific formatting metadata
  • <SECTION LEVEL="1">Some Section</SECTION>
    → <H1>Some Section</H1>
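
The SECTION-to-H1 mapping above can be sketched with Python's standard `xml.etree.ElementTree` module. This is only an illustration of interpreting a structural tag as platform-specific formatting: the function name `section_to_heading` is hypothetical, and clamping LEVEL to the range 1 to 6 is an assumption based on HTML's heading levels.

```python
import xml.etree.ElementTree as ET


def section_to_heading(xml_text):
    """Render a <SECTION LEVEL="n"> element as the HTML heading <Hn>."""
    section = ET.fromstring(xml_text)
    if section.tag != "SECTION":
        raise ValueError(f"expected a SECTION element, got {section.tag}")
    # HTML defines H1 through H6 only, so clamp the level (an assumption).
    level = min(max(int(section.get("LEVEL", "1")), 1), 6)
    heading = ET.Element(f"H{level}")
    heading.text = section.text
    return ET.tostring(heading, encoding="unicode")


print(section_to_heading('<SECTION LEVEL="1">Some Section</SECTION>'))
# prints <H1>Some Section</H1>
```

Because the content stores only the structural level, a print template could map the same SECTION tag to a different heading style.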

44
Format Metadata Example
    <COLLECTION>
      <PUB DISPLAY="child">
        <SECTION>
          <NODE>
            <HEADER>...</HEADER>
            <COMPONENTS LAYOUT="table">
              <COMPONENT>
                <ELEMENT TYPEFACE="Arial">
                  <PARA STYLE="body">
                    Some <FORMATTAG>text</FORMATTAG>
                    <COMPONENT>...</COMPONENT>
                  </PARA>
                </ELEMENT>
              </COMPONENT>
            </COMPONENTS>
            <FOOTER>...</FOOTER>
          </NODE>
        </SECTION>
      </PUB>
    </COLLECTION>

Usually in a template and not a content structure
45
Access Metadata
  • Access metadata says, "Here is how this structure
    fits in with the rest."
  • You most often use it to gain access to the
    content.
  • You can store access metadata within a component
    or outside it in a separate place.
  • The types of access metadata correspond to the
    types of access structures: hierarchies, indexes,
    associations, and sequences.

46
Access Metadata Example
    <COLLECTION>
      <PUB>
        <SECTION>
          <NODE KEYWORDS="rollup">
            <HEADER>...</HEADER>
            <COMPONENTS>
              <COMPONENT INDEX="term1,term2,term3">
                <ELEMENT>
                  <PARA>
                    <COMPONENT>...</COMPONENT>
                    <LINK TARGET="C123">For more info, see</LINK>
                  </PARA>
                </ELEMENT>
              </COMPONENT>
            </COMPONENTS>
            <FOOTER>...</FOOTER>
          </NODE>
        </SECTION>
      </PUB>
    </COLLECTION>

For more information, <A HREF></A>
For more information, see Links in Chap. 5
47
Access Metadata (Cont.)
  • Access metadata is as often outside the content
    structure as it is inside.
  • Instead of typing the terms into the component,
    you want to type the component into the terms

    <INDEX>
      <TERM>
        <NAME>NOAA</NAME>
        <COMPONENTS>C123,C456,C789</COMPONENTS>
      </TERM>
    </INDEX>
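
"Typing the component into the terms" amounts to inverting the per-component term lists into a term-to-components index. The sketch below derives that external index from INDEX attributes using the standard `xml.etree.ElementTree` module. Putting ID and INDEX on the same COMPONENT tag, the sample terms other than NOAA, and the function names are assumptions for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def build_index(xml_text):
    """Invert per-component INDEX attributes into a term -> component ids map."""
    root = ET.fromstring(xml_text)
    index = defaultdict(list)
    for component in root.iter("COMPONENT"):
        for term in component.get("INDEX", "").split(","):
            term = term.strip()
            if term:
                index[term].append(component.get("ID"))
    return dict(index)


def index_to_xml(index):
    """Serialize the index in the INDEX/TERM/NAME/COMPONENTS layout."""
    root = ET.Element("INDEX")
    for name in sorted(index):
        term = ET.SubElement(root, "TERM")
        ET.SubElement(term, "NAME").text = name
        ET.SubElement(term, "COMPONENTS").text = ",".join(index[name])
    return ET.tostring(root, encoding="unicode")


# Hypothetical content; only the NOAA term and the ids come from the slide.
CONTENT = """\
<COMPONENTS>
  <COMPONENT ID="C123" INDEX="NOAA,weather"/>
  <COMPONENT ID="C456" INDEX="NOAA"/>
  <COMPONENT ID="C789" INDEX="NOAA, tides"/>
</COMPONENTS>
"""

index = build_index(CONTENT)
print(index["NOAA"])
```

Keeping the index outside the components means a term can be renamed or merged in one place.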
48
Management Metadata
  • Management metadata is there to help you keep
    track of and administrate content.
  • ID, Title, Author, Create date, Modify date,
    Status, Size, Owner, Publish date, Expire date
  • Management metadata isn't always only for
    management.
  • Any of the types listed here you can just as
    easily consider as content to publish as well as
    data to help manage the content to publish.
  • Whether or not you show the values of these
    metadata elements to your audience, their use to
    you is the same, to help you keep track of and
    administrate your content.

49
Management Metadata Example
    <COLLECTION>
      <PUB ID="p1">
        <SECTION ID="s1">
          <NODE ID="n1">
            <HEADER>...</HEADER>
            <COMPONENTS>
              <COMPONENT ID="C123">
                <TITLE></TITLE>
                <ADMIN>
                  <OWNER>O234</OWNER>
                  <CREATE>9/23/01</CREATE>
                  <MODIFY>9/30/01</MODIFY>
                  <STATUS>Status1</STATUS>
                </ADMIN>
                <ELEMENT NAME="intro">
                  <PARA ID="p1">...</PARA>
                </ELEMENT>
              </COMPONENT>
            </COMPONENTS>
            <FOOTER>...</FOOTER>
          </NODE>
        </SECTION>
      </PUB>
    </COLLECTION>

50
Inclusion Metadata
  • Inclusion metadata says, "Put the following
    external entity here."
  • It enables you to reference content that isn't
    physically in the content structure.

    <ELEMENT>
      <PARA>
        This is my body text, and in it I'm embedding an image.
        <MEDIA ID="m1" URL="dabw.jpg">
          <SIZE>100,300</SIZE>
          <CAPTION>This is a separate component</CAPTION>
        </MEDIA>
      </PARA>
      Normal things seem strange if you really think about them!
    </ELEMENT>
51
Inclusion Metadata Example
    <ELEMENT>
      <PARA>
        This is my body text, and in it I'm embedding an image.
        <MEDIA ID="m1" URL="dabw.jpg">
          <SIZE>100,300</SIZE>
          <CAPTION>This is a separate component</CAPTION>
        </MEDIA>
      </PARA>
      Normal things seem strange if you really think about them!
    </ELEMENT>
  • Shortcomings
  • This is HTML
  • The image and its caption are locked in this
    location
  • The reference may break
  • You have no place to put other info that you may
    need

52
Inclusion Metadata (Cont.)
  • If you really intend to make the <MEDIA> element
    a separate component, you're better off not
    directly embedding it in another component by
    pointing to its URL but instead referencing it
    there based on its ID
  • The m1 component is stored with the other "m"
    components where you can more easily find,
    manage, and include it in other places in the
    content structure

    <ELEMENT>
      <PARA>
        This is my body text, and in it I'm embedding an image.
        <INCLUDE REFID="m1"/>
        Normal things seem strange if you really think about them!
      </PARA>
    </ELEMENT>
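
Referencing by ID implies a publishing step that swaps each INCLUDE for the stored component. A minimal sketch of that step follows, using the standard `xml.etree.ElementTree` module. The function name `resolve_includes`, the in-memory `LIBRARY` dictionary standing in for the component store, and the choice to raise an error on a missing ID are assumptions; nested includes inside an included component are not resolved.

```python
import copy
import xml.etree.ElementTree as ET


def resolve_includes(element, library):
    """Replace each <INCLUDE REFID="..."/> below `element` with a copy of
    the component stored under that id in `library`."""
    # Snapshot the tree first so replacing children does not disturb iteration.
    for parent in list(element.iter()):
        for position, child in enumerate(list(parent)):
            if child.tag != "INCLUDE":
                continue
            ref_id = child.get("REFID")
            if ref_id not in library:
                raise KeyError(f"no component with id {ref_id!r}")
            replacement = copy.deepcopy(library[ref_id])
            replacement.tail = child.tail  # keep the text that followed the INCLUDE
            parent[position] = replacement
    return element


# Hypothetical component store holding the "m1" media component.
LIBRARY = {
    "m1": ET.fromstring(
        '<MEDIA ID="m1" URL="dabw.jpg"><SIZE>100,300</SIZE>'
        "<CAPTION>This is a separate component</CAPTION></MEDIA>"
    ),
}

document = ET.fromstring(
    '<ELEMENT><PARA>Body text.<INCLUDE REFID="m1"/>More text.</PARA></ELEMENT>'
)
resolve_includes(document, LIBRARY)
print(ET.tostring(document, encoding="unicode"))
```

Copying the stored component, rather than moving it, lets the same "m1" be included in other places in the content structure.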