Repurposing Documents Into Semantic Web Services and Networks

1
Repurposing Documents Into Semantic Web Services
and Networks
  • Web Site Content Management for Government
    Conference
  • Brand Niemann
  • Computer Scientist and XML Web Services
    Specialist
  • U.S. Environmental Protection Agency
  • November 17-19, 2003
  • Doubletree Hotel, Arlington, VA

2
Abstract
  • In this age of eGovernment and Enterprise
    Architecture, whose objectives are increased
    collaboration, consolidation, and integration in
    order to transform from organization-centric to
    citizen/customer-centric, some key questions to
    ask yourself and others are (1) How "smart" is
    your data and information? and (2) Who are you
    collaborating with? Some key goals are then (1)
    "smarter data" (put more effort into the data
    than the applications) and (2) more collaboration
    (on and by means of smarter data). Indeed, the
    goal of the CIO Council's Emerging Technology
    Subcommittee is to flatten the Gartner Hype
    Cycle 2003 of adoption and implementation for
    selected emerging technologies like the Semantic
    Web and External Web Services (outside the
    firewall).
  • The XML Web Services Working Group has engaged in
    a series of pilot projects to promote the
    aforementioned objectives and goals, including the
    development of methods for repurposing documents
    into semantic Web Services and networks for
    building knowledge-centric communities for
    environmental information and for implementing
    component-based government enterprise
    architectures. The methods involve extracting
    information object types from documents by asking
    "What is this information trying to accomplish?"
    and how it should be organized into a taxonomy,
    and ultimately an ontology. The methods will be
    demonstrated in the presentation.

3
Overview
  • 1. The CIO Council's Emerging Technology
    Subcommittee.
  • 2. Open Collaboration with Open Standards.
  • 3. XML Markup and Metadata and Semantic Web
    Services.
  • 4. Repurposing Documents Into Semantic Web
    Services and Networks.
  • 5. Conclusions.
  • 6. Questions and Answers.

4
1. The CIO Council's Emerging Technology
Subcommittee
  • 1.1 Key Messages.
  • 1.2 The Emerging Technology Life Cycle.

5
1.1 The CIO Council's Emerging Technology
Subcommittee's Key Messages
  • Supports Federal agencies and partners as they
    assess new technologies because:
  • Organizations have limited expertise and
    resources; and
  • Individualized vendor marketing to multiple
    agencies is neither cost-effective nor possible
    for new, small innovative companies.
  • Provides an intergovernmental process for pilot
    projects and technology assessment initiatives in
    support of:
  • A vendor clearinghouse;
  • Government-wide reusable components; and
  • Federal and intergovernmental lines of business.

6
1.2 The Emerging Technology Life Cycle
[Figure: Gartner Hype Cycle curve. Vertical axis:
Visibility. Horizontal axis: Maturity, running
through Technology Trigger, Peak of Inflated
Expectations, Trough of Disillusionment, Slope of
Enlightenment, and Plateau of Productivity.
Technologies plotted: Semantic Web,
Web-Services-Enabled Business Models, External Web
Services Deployments, Extensible Business Reporting
Language, and Internal Web Services. Legend: five to
10 years, or less than two years. Non-Web Services
omitted.]
Note: Our purpose is to try to flatten the curve.
Source: Gartner, as of July 2003.
7
1.2 The Emerging Technology Life Cycle
  • Gartner Hype Cycle 2003 Definitions
  • Technology Trigger: A breakthrough, public
    demonstration, product launch or other event
    generates significant press and industry
    interest.
  • Peak of Inflated Expectations: During this phase
    of over-enthusiasm and unrealistic projections, a
    flurry of well-publicized activity by technology
    leaders results in some successes, but more
    failures, as the technology is pushed to its
    limits. The only enterprises making money are
    conference managers and magazine publishers.
  • Trough of Disillusionment: Because the technology
    does not live up to its over-inflated
    expectations, it rapidly becomes unfashionable.
    Media interest wanes, except for a few cautionary
    tales.
  • Slope of Enlightenment: Focused implementation
    and solid hard work by an increasingly diverse
    range of organizations lead to a true
    understanding of the technology's applicability,
    risks and benefits. Commercial, off-the-shelf
    methodologies and tools ease the development
    process.
  • Plateau of Productivity: The real-world benefits
    of the technology are demonstrated and accepted.
    Tools and methodologies are increasingly stable
    as they enter their second and third generations.
    The final height of the plateau varies according
    to whether the technology is broadly applicable
    or benefits only a niche market. Approximately 30
    percent of the technology's target audience has
    adopted or is adopting the technology as it
    enters the Plateau.
  • Time to Plateau/Adoption Speed: The time required
    for the technology to reach the Plateau of
    Productivity.

8
1.2 The Emerging Technology Life Cycle
  • Hype Cycle for Government Technologies, 2003
    (Gartner Strategic Analysis report, June 13,
    2003)
  • Semantic Web
  • Definition: Extends the World Wide Web through
    semantic markup languages such as Resource
    Description Framework (RDF), Web Ontology
    Language (OWL), and Topic Maps that describe
    entities and their relationships in the
    underlying document (see Innovative Approaches
    for Improving Information Supply, M-14-3517).
  • Time to Plateau/Adoption Speed: Five to 10 years.
  • Justification for Hype Cycle Position/Adoption
    Speed: So far, there has been little deployment
    of the Semantic Web and there is a significant
    skill shortage.
  • Business Impact Areas: Can affect the management
    of public sector information. Can provide
    breakthroughs to make the most of government
    metadata modeling.
  • Analysis by Alex Linden.

9
2. Open Collaboration with Open Standards
  • 2.1 Pursuing a Vision of Live Publishing on the
    Web.
  • 2.2 The XML Web Services Content Authoring Pilot.
  • 2.3 Emerging Components at Componenttechnology.org.

10
2.1 Pursuing a Vision of Live Publishing on the
Web
  • Late 1980s
  • U.S. EPA's Center for Environmental Statistics
  • Guide to Selected National Environmental
    Statistics in the U.S. Government.
  • Both a print version and a hypertext version,
    before the Web.
  • The text could be readily updated, but the
    graphics couldn't.
  • Mid-1990s
  • Interagency Working Group on Sustainable
    Development Indicators
  • Reports on Sustainable Development Indicators and
    A Digital Library of the State of the
    Environment.
  • Both a print version and a hypertext version on
    the Web.
  • The text could be readily updated, but the
    graphics still couldn't.
  • Mid-2002
  • XML Working Group (July 17, 2002, John Turnbull,
    Corel)
  • Corel's XMetal and Smart Graphics Studio.
  • Both the text and the graphics could be readily
    updated, and it used XML (SVG)!

11
2.2 The XML Web Services Content Authoring Pilot
  • October 29, 2002, The Promise of XML Web
    Services for Government, FedWeb Fall '02, George
    Mason University, Arlington, VA
  • Corel's XMetal (Jay Di Silvestri), XyEnterprise's
    Content@, and NextPage's Triad.
  • March 13, 2003, "Bringing XML Web Services to
    Your Agency": The CIO Council's XML Web Services
    Working Group and Some Examples, Corel Smart
    Graphics Studio and XMetal, Workshop for the USDA
    Economic Research Service
  • Scott Edwards, Jim Buttinger, Mary Romeo, and
    Shawn Henderson.

12
2.2 The XML Web Services Content Authoring Pilot
Change the table and the graph changes. . .
13
2.2 The XML Web Services Content Authoring Pilot
XML Web Services Repository and Distributed
Content Network
Then: XyEnterprise's Content@, NextPage's NXT 3 and
Solo, Corel's XMetal.
Now: NXT 4, Smart Graphics Studio, XPP Web Services.
Multiple vendors providing an end-to-end solution
based on XML standards.
14
2.2 The XML Web Services Content Authoring Pilot
  • March 21, 2003, Corel's XMetal and Smart Graphics
    Studio, 2003 FOSE Best New Technology Award
    Finalists in the Electronic Government Software
    Category
  • Winner for Smart Graphics Studio!
  • May 14, 2003, XML Web Services Working Group
    Meeting, XML Web Services Content Authoring the
    State of the Chesapeake Bay Report
  • Corel's XMetal and Smart Graphics Studio.
  • May 21, 2003, Information Management Subcommittee
    Meeting of the Chesapeake Bay Program, XML
    Authoring the State of the Chesapeake Bay Report
  • Corel's XMetal and Smart Graphics Studio.
  • September 29, 2003, XML Authoring and Editing
    Forum
  • John Turnbull, Jay Di Silvestri, and Bill Kirk.

15
2.2 The XML Web Services Content Authoring Pilot
  • September 29, 2003, XML Authoring and Editing
    Forum
  • "Taking the Pulse of XML Editing," Kendall Grant
    Clark, XML.com, October 1, 2003
  • "If I had a group of end users who needed to do
    lots of stuff with SVG and XML creation, I'd
    probably give them XMetal, and the Corel graphics
    tool, on that basis alone."
  • August 28, 2003, Information Discovery and Data
    Exploitation
  • Technical Exchange Meeting Report of the
    Intelligence Community Metadata Working Group (IC
    MWG), October 17, 2003
  • "In the best of all worlds, XML and metadata
    should be embedded in the resource, but we will
    take it any way we can if it gets everyone to
    share their information, to improve search,
    discovery, and exploitation."
  • "The use of well-designed GUIs for XML and
    metadata insertion has the potential to ease the
    burden of placing metadata and ultimately
    increasing productivity downstream."

16
(No Transcript)
17
2.3 Emerging Components at Componenttechnology.org
  • Workforce Connections Background
  • A Web application sponsored by the U.S.
    Department of Labor that a non-technical user can
    use to manage Web content as objects that can be
    shared, modified, and restyled for traditional
    Web sites, on-line learning, and communities of
    practice.
  • ADL SCORM Version 1.2 conformant and Section
    508 compliant.
  • Same administrative interface, different public
    looks.
  • Supports the "ilities": portability, reusability,
    interoperability, sharability, and
    maintainability.
  • Scalable, high-performance, and high availability
    for public Web sites, Intranets, or highly secure
    environments.
  • Ongoing effort based on components that will be
    continually improved through collaboration.
  • Open collaboration with open standards approach!

18
2.3 Emerging Components at Componenttechnology.org
  • Workforce Connections Standards
  • ADL SCORM Version 1.2 conformant
  • Advanced Distributed Learning, Sharable Content
    Object Reference Model
  • XML for metadata about the content and export of
    content in XML with an XML Schema
  • http://www.adlnet.org/index.cfm?fuseaction=scormabt
  • RDF support allows one to add an RSS-compatible
    newsfeed from someone else's Web site.
  • Section 508 compliant
  • Section 508 of the Rehabilitation Act requires
    that Federal agencies' electronic and information
    technology be accessible to people with
    disabilities
  • http://www.section508.gov

19
2.3 Emerging Components at Componenttechnology.org
20
2.3 Emerging Components at Componenttechnology.org
  • Workforce Connections General Concepts
  • MetaSite
  • An installation of WFC with a Content Repository
    to store the Content Objects which can be shared
    between multiple SubSites.
  • SubSite
  • The tools (Folders, Pages, and Content Objects)
    for user(s) to create a Web site.
  • Users
  • Each member of the content team has an
    authentication password and roles that control
    access to Content Objects.
  • Content Objects
  • Chunks of content that can be organized on a
    page and reused elsewhere in a site. Currently:
    Paragraphs, Instructions, Multiple Choice
    Questions, Matching Questions, and Syndication
    Sources. Paragraphs support Text, Images, Audio,
    Video, and Flash Objects.

See http://ezro.devis.com
21
2.3 Emerging Components at Componenttechnology.org
http://www.onestopcoach.org/
http://www.disabilityinfo.gov/
22
3. XML Markup and Metadata and Semantic Web
Services
  • 3.1 The Benefits of Structured Content: Chunking
    a Press Release.
  • 3.2 A Simple Example of the Benefits of XML:
    Searching for Information.
  • 3.3 Clustering Search Engine: Vivisimo.
  • 3.4 RDF Dublin Core Metadata and Relationships.
  • 3.5 Semantic Web Services Definition and Process.

23
3.1 The Benefits of Structured Content: Chunking
a Press Release
<RELEASE>
  <Date></Date>
  <Heading>
    <Headline></Headline>
    <Subhead></Subhead>
  </Heading>
  <Contact>
    <Name></Name>
    <Title></Title>
    <Phone></Phone>
    <Email></Email>
  </Contact>
  <Body>
    <Para1></Para1>
    <MainBody></MainBody>
    <ClosingPara></ClosingPara>
  </Body>
</RELEASE>
Contact: Lynn Cheryan, Production Director, Tel
301-495-7345 x122, icheryan@dev.com
Headline: IDEV Redesigns Web Site For Industry Group
Purchasing Association.
Subhead: Redesigned Web Site intended to Expand the
Resources of the Association Staff.
First Paragraph: IDEV, a full-service Web development and
consulting agency in metro-D.C. Launched the
redesign of the
Source: Tony Byrne, "The Siren Song of Structure: Heeding
the Call of Reusability," EContent, September 2002.
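As a rough illustration of why chunking pays off, the sketch below parses a release marked up with the element names from this slide and pulls out individual chunks for reuse elsewhere. The sample values are the ones shown on the slide; the `summarize` helper is a hypothetical name and not part of any tool mentioned in this presentation.

```python
import xml.etree.ElementTree as ET

# A release that uses the element names from the slide (some chunks left empty).
RELEASE_XML = """
<RELEASE>
  <Date></Date>
  <Heading>
    <Headline>IDEV Redesigns Web Site For Industry Group Purchasing Association.</Headline>
    <Subhead>Redesigned Web Site intended to Expand the Resources of the Association Staff.</Subhead>
  </Heading>
  <Contact>
    <Name>Lynn Cheryan</Name>
    <Title>Production Director</Title>
  </Contact>
  <Body>
    <Para1></Para1>
  </Body>
</RELEASE>
""".strip()


def summarize(release_xml: str) -> dict:
    """Return the chunks a home page or news feed might reuse."""
    root = ET.fromstring(release_xml)
    return {
        "headline": root.findtext("Heading/Headline", default=""),
        "subhead": root.findtext("Heading/Subhead", default=""),
        "contact": root.findtext("Contact/Name", default=""),
    }


chunks = summarize(RELEASE_XML)
print(chunks["headline"])
```

Because each chunk has its own element, a consumer can take the headline alone without scraping it out of formatted text.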
24
3.2 A Simple Example of the Benefits of XML:
Searching for Information
  • Most services are invoked by inputting data into
    HTML forms and sending the data to the service,
    embedded within a URL string, to match the given
    text strings to catalogued HTML pages:
  • http://www.google.com/search?q=Skate+boots&btnG=Google+Search
  • XML is a better way to send the data:
      <SOAP-ENV:Body>
        <s:SearchRequest xmlns:s="www.xmlbus.com/SearchService">
          <p1>Skate</p1>
          <p2>boots</p2>
          <p3>size 7.5</p3>
        </s:SearchRequest>
      </SOAP-ENV:Body>

Source: Eric Newcomer, 2002, Understanding Web Services,
Addison-Wesley, pp. 4-5.
Next: XQuery with native XML databases!
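The contrast on this slide can be sketched in a few lines of Python. This is a minimal illustration, not code from the cited book: `build_search_request` is a hypothetical helper, and the SOAP 1.1 envelope namespace is supplied here because the slide shows only the `SOAP-ENV` prefix.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Form-style request: the parameters are flattened into a URL query string.
query_url = "http://www.google.com/search?" + urlencode(
    {"q": "Skate boots", "btnG": "Google Search"}
)

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 namespace (assumed)
SEARCH_NS = "www.xmlbus.com/SearchService"              # namespace shown on the slide


def build_search_request(*terms: str) -> str:
    """Build a SOAP Body whose search terms are separate elements p1, p2, ..."""
    body = ET.Element(f"{{{SOAP_ENV}}}Body")
    request = ET.SubElement(body, f"{{{SEARCH_NS}}}SearchRequest")
    for position, term in enumerate(terms, start=1):
        ET.SubElement(request, f"p{position}").text = term
    return ET.tostring(body, encoding="unicode")


soap_body = build_search_request("Skate", "boots", "size 7.5")
print(query_url)
print(soap_body)
```

In the XML form each term keeps its own element, so the receiving service does not have to guess where one parameter ends and the next begins.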
25
3.3 Clustering Search Engine: Vivisimo
  • Information Overlook imposes high opportunity
    costs.
  • Alleviate by showing organized info.
  • Clustering into folders is a natural approach.
  • Hard to do well, but now a solved problem using
    an algorithm augmented by an ontology.
  • Uses title, snippet, and (optionally) meta-tags.
    No taxonomy-building headaches.
  • Overlays any search engine (Google, Autonomy,
    Verity, etc.).
  • Advanced management features support XML
    standards.
  • Best meta-search site for the last 2 years in a
    row (Search Engine Watch).
  • See http://vivisimo.com/gov and
    http://vivisimo.com/demo/Clustering_Engine_Demos/Government.html

26
3.4 RDF Dublin Core Metadata and Relationships
  • The Resource Description Framework (RDF) is an
    XML-based language to describe resources and is
    designed to create metadata about the resource
    as a standalone entity. The RDF model is often
    called a triple because it has three parts: (1)
    a resource, (2) a resource's properties, and (3)
    the property values.
  • The knowledge representation community uses the
    grammatical parts of a sentence: (1) subject, (2)
    predicate, and (3) object.
  • RDF Schema is a language layer on top of RDF in
    what is called the Semantic Web Stack. Above
    RDF Schema are ontologies, and above that is the
    third and final web in Tim Berners-Lee's
    three-part vision (collaborative web, Semantic
    Web, web of trust).
  • XML Topic Maps are popular implementations of
    taxonomies and have complementary characteristics
    to RDF.
  • Three excellent resources are:
  • Practical RDF: Solving Problems with the Resource
    Description Framework, Shelley Powers, O'Reilly,
    July 2003;
  • The Semantic Web: A Guide to the Future of XML,
    Web Services, and Knowledge Management, Wiley
    Technology Publishing, June 2003; and
  • XML Topic Maps: Creating and Using Topic Maps for
    the Web, Addison Wesley, July 2002.
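The triple model described above can be sketched without any RDF library. The identifiers below (`ex:Company`, `ex:Report`, and so on) are hypothetical placeholders chosen for illustration; a real RDF store would use full URIs.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str    # the resource being described
    predicate: str  # a property of that resource
    obj: str        # the property value (another resource or a literal)


# Hypothetical statements; the first mirrors "The company sells batteries."
triples = [
    Triple("ex:Company", "ex:sells", "ex:Batteries"),
    Triple("ex:Report", "dc:title", "America's Children and the Environment"),
    Triple("ex:Report", "dc:date", "2003"),
]


def describe(resource: str, store: list[Triple]) -> dict[str, str]:
    """Collect every property/value pair asserted about one resource."""
    return {t.predicate: t.obj for t in store if t.subject == resource}


print(describe("ex:Report", triples))
```

Each statement stands alone, which is what lets metadata from different sources be merged simply by pooling their triples.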

27
3.4 RDF Dublin Core Metadata and Relationships
[Figure: two diagrams. "RDF Triple Components" labels
the sentence "The company sells batteries." with
Subject (a Resource, identified by a URI), Predicate,
and Object (a Resource or a Literal). "Key Ontology
Components" shows the classes Person (birthdate:
date; gender: char), Image, Leader, and Organization,
connected by the properties or associations
depiction, knows, published, works for, is-A, and
leads.]
Source: The Semantic Web: A Guide to the Future
of XML, Web Services, and Knowledge Management,
Wiley Technology Publishing, June 2003.
28
3.4 RDF Dublin Core Metadata and Relationships
RDF Semantic links - "Joining the Web"
Source: Standards, Semantics and Survival, by Tim
Berners-Lee, Director, World Wide Web
Consortium, January 2003.
29
3.4 RDF Dublin Core Metadata and Relationships
  • The Dublin Core Metadata Initiative is a
    cross-disciplinary international effort to
    develop mechanisms for the discovery-oriented
    description of diverse resources in an electronic
    environment. The Dublin Core Element Set is a
    list of fifteen fixed elements that capture a
    representation of essential aspects related to
    the description of resources. A complete list of
    Dublin Core metadata elements (e.g., author,
    title, creation date, etc.) can be found at
    http://dublincore.org/documents/1999/07/02/dces/
  • Metadata can exist within the resource that it is
    describing (internal metadata), or it can exist
    in a separate file (external metadata) that is
    associated with the content file.
  • NXT 3 (recall slide 13) capitalizes on the
    following aspects of RDF:
  • When you use Manage Content to assign metadata to
    a file, NXT 3 places this metadata in an
    associated RDF file.
  • RDF makes it possible for NXT 3 to specify
    semantics for data based on XML in a
    standardized, interoperable manner.
  • RDF facilitates resource discovery, providing NXT
    3 with better search engine capabilities.
  • RDF facilitates cataloging of both content
    descriptions and content relationships available
    at a particular Web site, page, or digital
    library.
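To make the idea of external metadata concrete, here is a minimal sketch that writes Dublin Core properties about a content file into a separate RDF/XML description. It is not NXT 3's actual file format, which this presentation does not show; the `external_metadata` helper and the chosen field values are assumptions for illustration. The two namespace URIs are the standard RDF and Dublin Core element namespaces.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("dc", DC_NS)


def external_metadata(resource_url: str, fields: dict[str, str]) -> str:
    """Describe a resource in a standalone RDF/XML document (external metadata)."""
    rdf = ET.Element(f"{{{RDF_NS}}}RDF")
    description = ET.SubElement(
        rdf, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": resource_url}
    )
    for name, value in fields.items():
        # Each Dublin Core element becomes one property of the resource.
        ET.SubElement(description, f"{{{DC_NS}}}{name}").text = value
    return ET.tostring(rdf, encoding="unicode")


rdf_xml = external_metadata(
    "http://www.epa.gov/envirohealth/children/ace_2003.pdf",
    {"title": "America's Children and the Environment", "date": "2003"},
)
print(rdf_xml)
```

The content file itself is untouched, so the same approach works for proprietary formats such as PDF that cannot easily carry embedded metadata.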

30
3.4 RDF Dublin Core Metadata and Relationships
Manage Content in NextPages NXT 3
31
3.5 Semantic Web Services Definition and Process
  • The Semantic Web is a machine-readable web of
    smart data and automated services that amplify
    the Web far beyond current capabilities.
  • Smart data is data that is
    application-independent, composable, classified,
    and part of a larger information ecosystem
    (ontology).
  • XML provides a simple yet robust mechanism for
    encoding semantic information (the meaning of
    data) and shifts the power from the application
    to the data.
  • But simple XML metadata is not enough because it
    only provides syntactic interoperability.
  • Additional XML-based Ontology languages are being
    developed to encode semantic interoperability.
  • In the next ten years, we will see semantics to
    describe problems and business processes in
    specialized domains.

32
3.5 Semantic Web Services Definition and Process
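[Figure: two-by-two grid. Columns: Static Resources
and Dynamic Resources. Rows: Interoperable Syntax
and Interoperable Semantics. WWW is static with
interoperable syntax; Web Services are dynamic with
interoperable syntax; the Semantic Web is static
with interoperable semantics; Semantic Web Services
are dynamic with interoperable semantics. Also
labeled: Corporate Ontology and Web Services
Registry.]
Source: Derived in part from two separate
presentations at the Web Services One Conference
2002 by Dieter Fensel and Dragan Sretenovic.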
33
3.5 Semantic Web Services Definition and Process
Knowledge Process in a Typical Organization
[Figure: a five-stage cycle: 1. Capture, 2.
Production, 3. Integration, 4. Discovery, 5.
Application. Callout labels: Not Saved, Lost Data
(twice), R.I.P. (twice), Stovepiped Systems, New,
Expensive Stovepipes, Searches, Manual Analysis of
All Information, Stored for Later Retrieval?,
Report, and Collaborative Report Writing.]
Source: The Semantic Web: A Guide to the Future
of XML, Web Services, and Knowledge Management,
Wiley Technology Publishing, June 2003.
34
3.5 Semantic Web Services Definition and Process
Source: The Semantic Web: A Guide to the Future
of XML, Web Services, and Knowledge Management,
Wiley Technology Publishing, June 2003.
35
3.5 Semantic Web Services Definition and Process
  • The Knowledge-Centric Organization Roadmap
  • Prepare for Change
  • E.g., pick a core team that will help communicate
    the vision.
  • Begin Learning
  • E.g., get your technical staff started.
  • Create Your Organization's Strategy
  • E.g., mark up your documents in XML, expose your
    applications as Web Services, etc.
  • Move Out!
  • With an intelligent plan and an incremental
    process, it is extremely doable, and it will be
    worth it when you get there.

Source: The Semantic Web: A Guide to the Future
of XML, Web Services, and Knowledge Management,
Wiley Technology Publishing, June 2003.
36
3.5 Semantic Web Services Definition and Process
  • An EPA Enterprise Integration Portal/Data
    Exchange Network Pilot
  • Organize selected EPA Web site content along
    subject lines that make sense to our audiences
    instead of along internal organizational
    boundaries.
  • Integrate four basic file types with XML
    indexing: proprietary (e.g., PDF, Word, etc.),
    Web, relational databases, and XML.
  • Provide for both centralized and distributed
    (e.g., C2G) content management.
  • Provide for different levels of XML authoring
    (e.g., tagless, tag-lite, and tag-rich).
  • Using NXT 3
  • Virtually connects the distributed information
    sources and makes them appear integrated to the
    user. Unlike syndication, in which content is
    copied and integrated with other content locally,
    NXT 3 keeps objects where they are.
  • Uses the standard Simple Object Access Protocol
    (SOAP) to exchange and normalize information
    between local content directories, assembling
    meta-indexes so that users can search or
    manipulate content transparently, regardless of
    physical location.
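The meta-index idea above can be sketched as an index of pointers rather than copies. This is only an illustration of the concept: it leaves out the SOAP exchange entirely, it is not how NXT 3 is implemented, and the directory names, document ids, and titles are hypothetical.

```python
from collections import defaultdict

# Hypothetical local content directories: document id -> title.
# The documents themselves stay where they are; only pointers are indexed.
DIRECTORIES = {
    "headquarters": {"hq-1": "Air quality trends", "hq-2": "Water quality report"},
    "region-3": {"r3-1": "Chesapeake Bay water quality"},
}


def build_meta_index(directories):
    """Map each lowercase title word to (directory, document id) pointers."""
    index = defaultdict(set)
    for directory, documents in directories.items():
        for doc_id, title in documents.items():
            for word in title.lower().split():
                index[word].add((directory, doc_id))
    return index


def search(index, word):
    """Return pointers to matching documents across all directories."""
    return sorted(index.get(word.lower(), set()))


meta_index = build_meta_index(DIRECTORIES)
print(search(meta_index, "water"))
```

A search returns locations in every participating directory, so the user sees one integrated result list even though nothing was copied to a central store.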

37
3.5 Semantic Web Services Definition and Process
Environmental Topics
Regional and State Data
38
3.5 Semantic Web Services Definition and Process
Advanced Search Across Part or All of the
Integrated Content Network
39
4. Repurposing Documents Into Semantic Web
Services and Networks
  • 4.1 America's Children and the Environment
    Report, 2003.
  • 4.2 Repurposing Suggestions.
  • 4.3 Initial Repurposing Results.
  • 4.4 Conversion to XML.
  • 4.5 Semantic Web Services Network Portal.
  • 4.6 Some Next Steps.
  • 4.7 NCES Semantic Web Prototypes.

40
4.1 America's Children and the Environment
Report, 2003
  • The report consists of 171 pages of text, tables,
    graphics, references, etc., and exists in two
    basic forms:
  • A 2 MB PDF file; and
  • A new HTML version (see next slide).
  • This document was converted to XML by several
    tools, but the automated conversion was
    practically worthless from a semantic point of
    view.
  • This single document covers so much information
    that it will benefit immensely from semantic
    dissection, linking, and augmentation (explosion
    of a single PDF file into multiple XML files
    stored in an XML database for reuse).

41
4.1 America's Children and the Environment
Report, 2003
http://www.epa.gov/envirohealth/children/
http://www.epa.gov/envirohealth/children/ace_2003.pdf
42
4.2 Repurposing Suggestions
  • Some of the types of XML Information Object
    Documents that individual paragraphs could be
    converted to:
  • 1. Document Structure: Table of Contents, Index,
    Title, etc.
  • 2. Finding: A short fact the document asserts as
    true, possibly through empirical evidence.
  • 3. Instruction: A tutorial on a topic.
  • 4. Terminology Definition: A definition of a
    term.
  • 5. Definition Example: A specific instance that
    illustrates that a definition is accurate and
    true.
  • 6. Process Definition: A description of a
    sequence of steps that causes an effect.
  • All these information object types were extracted
    from the document by looking at each paragraph
    and asking, "What is this information trying to
    accomplish?"
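One way to picture the six information object types is as a small data model. The sketch below is an assumption about how such objects might be represented, not the pilot's actual schema; the class names, the `object_id` scheme, and the placeholder paragraph text are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class ObjectType(Enum):
    DOCUMENT_STRUCTURE = "Document Structure"
    FINDING = "Finding"
    INSTRUCTION = "Instruction"
    TERMINOLOGY_DEFINITION = "Terminology Definition"
    DEFINITION_EXAMPLE = "Definition Example"
    PROCESS_DEFINITION = "Process Definition"


@dataclass
class InformationObject:
    object_id: str           # hypothetical identifier scheme
    object_type: ObjectType  # what the paragraph is trying to accomplish
    text: str                # the paragraph the object was extracted from


def group_by_type(objects):
    """Index information objects by type so each kind can be reused on its own."""
    groups = {}
    for obj in objects:
        groups.setdefault(obj.object_type, []).append(obj)
    return groups


# Placeholder paragraphs; a person still decides the type of each one.
objects = [
    InformationObject("p1", ObjectType.FINDING, "(paragraph stating a finding)"),
    InformationObject("p2", ObjectType.TERMINOLOGY_DEFINITION, "(paragraph defining a term)"),
    InformationObject("p3", ObjectType.FINDING, "(another finding)"),
]
groups = group_by_type(objects)
print([obj.object_id for obj in groups[ObjectType.FINDING]])
```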

43
4.2 Repurposing Suggestions
  • There will be more types, and we are getting at
    the pragmatic use of the document.
  • For example, someone could just get the chain of
    information objects on a single finding (in other
    words, the finding and everything that supports
    the finding).
  • Of course, all of these will be derived from an
    information object, so all of them could be
    assembled back into a single document.
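Retrieving "the finding and everything that supports the finding" amounts to following support links between information objects. The sketch below assumes each object records the ids of the objects that support it; that `supported_by` field, the ids, and the placeholder text are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    object_id: str
    text: str
    supported_by: list[str] = field(default_factory=list)  # ids of supporting objects


def support_chain(start_id: str, store: dict[str, Node]) -> list[str]:
    """Return the starting object and everything that supports it, depth first."""
    seen, order, stack = set(), [], [start_id]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guard against shared or circular support links
        seen.add(current)
        order.append(current)
        stack.extend(reversed(store[current].supported_by))
    return order


def reassemble(ids: list[str], store: dict[str, Node]) -> str:
    """Join the selected objects back into a single document."""
    return "\n\n".join(store[i].text for i in ids)


store = {
    "finding-1": Node("finding-1", "(finding)", ["table-1", "term-1"]),
    "table-1": Node("table-1", "(data table)", ["process-1"]),
    "term-1": Node("term-1", "(term definition)"),
    "process-1": Node("process-1", "(process definition)"),
    "finding-2": Node("finding-2", "(unrelated finding)"),
}
chain = support_chain("finding-1", store)
print(chain)
print(reassemble(chain, store))
```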

44
4.2 Repurposing Suggestions
  • So, how is this implemented ...
  • Each XML Information Object would be a separate
    XML document in an XML database.
  • Also, the inverse of this is that eventually some
    sort of digital production workbench could assist
    people in authoring these specific types of
    information objects (to author a finding, you
    need X, Y, and Z; to add a new term, you need ...).
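Storing each information object as its own XML document could look roughly like the following. The element and attribute names (`InformationObject`, `id`, `type`, `Text`) are hypothetical, and a plain dictionary stands in for the XML database.

```python
import xml.etree.ElementTree as ET


def to_xml_document(object_id: str, object_type: str, text: str) -> str:
    """Serialize one information object as a standalone XML document."""
    root = ET.Element("InformationObject", {"id": object_id, "type": object_type})
    ET.SubElement(root, "Text").text = text
    return ET.tostring(root, encoding="unicode")


# A dictionary keyed by object id stands in for the XML database.
xml_database: dict[str, str] = {}
for object_id, object_type, text in [
    ("finding-1", "Finding", "(paragraph stating a finding)"),
    ("term-1", "Terminology Definition", "(paragraph defining a term)"),
]:
    xml_database[object_id] = to_xml_document(object_id, object_type, text)

print(xml_database["finding-1"])
```

Keeping one document per object means a single finding or definition can be fetched, updated, or linked without touching the rest of the report.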

45
4.2 Repurposing Suggestions
  • So, how is this implemented ... (continued)
  • Of course, the XML must also be designed to be
    "RDF-friendly" so it can easily be linked to an
    ontology, and not just one ontology. Besides an
    ontology about the content, you want an ontology
    for pragmatics, in other words, an ontology that
    attempts to map user behaviors to the Information
    Objects. For example:
  • User in California --> may want --> Document Info
    about California, or generically ... User in
    Location(x) --> wants --> Info About(x).
  • Of course, this assumes we can get the user's
    location. We could also expand or narrow
    Location(x) using a location ontology. There may
    be nothing on City, but something on State ... So
    we need to know that a
  • city --> partOf --> State. Or ...
  • Scientist specializing in X who looks at Table Y
    --> needs to know --> X machine specs that
    generated Table Y.
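The location rule above can be sketched as a lookup that widens the user's location through partOf links until something is found. The tiny location ontology and the document ids are hypothetical examples.

```python
# Hypothetical location ontology: each place points to the place it is partOf.
PART_OF = {"Sacramento": "California", "California": "United States"}

# Hypothetical document ids, keyed by the location they are about.
DOCUMENTS_ABOUT = {"California": ["doc-ca-1"], "United States": ["doc-us-1"]}


def broaden(location):
    """Yield the location itself, then each enclosing region via partOf."""
    while location is not None:
        yield location
        location = PART_OF.get(location)


def info_for_user(location):
    """Apply 'User in Location(x) --> wants --> Info About(x)', widening x as needed."""
    for place in broaden(location):
        if place in DOCUMENTS_ABOUT:
            return place, DOCUMENTS_ABOUT[place]
    return None, []


print(info_for_user("Sacramento"))
```

There is nothing about the city, so the rule falls back to the state, which is exactly the "nothing on City, but something on State" case.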

46
4.3 Initial Repurposing Results
  • 4.3.1. Document Structure: Table of Contents,
    Index, Title, etc.
  • 4.3.2. Finding: A short fact the document asserts
    as true, possibly through empirical evidence.
  • 4.3.3. Instruction: A tutorial on a topic.
  • 4.3.4. Terminology Definition: A definition of a
    term.
  • 4.3.5. Definition Example: A specific instance
    that illustrates that a definition is accurate
    and true.
  • 4.3.6. Process Definition: A description of a
    sequence of steps that causes an effect.
  • 4.3.7. Preservation of Structure: Data tables can
    be reused in spreadsheets, etc.
  • 4.3.8. Annotation Support: Add new information.
47
4.3.1 Document Structure: Table of Contents,
Index, Title, etc.
48
4.3.2 Finding: A short fact the document asserts
as true, possibly through empirical evidence.
49
4.3.2 Finding: A short fact the document asserts
as true, possibly through empirical evidence.
50
4.3.3 Instruction: A tutorial on a topic.
51
4.3.4 Terminology Definition: A definition of a
term.
52
4.3.5 Definition Example: A specific instance
that illustrates that a definition is accurate and
true.
53
4.3.5 Definition Example: A specific instance
that illustrates that a definition is accurate and
true.
54
4.3.6 Process Definition: A description of a
sequence of steps that causes an effect.
55
4.3.7 Preservation of Structure: Data tables can
be reused in spreadsheets, etc.
56
4.3.7 Preservation of Structure: Data tables can
be reused in spreadsheets, etc.
57
4.3.8 Annotation Support: Add new information.
58
4.4 Conversion to XML
http://www.nextpage.com/publishing/folio/folio-xml.htm
59
4.4 Conversion to XML
  • The Folio-to-XML Converter creates XML output
    from Folio Infobases. The XML is a true
    representation of the Folio content, using a
    Folio-specific schema. Additional transformations
    to convert the XML to another schema may be
    applied with the assistance of Product Services
    or a NextPage Partner.
  • Additionally, the Folio-to-XML Converter creates
    a CSS style sheet that mimics the formatting of
    the infobase. Due to limitations of CSS, there
    are some formatting options that do not render
    the same in the browser as in Folio Views.
  • Also included with the Folio-to-XML Converter are
    XSL style sheets that apply the CSS formatting to
    the XML. These style sheets handle all of the
    Folio-style links, objects, tables and general
    presentation effects.
  • Finally, the Folio-to-XML Converter creates an
    NXT-compatible .mak file for creating an NXT
    content collection directly from the Folio-to-XML
    Converter output.

60
4.4 Conversion to XML
  • Epace_2003.css (1.13 KB)
  • Epace_2003.txt (1.04 KB)
  • Epace_2003.mak (36.7 KB)
  • Epace_2003.nfo (1.42 MB)
  • Epace_2003.nxt (5.05 MB)
  • Images (2.64 MB)
  • 33 .jpeg
  • 33 .rdf
  • XML (102 files/1.65 MB)
  • 12 sections
  • Total (173 files/15 folders/10.8 MB)

61
4.4 Conversion to XML
  • Resources
  • Webinars
  • http://www.nextpage.com/news/events/index.htm
  • NextPage Knowledgebase and Product Documentation
  • http://support.nextpage.com/
  • http://docs.nextpage.com

62
4.5 Semantic Web Services Network Portal
Portal: http://www.sdi.gov/
Basic Document
63
4.5 Semantic Web Services Network Portal
Related Documents
64
4.5 Semantic Web Services Network Portal
Search, Discovery, and Exploitation Across All
Documents
65
4.6 Some Next Steps
  • A semantically enriched Children's Health Portal
    that delivers:
  • More relevant search and natural language
    queries.
  • Concept syndication.
  • Browseable taxonomy.
  • Related links.
  • Tailored views to specific user categories.
  • Specific question answering support.
  • Annotation support.
  • Topic map navigation and classification.
  • Core ontology and expert system.
  • Individual portlets delivered to a specific user
    community.

66
4.7 NCES Semantic Web Prototypes
  • 1. Military Language Understanding (MLU)
  • The ability to refine natural language queries by
    understanding the military context of the query.
    For example, understanding a Basic Encyclopedia
    (BE), understanding lat/longs, understanding
    military jargon, etc. This technology currently
    uses WordNet, acronym databases, and the Center
    for Army Lessons Learned (CALL) thesaurus to do
    the query refinement.
  • 2. Intelligent Federated Index Search (IFIS)
  • A set of standards (web services) and a system
    that uses a knowledge base and automated
    reasoning to route queries (received and refined
    from the MLU) to the appropriate index (registry
    or content repository), consolidate the results,
    and return them to the user.
  • 3. VKB Product annotation portlet
  • A portlet (portal component) that allows the
    annotation of web pages and stores the
    annotations in RDF (leverages technology from the
    W3C Annotea project).
  • 4. Product Multi-tagging
  • A portlet that will allow a document to be marked
    up in multiple markup languages simultaneously,
    where each markup language is treated as an
    "overlay" on top of the content. This will allow
    multiple perspectives on a single piece of
    content while allowing things currently
    impossible in a single XML document, like
    interlaced tags. Additionally, one layer of the
    multi-tagged documents will allow RDF assertions
    on the content to enable formal relationships
    between layers and general assertions on the
    content.

Source: Network Centric Enterprise Services (NCES),
Michael Daconta, MacDonald Bradley.
http://www.don-ebusiness.navsup.navy.mil/servlet/page?_pageid823_dadportal30_schemaPORTAL30
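The multi-tagging idea in item 4 above, markup layers treated as overlays on unchanged content, is commonly realized as standoff annotation, where each tag records character offsets instead of being embedded in the text. The sketch below illustrates that general technique under that assumption; it is not the NCES prototype, and the layer and label names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    layer: str   # which markup language or perspective the tag belongs to
    label: str   # the tag name within that layer
    start: int   # offset of the first character covered
    end: int     # offset just past the last character covered


content = "The company sells batteries."

overlays = [
    Span("grammar", "subject", 0, 11),     # "The company"
    Span("grammar", "predicate", 12, 17),  # "sells"
    Span("emphasis", "highlight", 4, 17),  # "company sells"
]


def text_of(span: Span) -> str:
    """Return the stretch of content a tag covers."""
    return content[span.start:span.end]


def interlaced(a: Span, b: Span) -> bool:
    """True when two tags cross without nesting, which one XML tree cannot express."""
    return a.start < b.start < a.end < b.end or b.start < a.start < b.end < a.end


for span in overlays:
    print(span.layer, span.label, repr(text_of(span)))
print(interlaced(overlays[0], overlays[2]))
```

Because the tags live beside the content rather than inside it, the subject and highlight tags can overlap freely, and further layers can be added without rewriting the document.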
67
5. Conclusions
  • There is an intergovernmental process for pilot
    projects and emerging technology assessments in
    support of the CIO Council.
  • Open collaboration with open standards has
    produced Government-wide reusable components.
  • XML markup and metadata and Semantic Web Services
    improve search, discovery, and exploitation and
    help build knowledge-centric organizations and
    communities.
  • Repurposing documents into Semantic Web Services
    and Networks is a new paradigm and methods and
    tools are still evolving.

68
6. Questions and Answers
  • Brand Niemann, Ph.D.
  • Computer Scientist and XML and Semantic Web
    Services Specialist.
  • Office of Environmental Information.
  • U.S. Environmental Protection Agency.
  • 14th Constitution Avenue, NW.
  • EPA West, Room 5219.
  • Washington, DC 20460
  • http://www.sdi.gov
  • http://web-services.gov
  • http://componenttechnology.org