Building an Integrated Information Service: a strategic initiative

Slides: 28
Provided by: andrew241

Transcript and Presenter's Notes

1
Building an Integrated Information Service: a strategic initiative
2
Agenda
  • Introductions
  • Problem Definition
  • Approach
  • Questions (jeers/cheers)

3
The Big Problem
  • It is increasingly difficult to find our
    stuff.
  • It's hard to find your stuff
  • It's hard for you to find my stuff
  • It's nearly impossible to be made aware of stuff
    we don't know about
  • Stuff: everything from raw instrument data to
    video clips to compiled human analysis

4
The Big Problem
  • It is increasingly difficult to understand
    which stuff is relevant.
  • In the world of decision-making support, we can't
    always anticipate where the next piece of
    required (sometimes vital) information is.

5
The Big Problem
  • It's really big
  • (1) Corpus Size & Rate of Growth.
  • NASA has a tremendous amount of data collected
    over the last 50 years.
  • The exact size and growth rate of our data
    collection are unknown
  • Employees, partners and customers are generating
    new data continuously
  • Efforts to assess the value of our data
    collection in either informational or financial
    terms are difficult
  • Neither the collection nor its growth rate is
    likely to diminish significantly in the next 5
    years
  • 13% of NASA's budget is spent supporting
    information technology.

6
The Big Problem
  • It's really complicated
  • (2) Variety of Data Sources & Types.
  • Data and information have great variety in origin,
    source and type ranging from one-of-a-kind
    instruments and software, to last-of-its-kind
    legacy systems.
  • Collection includes foundational science data and
    PowerPoint briefings
  • stored in man-made appliances and human
    experiences
  • Computer systems and instruments are diverse,
    spread out across the globe
  • in some cases, beyond
  • Data consumers are also potential producers
  • regenerating data through analysis, compilation,
    edits and emails. Nearly each instance is another
    piece of data added to our unorganized
    collection.

7
The Big Problem
  • We are not monolithic
  • (3) Customer Environment
  • For NASA, the world is our data and information
    community. Our customers vary from schoolchildren
    to university researchers. They encompass nearly
    all of the disciplines of science, engineering
    and project management. They speak many different
    natural languages accented with unique science
    nomenclatures, technical idioms and the
    contextual nuances of their own experiences.
  • Even where there is a common language, humans use
    different vocabularies and meanings, and domain
    specialists may have difficulties in conveying
    information to non-specialists.

8
The Big Problem
  • Searching alone won't help
  • (4) Discovery & Relevance
  • As the quantity and variety of data / information
    increase, it is increasingly difficult to
    find information that you or your organization
    has collected
  • Learning of related information outside of your
    own organization seems impossible, but is
    required to achieve a more complete and effective
    understanding of many of our activities. We
    cannot anticipate the exact piece of information
    we will need, but we need to be aware of it
    nonetheless. In other words, a mechanism is
    required to present data in relevant context to
    each unique situation without knowing the data
    source or potential consumer(s) in advance.

9
A bunch of us thought about this problem a whole
lot
  • Major discovery: The problem is solvable!
  • 5 of us co-authored a white paper
  • Schain, Raskin, Wilson, Keller, Truszkowski
  • Lots of early reviewers and editors
  • Big contributions from Jeanne Holm, Scott
    Glasser, John McManus, Rob Winters, Kendall
    Clark, Bijan Parsia, Jim Hendler

10
Our Options
  • First, admit that we have a problem.
  • Then, either
  • (1) We fix the data problem, or
  • (2) We don't fix the data problem.
  • We let the data problem fix itself.
  • We let someone else fix the data problem.
  • (NASA-Speak for nobody fixes it)
  • Let's just say, option 1

11
Some derived requirements (or are they
inferred?)
  • Mustn't shock the system.
  • Human, financial, or in-situ services / data
    independence
  • Reality: One size does not fit all (it's
    complex).
  • This effort is across the board! And involves
    everyone.
  • Must enable experts to effect adds/changes within
    their discipline while making results available
    for others.
  • Must scale up and be available across timelines.
  • Uniform/understood machine interfaces in a
    distributed service
  • Must provide accurate results promptly.
  • Requires machine assistance
  • Must work in a global heterogeneous environment.
  • Expressed at levels of common representation
    outside of OS or network funnies

12
How? Some Tagging
  • Linking, Expression, Extension, Relevance
  • Where information about an application, a
    service, or a dataset is available, we should tag it
  • Where it is not available, we should consider ways
    of adding it
  • Tags should be considered in terms of context,
    relationship and meaning
  • Annotate & Share
  • Provide a mechanism where subject matter experts
    add metadata or context and leave it for others
    to build on incrementally
  • Strategy must enable data and information
    customers to drive incremental metadata
    organization based on their needs.
  • Each valid construct can be left for others to
    reuse and repurpose.
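
The annotate-and-share idea on this slide can be pictured as a small shared store of (subject, predicate, object) tags that different experts add to incrementally. This is only an illustrative sketch: the TagStore class, the ex: identifiers, and the predicate names are hypothetical and do not describe any existing NASA system.

```python
from collections import defaultdict


class TagStore:
    """A toy shared store of (subject, predicate, object) tags."""

    def __init__(self):
        # subject -> set of (predicate, object, author)
        self._tags = defaultdict(set)

    def tag(self, subject, predicate, obj, author):
        """Record one tag together with who asserted it (provenance)."""
        self._tags[subject].add((predicate, obj, author))

    def about(self, subject):
        """Return all tags on a subject, sorted for stable output."""
        return sorted(self._tags[subject])

    def find(self, predicate, obj):
        """Return the subjects carrying a given predicate/object pair."""
        return sorted(
            s for s, tags in self._tags.items()
            if any(p == predicate and o == obj for p, o, _ in tags)
        )


store = TagStore()
# An instrument expert leaves a first, minimal tag.
store.tag("ex:dataset42", "ex:instrument", "ex:spectrometerA", "alice")
# A science user later builds on it with context of their own.
store.tag("ex:dataset42", "ex:topic", "ex:oceanColor", "bob")
store.tag("ex:video7", "ex:topic", "ex:oceanColor", "bob")

print(store.find("ex:topic", "ex:oceanColor"))
```

Keeping the author with each tag is one simple way to carry provenance, so later users can judge whose construct they are reusing.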

13
And Organizing
  • Once tags or mechanisms for collecting metadata
    are in place, we need to organize them
  • and assure that logic and relationships within
    applications or vital contextual instrument
    information are inserted and maintained.
  • Making it discoverable
  • Search, browse, and query
  • Machines connecting the dots, not just people
  • Leverage what we have and what we know
  • Reuse and leverage available ontologies
  • SWEET, what else you got?
  • Scan Stanford, Swoogle, etc.
  • Validate and use a library service
  • Currency and validity? Yikes!
  • Think about SNMP wrapped in a little OWL/Pellet
  • Slurp and translate existing schemas into RDF
    ontologies or OWL depending on requirements and
    opportunities
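
One way to picture "slurp and translate" is a script that walks a relational schema description and emits RDF Schema statements in N-Triples syntax. A minimal sketch, assuming a hypothetical schema dictionary and a made-up http://example.org/schema/ namespace. Mapping tables to rdfs:Class and columns to rdf:Property is one simple convention among several, and a real translation would also need datatypes and keys.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDF_PROPERTY = "http://www.w3.org/1999/02/22-rdf-syntax-ns#Property"
RDFS_CLASS = "http://www.w3.org/2000/01/rdf-schema#Class"
RDFS_DOMAIN = "http://www.w3.org/2000/01/rdf-schema#domain"
BASE = "http://example.org/schema/"  # hypothetical namespace


def schema_to_ntriples(schema):
    """Map each table to an rdfs:Class and each column to an rdf:Property."""
    lines = []
    for table, columns in sorted(schema.items()):
        cls = BASE + table
        lines.append(f"<{cls}> <{RDF_TYPE}> <{RDFS_CLASS}> .")
        for column in columns:
            prop = f"{BASE}{table}.{column}"
            lines.append(f"<{prop}> <{RDF_TYPE}> <{RDF_PROPERTY}> .")
            lines.append(f"<{prop}> <{RDFS_DOMAIN}> <{cls}> .")
    return lines


# Hypothetical legacy schema: table name -> column names.
legacy_schema = {"instrument": ["name", "platform"]}

triples = schema_to_ntriples(legacy_schema)
for line in triples:
    print(line)
```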

14
The Obvious
  • Buzzwords lose currency
  • The Semantic Web versus Semantic Web
    Technologies
  • KM (not a noun), Taxonomies, Ontologies, Web
    Services, Web2
  • This is a very big job,
  • It is a very big problem with few alternative
    solutions
  • We don't have all of the skills to do it
  • One size does not fit all; this is an integration
    effort
  • Big databases don't suck
  • Except when you try to integrate them
  • Requires careful data stewardship as well as
    ownership
  • The problem will not be solved all at once
  • But this approach will provide immediate,
    incremental, and significant improvement
  • Positions us to easily adapt to changing data
    requirements and devices.
  • Note: blue items are needed to make the problem
    solvable

15
Approach
  • Establish Leadership & Organize
  • Formal project lasts for 5 years
  • Specialized management disbands and normal
    operational support takes over if proper momentum
    is established
  • Achievable if focus is on specific areas
  • Advertising brings in important stakeholders and
    knowledge workers
  • Publish Principles, Advertise Strategies
  • Establish Infrastructure & Processes
  • Tag everything we can, even if it's just a little
    bit
  • Make it easy on folks to do it and to add to it
  • Annotate & Share
  • Each valid construct left for others to build on
  • Make it easy on folks to leave stuff behind in
    libraries
  • Establish Attractors
  • Implementing key attractor services will create
    a Network Effect

16
Leadership Tactics
  • Social Advertisement and Primers
  • Show & Tell
  • Demonstrations, examples, proofs
  • Understanding the basics
  • A big picture orientation made up of the
    smaller components
  • How do the parts fit together?
  • Parsers; Triple Stores: RDF, Kowari, RDFLib,
    3Store, Sesame; Query languages; Reasoners:
    FaCT, Racer, Pellet, Jena; Open vs. Closed World
  • Publish design principles
  • Data organization constructs (e.g. taxonomies,
    ontologies, XML schemas) must be reusable and
    available for computer systems/services. The URIs
    used for uniquely identifying these constructs
    should resolve to their respective schemas
  • Web services must be made available for reuse
    (and strategies need to be developed to identify
    service types, applications, and rules governing
    their availability)
  • Yield to the greater concept even if your focus
    is more narrow
  • Keep it simple to maximize agility and re-use
  • Leverage our existing web infrastructure.

17
Leadership Tactics
  • Publish strategic principles
  • Keep data and contextual validity close to the
    data owners and subject matter experts who care
    about it
  • Develop a strategy about data curator functions,
    providing assurance, change control, etc
  • Accept that some data constructs or services may
    not be fully mature at the outset but can be
    driven by subsequent customer use and applied
    benefit
  • Protect individual privacies as disparate systems
    become available to wider use
  • Establish a presence/partnership on standards
    bodies (the W3C Semantic Web Best Practices),
    other standards projects (PAW), and with research
    (UMD, Stanford, MIT, Southampton)
  • Conduct InterOps and Teach-ins
  • Publish target designs so that application
    developers can model their systems against a
    standard, leveraging the work that has gone
    before, and enabling fast track extension and
    integration to other systems
  • Understand, document and manage to the measurable
    success criteria for the initial, intermediate
    and longer terms of the effort.

18
Just a bit of Infrastructure
  • Enough for services to be organized,
    discoverable, and reusable.
  • KR libraries
  • Make XML, RDF, OWL, Thesauri, and Taxonomies
    available as a library service, enabling
    authorized reuse from authoritative sources
  • We will need to establish manageable,
    semantically rich official libraries for unique
    Knowledge Representations (e.g., payload
    processing constructs, human relation constructs,
    vehicle and instrument constructs)
  • We need to adopt more universal KRs constructed
    outside of NASA but certified for our use (e.g.,
    astronomy and celestial mechanics constructs,
    metallurgy, telemetry, navigation constructs,
    facilities, computers and other capital
    investment constructs)
  • Architecture should enable ownership/authorship
    and responsibility for domain experts to give
    others confidence and trust through provenance,
    currency and (more importantly) through
    successful results
  • Think SVN and Annotea
  • A process for conversion or translation of
    traditional schemas or corpora will need to be
    formalized so our repositories of
    production-worthy ontologies can grow easily.

19
A bit more infrastructure
  • Service Advertisement Repositories
  • Established for individuals to publish available
    web services that can communicate with other
    existing services. The goal is to enable our
    computers to know when a new service has come
    online, understand what it does, employ its
    functions as part of generalized tasks, and
    specify under what conditions the service can be
    used and trusted.
  • Testing with careful consideration to browsing,
    discovery, and trust capabilities.
  • Metadata Collection and KR Construction
  • Provide tools that either harvest existing
    metadata or provide computer assistance in
    asserting new metadata.
  • Efficient mechanisms for populating KBs should be
    assessed and some preliminary findings tested
    against candidate systems. Natural Language
    Processors that assist in determining likely
    metadata elements, as well as simple mechanisms
    for customers to add semantic annotations, should
    be evaluated in parallel.
  • The Drawer of Kitchen Utensils
  • Integrating existing information services.
  • GlueCode that will add metadata awareness to
    infrastructure components, instruments and
    existing applications.
  • Collaboration awareness for Wikis, DMSs,
    Del.icio.us, Workflows, Flickr, and so on.
  • Tools that will generate RDF from office-type
    applications.
  • Tools that will generate ontologies from database
    schemas.

20
Establish Attractor Services
  • The network effect describes how a service
    becomes more valuable as more and more people
    adopt it. As more services and capabilities get
    incorporated, it motivates more individuals and
    more services to participate. The more services
    we tie together, the greater the utility. The
    greater the utility, the more services get
    incorporated.
  • Linking People, Organizations, Projects, and
    Skills
  • Metadata search and inference in image
    inventories
  • Federal Enterprise Architecture and Capital
    Investments
  • Semantically-enriched Document Management
  • Integrating Science Knowledge
  • Semantically-Enabled Workflows

21
Anticipated Positive Results
  • Once established, the attractor services will be
    able to use the same KRs and KBs, making the
    overall capabilities much more powerful
  • Raises the bar for acceptable solutions
  • Expansion of current skill sets
  • Social Networks will facilitate tighter work
    environments even when geographically dispersed.

22
What has happened so far
  • JPL workshop
  • White paper is an accepted agency EA
    recommendation
  • Hounding
  • CIOs are aware, learning more, and seeking to
    support
  • Demos
  • More hounding

23
The JPL Workshop
  • The demo is a proud moment
  • LDAP feed via Python script, converted to RDF and
    stored in 3store; queries via mSpace Java code
    talking to the 3store, forked to Redfoot, passing
    a URI for each instance for browsing; plus a BS'd
    FOAF network that creates an artificial
    representation of foaf:knows
  • This was important because
  • The building blocks had not been arranged like
    this before
  • The 9 of us worked together virtually
  • Got one more to do for senior NASA managers
  • This time add a connection to a web service
  • If successful, formalization and funding to
    implement the production service
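
The first stage of the demo pipeline, a directory feed converted to RDF, could look roughly like the following. This is not the workshop's actual script: the records are hard-coded stand-ins for an LDAP query result, the http://example.org/people/ URIs are hypothetical, and the manager attribute is used here only to fabricate foaf:knows links in the spirit of the artificial network the slide mentions.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
BASE = "http://example.org/people/"  # hypothetical URI prefix

# Stand-in for entries returned by an LDAP search.
records = [
    {"uid": "asmith", "cn": "A. Smith", "manager": "bjones"},
    {"uid": "bjones", "cn": "B. Jones", "manager": None},
]


def literal(text):
    """Quote a string as an N-Triples literal (escapes backslash and quote)."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def to_foaf(entries):
    """Emit N-Triples describing each entry as a foaf:Person."""
    out = []
    for e in entries:
        person = f"<{BASE}{e['uid']}>"
        out.append(f"{person} <{RDF_TYPE}> <{FOAF}Person> .")
        out.append(f"{person} <{FOAF}name> {literal(e['cn'])} .")
        if e["manager"]:
            # Artificial link: treat the reporting line as foaf:knows.
            out.append(f"{person} <{FOAF}knows> <{BASE}{e['manager']}> .")
    return out


foaf_triples = to_foaf(records)
for triple in foaf_triples:
    print(triple)
```

The resulting lines could then be loaded into a triple store such as the 3store mentioned above; that loading step is left out because it depends on the store's own tooling.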

24
It doesn't matter that the technologies are
immature
  • We have a LOT to do and some stuff can/should be
    done now
  • Learn from doing/drive the technologies
  • Maintain linkage with you folks; establish strong
    preferences with industry
  • By the time we need the other bits, they will be
    ready too
  • Or we should engage the development or standards
    committees and drive our requirements instead of
    being dragged behind

25
More, Please
  • SWEET, SciFlo, IO, Workflows, and all the other
    stuff that is easier to integrate together!
  • FOAF, DOAP, Wikis, Triple Stores, KRs,
    Del.icio.us, Flickr, Annotea
  • Trading and Talking
  • Presents new alignment possibilities
  • Oracle, Adobe, Microsoft, Cisco, W3C
  • Think about it
  • Re-combining and combing the stuff we are already
    collecting

26
Thinking Back on the Future
  • The big question for me was, is this real?
  • Or is it: Can it be made real?
  • It is the same question for you guys.
  • Do you want to see your ideas implemented
    globally? What if you could integrate them in?
  • Important nature of this group.
  • And what are you going to do about good and evil?
  • Every time I talk about these technologies with
    people on the outside they become frightened
  • Protecting your privacy rights
  • Protecting your public work

27
Thank you