The Evolution of the Net: Predicting Global Infrastructure

Title: The Evolution of the Net: Predicting Global Infrastructure


1
The Evolution of the Net: Predicting Global Infrastructure
Bruce R. Schatz
CANIS Laboratory, Graduate School of Library and Information Science
schatz@uiuc.edu, www.canis.uiuc.edu
Department of Computer Science seminar
University of Illinois, February 14, 2005
2
Art of Physical Architecture
3
Art of Logical Architecture
4
The Evolution of the Net
  • Niels Bohr on Quantum Theory
  • Prediction is very Difficult, especially about the Future

5
THE THIRD WAVE OF NET EVOLUTION
CONCEPTS
OBJECTS
PACKETS
6
Computer Science and Infrastructure
  • Transparent Federation across Sources
  • Generic Protocols for Global Infrastructure
  • Ultimate Goal is the cyberspace vision of being one with all the world's knowledge

7
Computer Science and Infrastructure
  • 1985: Operating Systems (caching)
  • 1995: Database Management (tagging)
  • 2005: Information Retrieval (clustering)
  • 2015: Artificial Intelligence (recognizing)

8
Linguistics Levels and Universal Units
  • 1985: Syntax, Files (wholes)
  • 1995: Structure, Records (parts)
  • 2005: Semantics, Concepts (meaning)
  • 2015: Pragmatics, Features (reality)

9
Evolution of Information Retrieval
Evolution of Information Retrieval across the Net,
from Bruce R. Schatz, "Information Retrieval in Digital Libraries: Bringing Search to the Net",
cover article in Science, vol. 275, Jan 17, 1997, special issue on Bioinformatics
10
  • The 80s
  • APOLLO
  • Regional File Systems
  • DREAM
  • WorldWide Information Spaces

11
1985 Syntax Federation
  • Same Query into Multiple Sources
  • Results return Uniform Packages
  • Packets are for Bits, but Objects need more
  • Information Units are for Database Items
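
The idea of sending the same query into multiple sources and getting uniform packages back can be sketched as follows. This is a minimal illustration, not the Telesophy implementation: the `InformationUnit` fields, the `make_source` helper, and the substring matching are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class InformationUnit:
    """Uniform package returned for every hit (field names are hypothetical)."""
    source: str
    title: str
    text: str


# A source is modelled as a function from a query string to raw records.
Source = Callable[[str], List[dict]]


def make_source(records: List[dict]) -> Source:
    """Toy in-memory source: matches the query as a substring of the text."""
    def search(query: str) -> List[dict]:
        return [r for r in records if query.lower() in r.get("text", "").lower()]
    return search


def federated_search(query: str, sources: Dict[str, Source]) -> List[InformationUnit]:
    """Send the same query to every source and wrap each hit uniformly."""
    units = []
    for name, search in sources.items():
        for record in search(query):
            units.append(InformationUnit(
                source=name,
                title=record.get("title", ""),
                text=record.get("text", ""),
            ))
    return units
```

The caller sees only `InformationUnit` objects, whatever shape each source uses internally.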

12
1985 Technology Environment
  • CMU Computer Science Andrew
  • Apollo Domain distributed file system
  • Xerox Star multimedia document system
  • Bellcore Network Systems Fibers
  • Telenet International Packet Switches
  • Dialog Bibliographic Text Searches

13
Telesophy Prototype
  • Distributed Documents
  • Distributed Collections
  • Multimedia Documents
  • Networked Hypertext
  • Document Browsing (links across sources)
  • Document Search (texts across sources)

14
Telesophy Session
15
Telesophy Implementation
  • Bitmapped Workstation with Custom Software
  • 30K Apollo with 10Mb/s WAN
  • Windows via Brown hypertext
  • Objects via Xerox Smalltalk
  • Information Units and Data Items
  • 300K Units across 20 sources
  • Bellcore R&D, 2.5M, 1984-1988

16
Operating System Research
  • Browsing requires Caching across Internet
  • Raw bandwidth insufficient
  • 200ms Ping versus 250ms Saccade
  • Lookahead Application-Specific Protocols
  • 1987 Internet Research Task Force
  • 1989 ARPANET 20th Anniversary
  • 1990 Dissertation on Interactive Retrieval
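
Since a 200ms round trip uses up most of a 250ms saccade, one way to keep browsing interactive is to fetch linked documents before the user asks for them. The sketch below shows a one-step lookahead cache. It is an illustration under assumptions: the `fetch` and `links` callbacks are hypothetical, and a real system would prefetch in the background rather than inline.

```python
from typing import Callable, Dict, Iterable


class LookaheadCache:
    """Cache that prefetches the documents linked from each requested document."""

    def __init__(self,
                 fetch: Callable[[str], str],
                 links: Callable[[str], Iterable[str]]):
        self._fetch = fetch          # slow remote fetch (hypothetical callback)
        self._links = links          # ids of documents linked from a document
        self._cache: Dict[str, str] = {}
        self.remote_fetches = 0      # counts trips across the network

    def _load(self, doc_id: str) -> str:
        """Fetch remotely only when the document is not already cached."""
        if doc_id not in self._cache:
            self._cache[doc_id] = self._fetch(doc_id)
            self.remote_fetches += 1
        return self._cache[doc_id]

    def get(self, doc_id: str) -> str:
        """Return a document, then look ahead one link so the next click is local."""
        body = self._load(doc_id)
        for linked in self._links(doc_id):
            self._load(linked)
        return body
```

After `get("a")`, following any link out of `a` is served from the cache with no further network round trip.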

17
  • The 90s
  • NETSCAPE
  • WorldWide Information Spaces
  • DREAM
  • Structured Ranked Search

18
1995 Structure Federation
  • Search using Parts of Documents
  • Transparent merge of different Schemas
  • Results return Complete Displays
  • Displayers invoked for all types

19
1995 Technology Environment
  • NCSA and the World-Wide Web
  • Mosaic multimedia document browsing
  • HTTP standard query protocol
  • University Library and Online Retrieval
  • Ovid full-text journal searching
  • SGML standard document protocol

20
DeLIver System
  • Full Distributed Documents
  • Full Displays with tables and equations
  • Distributed Collections from publishers
  • Single Federated Collection
  • Streamlined search using tag structure
  • Canonical tag schema with translation
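
A canonical tag schema with translation can be sketched as a per-publisher mapping applied before search. The publisher names and tag names below are invented for illustration; they are not the DeLIver DTD.

```python
from typing import Dict, List

# Hypothetical publisher-specific tag names mapped onto one canonical schema.
TAG_MAPS: Dict[str, Dict[str, str]] = {
    "physics_publisher": {"au": "author", "ti": "title", "abs": "abstract"},
    "math_publisher": {"creator": "author", "heading": "title", "summary": "abstract"},
}


def to_canonical(publisher: str, record: Dict[str, str]) -> Dict[str, str]:
    """Rename a record's tags to canonical names; unmapped tags are kept as-is."""
    mapping = TAG_MAPS[publisher]
    return {mapping.get(tag, tag): value for tag, value in record.items()}


def search_by_tag(collection: List[Dict[str, str]], tag: str, term: str) -> List[Dict[str, str]]:
    """Search one canonical tag across an already-translated collection."""
    return [r for r in collection if term.lower() in r.get(tag, "").lower()]
```

Once every record is translated, a single query on the canonical `author` tag reaches articles from both publishers.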

21
DeLIver Session
22
DeLIver Implementation
  • Desktop PC plus Custom Software Integration
  • 5K IBM Personal Computer
  • Mosaic via NCSA hypertext
  • Displays via SoftQuad viewers
  • Custom DTD and SSL for tags and styles
  • 100K articles for 3000 users
  • NSF DLI, 5M 1994-1998

23
Database Management Research
  • Metadata Extraction for Structure Federation
  • Raw schema insufficient
  • Different names and different types
  • Author tags in physics vs mathematics
  • 1995 interactive databases using Mosaic
  • 1997 Beat Elsevier using canonical tags
  • 1999 production distributed XML federation

24
  • The 00s
  • GOOGLE
  • Structured Ranked Search
  • DREAM
  • Semantic Concept Switching

25
2005 Semantic Federation
  • Search using Concepts above Words
  • Extraction of Concepts from Documents
  • Statistical Index on Community Collections
  • Concept Navigation across Collections

26
2005 Technology Environment
  • Web Portals and statistical NLP
  • Google statistical linked contexts
  • NLP statistical generic parsers
  • Fast Processors and Big Disks
  • Gigaflops Beowulfs and cluster computing
  • Terabytes RAIDs and literature scaling

27
BeeSpace System
  • Fully Parsed Documents
  • Concepts and Entities auto generated
  • Distributed Collections from communities
  • Fully Related Concepts
  • Switching across Community Repositories
  • Automatic Links to Entity Databases

28
BeeSpace Session
29
BeeSpace Implementation
  • Commodity PC plus Custom Software
  • 1K Dell Personal Computer
  • 15K Server 1 Gflops 2 TBytes
  • Semantic Indexing generic scalable
  • Concept Extraction and Normalization
  • Concept Co-occurrence on Collections
  • 50M articles across 50K repositories

30
Information Retrieval Research
  • Statistical Clustering Equivalent Phrases
  • Raw phrases insufficient
  • Phrase parsing with normalization
  • Entity recognition with normalization
  • 1998 semantic indexing
  • (concepts from terms)
  • 1999 information spaceflight
  • (categories from documents)
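
Concept co-occurrence on a collection can be sketched as counting how often two normalized phrases appear in the same document. This is a simplified stand-in for semantic indexing: the `normalize` step here only lowercases and collapses whitespace, and raw counts replace whatever statistical weighting the actual system used.

```python
from collections import Counter
from itertools import combinations
from typing import Iterable, List, Tuple


def normalize(phrase: str) -> str:
    """Crude normalization (an assumption): lowercase and collapse whitespace."""
    return " ".join(phrase.lower().split())


def cooccurrence(docs: Iterable[List[str]]) -> Counter:
    """Count, for each pair of concepts, the documents that contain both."""
    counts: Counter = Counter()
    for phrases in docs:
        concepts = sorted({normalize(p) for p in phrases})
        for a, b in combinations(concepts, 2):
            counts[(a, b)] += 1
    return counts


def related(concept: str, counts: Counter, top: int = 3) -> List[Tuple[str, int]]:
    """Return the concepts that co-occur most often with the given concept."""
    scores: Counter = Counter()
    for (a, b), n in counts.items():
        if a == concept:
            scores[b] += n
        elif b == concept:
            scores[a] += n
    return scores.most_common(top)
```

The `related` list for a concept is a small slice of a concept space: the neighbours a user could navigate to from that concept.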

31
CONCEPT SPACES
  • from Objects to Concepts
  • from Syntax to Semantics
  • Infrastructure is Interaction with Abstraction

Internet is packet transmission across computers
Interspace is concept navigation across repositories
32
LEVELS OF INDEXES
33
Technology Trends
  • IEEE Computer for January 2002
  • Information Infrastructure for Trends issue
  • Document Representation (Semantic Web)
  • Language Parsing (TIPSTER)
  • Statistical Indexing (TREC)
  • Peer-Peer Networking (SETI@home)
  • Vocabulary Switching (UMLS)

34
SCALABLE SEMANTICS
  • Automatic indexing
  • Domain-Independent indexing
  • Statistical clustering
  • Compute Context of
  • concepts within documents
  • documents within repositories

35
COMPUTING CONCEPTS
  • 1992: 4,000 (molecular biology)
  • 1993: 40,000 (molecular biology)
  • 1995: 400,000 (electrical engineering)
  • 1996: 4,000,000 (engineering)
  • 1998: 40,000,000 (medicine)
36
SIMULATING A NEW WORLD
  • Obtain discipline-scale collection
  • MEDLINE from NLM, 10M bibliographic abstracts
  • human classification Medical Subject Headings
  • Partition discipline into Community Repositories
  • 4 core terms per abstract for MeSH classification
  • 32K nodes with core terms (classification tree)
  • Community is all abstracts classified by core term
  • 40M abstracts containing 280M concepts
  • concept spaces took 2 days on NCSA Origin 2000
  • Simulating World of Medical Communities
  • 10K repositories with > 1K abstracts (1K w/ > 10K)
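
The partitioning step can be sketched as grouping abstracts by each of their core terms. With 10M abstracts and 4 core terms each, every abstract lands in about four communities, which would give roughly 40M abstract placements; reading the 40M figure above this way is an assumption. The function names and the toy data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set


def partition_by_core_term(abstracts: Dict[str, List[str]]) -> Dict[str, Set[str]]:
    """Map each core term to the set of abstract ids classified under it."""
    repositories: Dict[str, Set[str]] = defaultdict(set)
    for abstract_id, core_terms in abstracts.items():
        for term in core_terms:
            repositories[term].add(abstract_id)
    return dict(repositories)


def repositories_larger_than(repositories: Dict[str, Set[str]], n: int) -> Set[str]:
    """Return the core terms whose community repository holds more than n abstracts."""
    return {term for term, ids in repositories.items() if len(ids) > n}
```

An abstract with several core terms is counted once in each community, so the total number of placements exceeds the number of distinct abstracts.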

37
COMMUNITY PROCESSING
38
INTERSPACE NAVIGATION
  • Semantic Indexes for Community Repositories
  • Navigating Abstractions within Repository
  • concept space category map
  • Interactive browsing by Community experts
  • www.canis.uiuc.edu/interspace-prototype

39
Interspace Remote Access Client
40
Navigation in MEDSPACE
  • For a patient with Rheumatoid Arthritis
  • Find a drug that reduces the pain (analgesic)
  • but does not cause stomach (gastrointestinal) bleeding

Choose Domain
41
Concept Search
42
Concept Navigation
43
Retrieve Document
44
Navigate Document
45
Retrieve Document
46
Concept Navigation
48
SWITCHING
  • In the Interspace
  • each Community maintains its own repository
  • Switching is navigating Across repositories
  • use your vocabulary to search another specialty

49
CONCEPT SWITCHING
  • Concept versus Term
  • set of semantically equivalent terms
  • Concept switching
  • region to region (set to set) match
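
Treating a concept as a set of equivalent terms, a set-to-set match can be sketched by scoring the overlap between a source concept and each concept in another repository. Jaccard overlap is used here purely as an illustrative choice; the slides do not say which matching measure the Interspace prototype uses, and the example concept names are invented.

```python
from typing import Dict, List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Overlap between two term sets: |intersection| / |union|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def switch_concept(source_region: Set[str],
                   target_space: Dict[str, Set[str]]) -> List[Tuple[str, float]]:
    """Rank the target repository's concepts by overlap with the source concept."""
    scored = [(name, jaccard(source_region, terms))
              for name, terms in target_space.items()]
    matches = [s for s in scored if s[1] > 0]
    # Highest overlap first; ties broken alphabetically for a stable order.
    return sorted(matches, key=lambda s: (-s[1], s[0]))
```

A user can then search another specialty's repository starting from the best-matching concept rather than from a single literal term.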

50
Biomedical Session
51
Categories and Concepts
52
Concept Switching
53
Document Retrieval
54
THE NET OF THE 21st CENTURY
  • Beyond Objects to Concepts
  • Beyond Search to Analysis
  • Problem Solving via Cross-Correlating Multimedia Information across the Net
  • Every community has its own special library
  • Every community does semantic indexing
  • The Interspace approximates Cyberspace

55
  • The 10s
  • Interspace
  • Semantic Concept Switching
  • DREAM
  • Continuous Feature Monitors

56
2015 Pragmatics Federation
  • Beyond Words and Concepts to Reality
  • Feature Vectors describing Situation
  • Each Individual has Vector (< Community)
  • Discrete Samples into Continuous Monitors

57
2015 Technology Environment
  • Continuous Vector Recording
  • Health Grid personal lifestyle monitors
  • Peer-to-Peer beyond Napster and Amazon
  • Individual User Modeling
  • Cohort Grouping custom clustering
  • Adaptable Interfaces multiple levels

58
Lifestyle Monitor System
  • Continuous Monitoring
  • Adaptive Questionnaires full-spectrum
  • Distributed Collections from individuals
  • Situational Analysis
  • Structured Vectors custom for Individuals
  • Population Cohorts for Decision Support

59
Lifestyle Monitor Questions
Sample General Health Questions for User
Modeling
60
Lifestyle Monitor Session
61
Artificial Intelligence Research
  • Structured Vectors Individual customized
  • Raw concepts insufficient
  • Adaptive Concepts for individual situations
  • Structured Vectors for cohort clustering
  • Situational Analysis infrastructure support
  • 2007 Internet Health Monitors prototypes
  • 2011 Population Health Monitors for chronic illness regionally deployed

62
THE DISTRIBUTED WORLD
  • Community Repositories in the Interspace
  • Peer to Peer Networking Infrastructure
  • Every Person performs Every Role

USER request
LIBRARIAN reference
INDEXER classify
PUBLISHER quality
AUTHOR generate
63
FEATURE VECTORS
  • from Concepts to Features
  • from Semantics to Pragmatics
  • Infrastructure is Interaction with Abstraction

Interspace is concept navigation across repositories
Intermind is feature comparison across individuals
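
Feature comparison across individuals can be sketched as a similarity measure between feature vectors, with each individual assigned to the closest cohort. Cosine similarity and the three-feature vectors below are assumptions chosen for illustration; the slides do not specify a measure or a feature set.

```python
import math
from typing import Dict, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two feature vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def nearest_cohort(individual: Sequence[float],
                   cohorts: Dict[str, Sequence[float]]) -> str:
    """Return the name of the cohort whose representative vector is most similar."""
    return max(cohorts, key=lambda name: cosine(individual, cohorts[name]))
```

Here each cohort is summarized by a single representative vector; a fuller treatment would cluster the individual vectors to discover the cohorts first.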
64
Towards the Intermind
  • Beyond Concepts to Features
  • Beyond Analysis to Synthesis
  • Problem Solving via Cross-Correlating Universal Knowledge across the Net
  • Every individual has its own special vector
  • Every viewpoint does semantic clustering
  • The Intermind is true Cyberspace

65
Today the Hive, Tomorrow the HiveMind