Title: Introducing: The Cyc Foundation
1Introducing: The Cyc Foundation
2Motivations
Wikimedia Foundation: Imagine a world in which
every single person is given free access to the
sum of all human knowledge. That's what we're
doing.
Cyc Foundation: Imagine a world in which every
single person is given free access to programs
that reason with the sum of all human knowledge.
That's what we're doing.
3Topic Map Top Level
4Cyc Reasoning System
Knowledge Users
User Interface (with Natural Language Dialog)
Other Applications
Knowledge Authors
Cyc API
Knowledge Entry Tools
Cyc
Reasoning Modules
Cyc Ontology Knowledge Base
Interface to External Data Sources
External Data Sources
Data Bases
Web Pages
Text Sources
Other KBs
5Help Find Information by Inference (KB)
Query: "Someone happy"
Caption: "A man watching his daughter take her first step"
6Help Find Information by Inference (KB)
(?x) (feelsEmotion x Happiness Positive)
Query: "Someone happy"
Caption: "A man watching his daughter take her first step"
Logical Inference (deduction)
(?x,y) (and (father x y) (gender x Female) (sees
x y) (walking
7Help Find Information by Inference (KB)
(?x) (feelsEmotion x Happiness Positive)
. . .
(implies
  (and (isa ?BIG-EVENT HumanLifecycleMilestone)
       (doneBy ?BIG-EVENT ?CHILD)
       (sees ?PARENT ?BIG-EVENT)
       (children ?PARENT ?CHILD))
  (holdsIn ?BIG-EVENT
    (feelsEmotionTypeAtLevel ?PARENT (PositiveAmountFn Pride))))
Logical Inference (deduction)
. . .
(?x,y) (and (father x y) (gender x Female) (sees
x y) (walking
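A minimal sketch of how one rule of this shape could be applied to a handful of facts, written in Python rather than CycL. It is not Cyc's inference engine. The constants Man01, Daughter01 and FirstStep01 are hypothetical, and the nested holdsIn / feelsEmotionTypeAtLevel conclusion is flattened into a single feelsEmotion tuple to keep the example short.

```python
# Toy deduction: match a conjunction of patterns against ground facts.
# Assumption: facts and patterns are flat tuples of strings; "?X" marks a variable.

def match(pattern, fact, bindings):
    """Unify one pattern with one ground fact; return extended bindings or None."""
    if len(pattern) != len(fact):
        return None
    new = dict(bindings)
    for p, f in zip(pattern, fact):
        if p.startswith("?"):
            # Bind the variable, or check it against its existing binding.
            if new.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return new


def solve(patterns, facts, bindings=None):
    """Yield every set of bindings that satisfies all patterns together."""
    bindings = bindings or {}
    if not patterns:
        yield bindings
        return
    for fact in facts:
        extended = match(patterns[0], fact, bindings)
        if extended is not None:
            yield from solve(patterns[1:], facts, extended)


def apply_rule(antecedent, consequent, facts):
    """Instantiate the consequent once for each way the antecedent holds."""
    return [tuple(b.get(term, term) for term in consequent)
            for b in solve(antecedent, facts)]


# Hypothetical facts describing the captioned photo.
FACTS = [
    ("isa", "FirstStep01", "HumanLifecycleMilestone"),
    ("doneBy", "FirstStep01", "Daughter01"),
    ("sees", "Man01", "FirstStep01"),
    ("children", "Man01", "Daughter01"),
]

ANTECEDENT = [
    ("isa", "?BIG-EVENT", "HumanLifecycleMilestone"),
    ("doneBy", "?BIG-EVENT", "?CHILD"),
    ("sees", "?PARENT", "?BIG-EVENT"),
    ("children", "?PARENT", "?CHILD"),
]
# Simplified conclusion (assumption): parent feels Pride during the event.
CONSEQUENT = ("feelsEmotion", "?PARENT", "Pride", "?BIG-EVENT")

derived = apply_rule(ANTECEDENT, CONSEQUENT, FACTS)
print(derived)
```

The derived tuple is what a query for "someone happy" could then match against, which is how the caption gets retrieved without containing the word "happy".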
8ALICE
- JUDGE11 [9:26:0] I WILL ASK YOU SOME QUESTIONS
- PROGRAM [9:26:0] Do you think your plan will succeed?
- JUDGE11 [9:26:16] WHAT IS THE COLOR OF A BLUE CAR
- PROGRAM [9:26:16] WHAT COLOR IS a blue car
- JUDGE11 [9:26:35] WHERE IS SUES NOSE WHEN SUE IS IN HER HOUSE
- PROGRAM [9:26:35] Where it belongs. Try searching the World Wide Web.
- JUDGE11 [9:26:49] WHAT HAPPENS TO AN ICE CUBE IN A HOT DRINK
- PROGRAM [9:26:49] Try it.
- JUDGE11 [9:27:25] HOW IS THE FATHER OF ANDY S MOTHER RELATED TO ANDY
- PROGRAM [9:27:25] Fine as far as I know.
9Natural Language Understanding requires having lots of knowledge
- 1. The pen is in the box. The box is in the pen.
- 2. The police watched the demonstrators
- because they feared violence.
- because they advocated violence.
10System: Ready for input.
User: what do you know about anthrax?
System: By anthrax do you mean (1) the heavy metal band or (2) bacteria?
User: 2
System: Here is what I know about it:
Anthrax is a species of organism. All anthrax is a bacterium. All anthrax can lethally infect mammals. Anthrax is extremely toxic for people to inhale. Iraq has biological weapons that deliver anthrax. Anthrax is opaque, tangible, microscopic, ... Anthrax can be dried into spores stored indefinitely.
11System: Ready for input.
User: what do you know about TR1?
System: I know nothing about it. What sort of thing is TR1?
User: bacteria
System: TR1 lethally infects _________?
User: People
System: By People, I assume you mean human beings, not People Magazine.
System: Does it infect some broader category of animals (e.g., air-breathing vertebrates, chordates, placental mammals, primates, eucaryotes, etc.)?
User: mammals
12System: Is TR1 similar to anthrax?
User: yes
System: Which of the following are true of TR1 . . .
User: Al Qaida possesses some TR1.
System: Ready for input.
User: Does Osama Bin Laden control anything that can be the basis of weapons of mass destruction?
System: Yes, some TR1.
User: please explain
System: Osama Bin Laden controls Al Qaida. Al Qaida possesses some TR1. TR1 is a bacteria that lethally infects people.
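The "please explain" step in the dialog can be sketched as a query that keeps the facts it used and reports them as a justification. This is only an illustration in Python, not how Cyc stores or explains its answers. The relation names, and the assumption that "lethally infects people" is what qualifies something as a basis for weapons of mass destruction, are made up for the example.

```python
# Toy knowledge base (hypothetical relation names, facts taken from the dialog).
CONTROLS = {("OsamaBinLaden", "AlQaida")}
POSSESSES = {("AlQaida", "TR1")}
LETHALLY_INFECTS = {("TR1", "People")}


def controlled_wmd_bases(agent):
    """Return (thing, justification) pairs for things the agent indirectly controls.

    Assumption: an agent controls whatever an organization it controls possesses,
    and anything that lethally infects people counts as a possible WMD basis.
    """
    results = []
    for controller, org in sorted(CONTROLS):
        if controller != agent:
            continue
        for owner, thing in sorted(POSSESSES):
            if owner != org:
                continue
            for pathogen, victim in sorted(LETHALLY_INFECTS):
                if pathogen == thing and victim == "People":
                    # Record each supporting fact so the answer can be explained.
                    results.append((thing, [
                        f"{agent} controls {org}.",
                        f"{org} possesses some {thing}.",
                        f"{thing} lethally infects {victim}.",
                    ]))
    return results


answers = controlled_wmd_bases("OsamaBinLaden")
for thing, justification in answers:
    print("Yes, some", thing)
    for step in justification:
        print("  ", step)
```

Keeping the supporting facts next to each answer is the design point: the explanation is a by-product of the inference rather than something generated afterwards.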
22Efficiency vs. Expressiveness
Continuing improvements in inference performance won't negatively affect expressiveness.
Use two cooperating languages (EL and HL) to escape the limitations of an age-old tradeoff.
[Diagram: languages plotted on Efficiency and Expressiveness axes: C, PASCAL, LISP, First-order logic, nth-order logic, English, German, HL (heuristic level language), EL (epistemological level language)]
23NOW CyN in Doom3 (2005)
24BURC: Bootstrapping Using ResearchCyc
- Goal: To extend Cyc's knowledge base using relationships implied to be possible, normal or commonplace in the world
- Prior work with Cyc knowledge entry has been manually oriented
- How will we collect common sense without a body and manual labor?
- Read, Parse, Mine!
- Proposal: Read text, Parse into a database, Extract relations between words, Propose hypothetical relations between concepts
25BURC Basic Analogy
- The Shotgun approach to the Human Genome
- Extract millions of fragments
- Knit them back together by finding commonalities
- Will it work for the Human Memome?
- James Burke: Mr. Connections
Lenat's Bootstrap Hypothesis: once Cyc reaches a certain level/scale, it can help in its own development and start using NLP to augment its knowledge base
26Mining Adjective Knowledge Example
- "white blouse" as factoid fragment
- Hypothesis: (plausibleValueOfType Blouse mainColorOfObject WhiteColor)
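A rough sketch of the mining step, assuming the parser has already reduced the text to (adjective, noun) fragments. The two lookup tables and the min_count threshold are hypothetical stand-ins for whatever BURC actually uses to link words to Cyc terms and to filter rare fragments.

```python
from collections import Counter

# Hypothetical word-to-Cyc-term mappings; a real system would look these up in the KB.
NOUN_TO_COLLECTION = {"blouse": "Blouse"}
ADJ_TO_VALUE = {"white": ("mainColorOfObject", "WhiteColor")}


def propose_hypotheses(fragments, min_count=2):
    """Turn repeated adjective-noun fragments into candidate CycL hypotheses.

    Fragments seen fewer than min_count times (an assumed threshold) or using
    words with no known Cyc term are skipped rather than guessed at.
    """
    counts = Counter((adj.lower(), noun.lower()) for adj, noun in fragments)
    hypotheses = []
    for (adj, noun), seen in counts.most_common():
        if seen < min_count:
            continue
        if noun not in NOUN_TO_COLLECTION or adj not in ADJ_TO_VALUE:
            continue
        predicate, value = ADJ_TO_VALUE[adj]
        hypotheses.append(
            f"(plausibleValueOfType {NOUN_TO_COLLECTION[noun]} {predicate} {value})"
        )
    return hypotheses


fragments = [("white", "blouse"), ("White", "blouse"), ("quick", "blouse")]
hypotheses = propose_hypotheses(fragments)
print(hypotheses)
```

The output is only a proposal: each hypothesis would still need validation before it is asserted into the knowledge base.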
27Flow of Processing
[Diagram: BNC Data; Parsers 1-5, each with its own Frag File; Merged Frag File; Extractor / DB Manager; Fragments DB; Hypothesis File; Cyc/RCyc Link]
294
30"The driver of the power of intelligent systems
is the knowledge the systems have about their
universe of discourse, not the sophistication of
the reasoning process the systems employ. Cyc has
not only the world's largest knowledge base, but
the best represented from a technical point of
view." -- Ed Feigenbaum, inventor of the first expert
system, editor of the AI Handbook
31"People have silly reasons why computers don't
really think. The answer is we haven't programmed
them right; they just don't have much common
sense. There's been only one large project to do
something about that, that's the famous Cyc
project." -- Marvin Minsky
32How has Cycorp done?
- 20 years
- 3 million facts and rules (hand-entered)
- Compelling demos
- Some applications (constrained by business model)
- The basis for much greater growth
- "If the right way to build an A.I. involves giving Cyc away for free, that is what we will do." - Doug Lenat (repeatedly)
- Note: Jury is out on what the right way is
33Cycorp True to its Promise
- OpenCyc
- The entire Cyc structural ontology FREE
- 300,000 concept terms, 2M facts and rules
- ResearchCyc
- Equal to Full Cyc (w/ Research-only license)
- Source code for inference engine not released
- API with 18,000 functions and macros!
- Ability to compile in your own additions
- Q: Will more be released? A: It depends.
- Cycorp must financially support its own R&D.
- Existing releases must result in major project
benefits.
34Time for the Next Phase
- Cycorp has gotten us to where we are
- Representational ability
- Inference ability
- and will continue (R&D leader, commercialization)
- The rest of the world will help get us where we are going
- Breadth of content
- Broad real-world diffusion
The thinking that got us to where we are today is
insufficient to solve the problems that exist
today. To solve today's problems requires a new
level of thinking. -- Einstein
35Building Cyc qua Engineering Task
learning by discovery
learning via natural language
1984
2004
2006
codify & enter each piece of knowledge, by hand
CYC
750 person-years 21 realtime years 75 million
36Building Cyc qua Engineering Task
10 years
1984
2004
2006
codify & enter each piece of knowledge, by hand
1000 years
CYC
750 person-years 21 realtime years 75 million
37How will we get the knowledge?
Games That Matter!
38Foundation as Continuation
- Are we trying to make an A.I.?
- No.
- Are we trying to make computers behave much more intelligently?
- Yes!
39Mission (DRAFT)
The Cyc Foundation has been formed as an
independent not-for-profit organization to
hasten the arrival of intelligent tools that
will help humanity.
40Assumptions
- (Currently) 9 ideas that shape strategy, objectives and policy
- These may need to be validated, modified or augmented
- In some cases, assumptions are followed by related policy
41Assumption 1
Long before computers are as smart as people,
they will be (in some cases already have been)
put to use to cure disease, address hunger
problems, make important new scientific
discoveries and help people work together.
Smarter computers will do a better job of this.
42Assumption 2
Cycorp has developed and cared for what we
believe is an important piece of the AI
puzzle. They have always wanted to release it to
the public, but it had to be when people could
realistically develop it further on their own
without in some way endangering the project. One
fear was forking, or creating incompatible
variants of the knowledge base. Cycorp and The
Foundation will cooperate on 1 KB.
43Flow of Cyc Data
Cyc Foundation
Cycorp
RCyc User
Team - Subject-matter expert - Ontologist
Gamer / Wikipedia user
44Assumption 3
The knowledge that will give computers
human-like intelligence ultimately needs to be
free. That's our best hope of having it put to
best use. Portions of knowledge will always be
held proprietary. The more shared a piece of
knowledge, the greater will be the force pulling
all of its representations toward freedom (to
avoid the burden of maintaining a non-standard
representation).
45Assumption 4
Proposed Semantic Web standards (such as those
related to OWL) are an important step in the
right direction, because they provide a
foundation for working with meaning on the
Web. The Cyc ontology will be a valuable
addition, because it can act as a semantic hub,
allowing us to have shared meaning. There is some
concern that a top-down central ontology will
dictate use of terms that may not meet a
project's needs. We will be able to show that use
of the Cyc ontology can satisfy both needs and
will be a useful complement to the great work
that has already been done toward the Semantic
Web.
46Assumption 5
We all have something to learn. We all have
something to teach. The Foundation mission will
benefit from a very broad base of support, rather
than the traditional rule by the technical elite.
47Assumption 6
For this effort, focused work by many will be
more valuable than genius work by a few. To be
most helpful, people should work together, and on
tasks where they are capable of contributing
successfully. (Example: don't go off and try to
solve the A.I. problem by yourself.)
48Assumption 7
Regular humans can be turned off by overly
technical talk that is out of place, and rightly
so. We need to be inclusive in our language and
in our activities in order to ensure the broadest
base of support and participation. This is
especially true in the Cyclify initiative.
49Assumption 8
- There is no "us" and "them"
- The Foundation is managed by its volunteer board and run by its volunteer members
- The Foundation will start with no employees
- There will be no BDFL (Benevolent Dictator for Life)
50Assumption 9
- Fun is mandatory!
- By comparison, contributing to SETI is like cleaning your oven while you sleep.
- This work will be hands-on, compelling and (hopefully) addictive.
- If you're not having fun, find out why and fix it.
51Foundation Goals
Cyclify
- Convert human knowledge to a form that computers can reason with
- Grow the Cyc Ontology and KB Exponentially
- Establish a standard vocabulary and language for representing concepts and knowledge
- Support the creation of intelligent tools
- Promote free and efficient knowledge transfer
52Cyclify Knowledge Collection Activities
- Web Games
- Validate acquired knowledge
- Multiple-choice fact entry
- More?
- Wikipedia Linking
- KR Dating Service
- Wiki-based knowledge entry
- A SME paired with an ontologist
- WordNet Linking
53Playflow Within Cyclify
RCyc User
Cycorp K. Acquisition Data
Wikipedia Data
GameServer
Wiki Knowledge Server
RCyc
RCyc
Team - Subject-matter expert - Ontologist
Gamer
Wikipedia user
54I'm thinking of a sentence
Status: I have 2 answers
Fibromyalgia is caused by ticks.
True
False
Don't Know
Doesn't make sense
Because I read about it on the web.
Score 24
55Status: I think this sentence is probably not right
Submitting...
Thank you! Answers: 2. You agreed with: 100. I now have a better understanding of "Fibromyalgia is caused by ticks." Score 2
Next
Score 26
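The feedback screen reports how many answers a sentence has and how far the player agreed with them. A minimal sketch of that bookkeeping follows, under stated assumptions: the function name, the returned fields, and the choice to count the player's own vote among the answers are all hypothetical, and the game's actual scoring rule is not given in the slides.

```python
from collections import Counter

# The four answer buttons shown in the game.
CHOICES = ("True", "False", "Don't Know", "Doesn't make sense")


def summarize(votes, my_vote):
    """Summarize the votes on one sentence from one player's point of view.

    votes   -- every answer recorded so far, including my_vote (assumption)
    my_vote -- the answer the current player just submitted
    """
    if my_vote not in CHOICES or any(v not in CHOICES for v in votes):
        raise ValueError("unknown answer choice")
    if not votes:
        return {"answers": 0, "agreed_pct": 0, "consensus": None}
    counts = Counter(votes)
    # Share of all recorded answers that match this player's answer.
    agreed_pct = round(100 * counts[my_vote] / len(votes))
    consensus, _ = counts.most_common(1)[0]
    return {"answers": len(votes), "agreed_pct": agreed_pct, "consensus": consensus}


summary = summarize(["False", "False"], "False")
print(summary)
```

With two "False" answers the consensus is that the sentence is probably not right, which lines up with the status message on the slide.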
56Current Architecture
computer (inside)
computer (outside)
Cyc Image
Applet
Applet
GAFs: web gathered, hypothesized, asserted
Applet
Forward rules
Question Server (java)
KAGs
PostGRES database
SubL form, running KAG-collecting query
Populator (java)
XML file
XML file
scp
DMZ Boundary
57Cyc Foundation Projects
- Nonprofit Formation (planning/budgeting/filing)
- Foundation Website
- Cyclify
- Fundraising
- Membership management
- Events
- ResearchCyc
- Recommend Cyc features / functions / design
- Help with ResearchCyc testing, documentation
58Budgeting
- Must develop budget related to Year 1 plan
- Possible areas of spending
- Legal filings
- Server hosting
- W3C membership
- Conference attendance
- Fundraising
59Foundation Website
- Requirements
- Content management features
- Collaboration features
- Out-of-the-box ease of use
- Free
- Currently evaluating Joomla (Mambo)
- Desired launch May 15
60Cyclify Projects
- First Web Game
- Develop game
- Viral marketing
- Add wiki linking activity
- Wiki Knowledge Collection
- Set up wikip.cyclify.org
- Add frame for ontologizing
- Feed wikip links to Web game
- Back End
- Design and implement PlayFlow
- Submit collected knowledge to Cycorp
61Fundraising
- Individual Memberships
- Free membership for first 6 months for Cyclify members and ResearchCyc users?
- How much?
- What do you get?
- Corporate Donations
- Need to prepare story
- Seems feasible to get donations
62What does nonprofit mean?
- Cannot have investors or disburse earnings
- Can have earnings, though
- Revenues must come from services that are within mission
- 501(c)(3)? (like Wikimedia Foundation)
- Or 501(c)(6)? (like Eclipse Foundation)
63The Foundation Board of Directors
Name
Position
Role
64The Foundation Membership
65ResearchCyc Users
Government-related
Xerox PARC
Government
Commercial
Language Computer Corporation
ANSER, Inc.
Air Force Rome Labs
NTT Communications Science Laboratories (Japan)
Stones Throw Technologies
21st Century Technologies
SRI
Houston VA Medical Center
ISI
Austin Info Systems
Fraunhofer Institute
Daxtron Labs
Lockheed Martin ATLD
Sapio Systems (Denmark)
U of Illinois Urbana-Champaign
Terra Incognita
University
U of Maryland
MIT Media Lab
Stanford NLP Dept.
Trimtab Consulting
Northwestern U
TNO-DMV (Netherlands)
U of Pennsylvania
Rensselaer AI and Reasoning Lab
Microfabrica, Inc.
Knowledge Media Institute, Open University
LBJ School of Public Affairs
New Mexico Highlands Univ.
Institute for the Study Of Accelerating Change
U of Stuttgart
Harvard U
U of Toronto
U of Minnesota
Witan International
NPOs
Radboud U (Netherlands)
Tokyo Inst. of Technology
Linkoping U (Sweden)
U of Hawaii
66How can I help?
- Humans (a.k.a. common sense experts)
- Programmers
- Web programmers
- Cyc programmers
- Ontologists
- Subject-matter experts
- Bloggers
67Human Cyclists
- Play the Web Game
- Come up with new game ideas
- Link Wikipedia to Cyc
- Learn more about Cyc
- Befriend an ontologist
- Tell a friend about Cyclify
- Write to a blog about Cyclify
- Help with viral marketing
- Design a logo
- T-Shirts Buy one, or Create and sell them
From now on, we're all Cyclists: people who
interact with Cyc in one way or another.
68Programmers
- Help design and build a web services interface
- Learn the architecture of Web Game 1
- Design an add-on for the Web game
- Learn how to use the question server
- Propose a new game
- Help develop/support technical infrastructure
- Help organize documentation
- Help write the Cyc books
- to be published by O'Reilly
69Ontologists
- Identify gaps in the knowledge base
- Befriend a Subject Matter Expert
- Work together on a domain
- Befriend a Human Cyclist
- Teach one who wants to learn basic ontology skills
- Help organize documentation
- Help write the Cyc books
70Bloggers
- Blog about Cyclify
- Link to each other's blogs
71Timeline (Milestones)
- May 15: Launch Foundation Website
- Build membership up until July 15
- June 15:
- File Articles of Formation w/ Sec. of State
- First Web game in beta
- July 15: Launch Game
- October: First OpenCyc build containing game data