Title: NSF Funding of LT resources
1NSF Funding of LT resources
- Tanya Korelsky, Program Director
- Robust Intelligence Cluster
- Division of Information and Intelligent Systems
- Directorate for Computer and Information Science
and Engineering
- National Science Foundation
- tkorelsk_at_nsf.gov
- http://www.nsf.gov/
2How NSF is organized
Office of the Director
Biological Sciences
Geosciences
Computer and Information Science and Engineering
Mathematical and Physical Sciences
Education and Human Resources
Social, Behavioral and Economic Sciences
Engineering
3How CISE is organized
Office of the Director
Office of the Assistant Director for CISE
CCF Computing and Communications Foundations
CNS Computer and Network Systems
IIS Information and Intelligent Systems
OCI Office of Cyberinfrastructure
(formerly SCI, now with NSF-wide mission,
reporting to Director of NSF)
Clusters
Clusters
Clusters
Crosscutting Emphasis Areas
5CISE Proposal/Award Statistics
FY    Proposals  Awards  Funding Rate (%)  CGIs   Supplements
2005  4,962      1,086   23                1,398  581
2004  6,266      1,017   16                1,297  400
2003  5,346      1,174   22                1,023  354
2002  4,314      1,038   24                918    308
2001  3,579      885     25                768    231
2000  2,853      903     32                547    210
1999  2,209      746     34                493    301
1998  1,885      667     35                476    211
1997  1,894      684     36                527    219
1996  1,760      601     34                610    183
1995  1,941      708     36                631    215
ADJUSTED
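The funding rates in the table can be checked with a small Python sketch. It assumes the rate is simply awards divided by proposals, expressed as a percentage; the slide does not state the definition, so `funding_rate` and the row layout are illustrative choices, not NSF's method.

```python
# Rows copied from the slide: (fiscal year, proposals, awards, listed rate in %).
ROWS = [
    (2005, 4962, 1086, 23),
    (2004, 6266, 1017, 16),
    (2003, 5346, 1174, 22),
    (2002, 4314, 1038, 24),
    (2001, 3579, 885, 25),
    (2000, 2853, 903, 32),
    (1999, 2209, 746, 34),
    (1998, 1885, 667, 35),
    (1997, 1894, 684, 36),
    (1996, 1760, 601, 34),
    (1995, 1941, 708, 36),
]


def funding_rate(proposals: int, awards: int) -> float:
    """Awards as a percentage of proposals (assumed definition)."""
    return 100.0 * awards / proposals


# Years where the assumed formula, rounded to a whole percent,
# differs from the rate listed on the slide.
mismatches = [
    fy for fy, proposals, awards, listed in ROWS
    if round(funding_rate(proposals, awards)) != listed
]

for fy, proposals, awards, listed in ROWS:
    computed = funding_rate(proposals, awards)
    print(f"FY{fy}: computed {computed:.1f}%, listed {listed}%")
print("Years that differ after rounding:", mismatches)
```

Under this assumption the rounded values agree with the listed rates for every year except FY2005 (about 21.9% computed against 23 listed); the slide does not say what accounts for the difference.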
6CISE Budget 2003-2007
[Bar chart: CISE budget by fiscal year, 2003 through
2007 (request), dollars in millions; axis from 475 to
525, with labeled values 496M and 527M]
Requested 6.1% increase includes 20M for
cybersecurity, 10M for GENI
7The Human Language and Communication Program (HLC)
- Initiated by Dr. Mary Harper
- This HLC program emphasizes innovative advances
in computer and information sciences relating to
all forms of human communication.
- High-level human communication topics
- Text Processing
- Speech Processing
- Multimodal Communication Processing
- HLC is attempting to strengthen current research
while broadening future research directions of
the language processing research community (e.g.,
multimodal communication).
8HLC/ITR LT recent resource, annotation and
evaluation metrics awards
- ITR 03: Collaborative effort on Interlingual
Annotation
- HLC 04: Constructing an Enhanced Version of
WordNet, 100K (12 months)
- HLC 05
- Rapid Development of Frame Semantic lexicon, to
ICSI, UC Berkeley, 400K (36 months)
- SGER: Learning Syntax-based Evaluation Metrics
for Machine Translation, Dr. Rebecca Hwa,
University of Pittsburgh, 200K (24 months)
- A Framework for Learning High Accuracy Evaluation
Metrics for NLP Applications, Dr. Alon Lavie,
CMU, 150K (24 months)
9CISE CRI (Computing Research Infrastructure)
Program
- Funds community resources for IIS programs;
reviewers are supplied by the technical program
directors
- 04 LT resource planning award to Vassar
College: An Open Linguistic Infrastructure for
American English, 50K (12 months)
- 05 LT resource/annotation awards
- Towards a Comprehensive Linguistic Annotation of
Language (Brandeis, UColorado, Pitt, Penn, NYU),
850K, 24 months; goals include achieving an
international consensus on a meta-specification
framework
- Another planning award (100K) to Vassar College
and Princeton University: An Open Linguistic
Infrastructure for American English; goals
include annotation of semantic categories using
WordNet and FrameNet
10Information and Intelligent Systems
Reorganization into Clusters
- Robust Intelligence
- Artificial Intelligence, Human Language and
Communication, Robotics, Computer Vision,
Computational Neuroscience
- Human-centered Computing
- Human Computer Interaction, Social Informatics,
Universal Access
- Information Integration and Informatics
- Data, Information, and Knowledge Management;
Information Integration; Science and Engineering
Informatics; Digital Libraries; Digital
Government
11Information and Intelligent Systems
- New Cluster-oriented Solicitation
- Scheduled to be published in May with submission
deadline in late October or early November
- One of the cross-cutting threads: Human-Robot
Interaction
- Implications for HLC area: renewed attention to
- dialogue (human-human, machine-human)
- ASR of imperfect and affected speech
- Speech-to-concept understanding,
concept-to-speech generation
- Need corpora to support these research areas!
12One Small Current Effort
- SGER (Small Grant for Exploratory Research)
- Creation of a Goal-Oriented, Human-Machine Spoken
Corpus
- ICSI (UC Berkeley), Dr. Dilek Hakkani-Tur
- Building a spoken mixed-initiative dialogue
system for conference services
- Deploying the system for the IEEE SLT Workshop
(December 2006)
- Collecting and annotating the dialogue corpus
13Digital Tools Summit at Michigan State University
(June 2006)
- Funded jointly by the Linguistics Program and
(former) HLC program
- Addresses a functionality gap between the tools
that documentary linguists and typologists need
and the ability of existing tools to annotate
partially-understood linguistic data
- Existing methods and tools presuppose a
regularized digital corpus of a well-understood
language and require a high degree of
computational sophistication
- Aims to develop a roadmap for creating regional
and national language archives and the tools to
achieve it
- Brings together theoretical computational
linguists and data-driven linguists to
brainstorm the challenging issues
14NSF perspective on funding LT resources
- New corpora for dialogue research
- New corpora for ASR research
- mixed language (English-Spanish)
- affected speech (911 calls), senior speech
- New general corpora (ANC), both text and speech
- Dependency treebanks and parsers
- Harmonization of existing semantic resources
(WordNet and FrameNet)
- Basic research on semantic annotation; ambivalent
attitude to standardization
15NSF perspective on funding LT resources
(international resources)
- Parallel corpora for new MT research on
statistical methods applied to syntactic and
semantic representations
- Research on MT for minority languages (pending
award to CMU for Inupiaq and Aymara)
- Corpora for research on language identification
- International collaboration on speech processing
(NYU-EBIRE-CNRS) and on unified linguistic
annotation
- International workshop on dependency
representations (2007 ACL in Prague)
16Thank you
- Tanya Korelsky
- Robust Intelligence
- Human Language and Communication
- Division of Information and Intelligent Systems
- Directorate for Computer and Information Science
and Engineering
- National Science Foundation
- tkorelsk_at_nsf.gov
- http://www.nsf.gov/
17Digital Living 2010
People across the globe will have access to each
other and information provided by pervasive
devices, embedded sensors and systems because all
will be connected to the Internet.
Thanks to David Kotz at Dartmouth
18Global Environment for Networking Innovations
(GENI)
- Limitations of the Internet
- Security mechanisms not included in the IP layer
- End-to-end robustness cannot be assumed or
assured
- Scaling limitations
- Quality of service mechanisms have not diffused
widely in the public Internet
- Support for new technologies difficult (e.g.,
wireless, mobility, sensors)
19Global Environment for Networking Innovations
- New networking and distributed system
architectures
- Build in security and robustness
- Enable pervasive computing, bridging the gap
between the physical and virtual worlds by
including mobile, wireless and sensor networks
- Enable control and management of other critical
infrastructures
- Include ease of operation and usability
- New classes of societal-level services and
applications
20Global Environment for Networking Innovations
- Research Program
- Supports research, design, and development of new
networking and distributed systems
- Builds on many years of knowledge and experience,
but reexamines all networking assumptions and
reinvents where needed
- Design for intended capabilities; deploy and
validate architectures; build new services and
applications
- Encourage users to participate in experimentation
- Take a system-wide approach to the synthesis of
new architectures
21Global Environment for Networking Innovations
- Facility
- Shared use through slicing and virtualization
(where "slice" denotes the subset of resources
bound to a particular experiment)
- Access to physical facilities through
programmable platforms (e.g., via customized
protocol stacks)
- Large-scale user participation by "user opt-in"
and IP tunnels
- Protection and collaboration among researchers by
controlled isolation and connection among slices
- A broad range of investigations using new classes
of platforms and networks, a variety of access
circuits and technologies, and global control and
management software
- Interconnection of independent facilities via
federated design.
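The idea of a slice as the subset of resources bound to one experiment, with controlled isolation and connection among slices, can be illustrated with a toy Python model. This is only a sketch: the `Facility` class, its methods, and the resource names are hypothetical, are not part of any GENI design or software, and treat each virtualized resource as a plain string.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Facility:
    """Toy shared facility (hypothetical, for illustration only)."""

    resources: set[str]
    # Maps an experiment name to its slice (the resources bound to it).
    slices: dict[str, set[str]] = field(default_factory=dict)
    # Pairs of experiments whose slices were explicitly connected.
    links: set[frozenset[str]] = field(default_factory=set)

    def create_slice(self, experiment: str, wanted: set[str]) -> None:
        """Bind resources to an experiment, refusing overlaps."""
        unknown = wanted - self.resources
        if unknown:
            raise ValueError(f"unknown resources: {sorted(unknown)}")
        for other, held in self.slices.items():
            if held & wanted:
                raise ValueError(f"resources already bound to {other}")
        self.slices[experiment] = set(wanted)

    def connect(self, a: str, b: str) -> None:
        """Explicitly allow two existing slices to exchange traffic."""
        if a not in self.slices or b not in self.slices:
            raise KeyError("both experiments need a slice first")
        self.links.add(frozenset((a, b)))

    def can_communicate(self, a: str, b: str) -> bool:
        """Slices are isolated unless they were connected."""
        return a == b or frozenset((a, b)) in self.links


facility = Facility(resources={"node-a", "node-b", "node-c"})
facility.create_slice("exp1", {"node-a"})
facility.create_slice("exp2", {"node-b", "node-c"})
isolated_by_default = not facility.can_communicate("exp1", "exp2")
facility.connect("exp1", "exp2")
connected_on_request = facility.can_communicate("exp1", "exp2")
```

In this model isolation is the default and connection is an explicit step, which is one simple way to read "controlled isolation and connection among slices"; a real facility would also need virtualization so that several slices can share one physical device.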
22Global Environment for Networking Innovations
- Outreach
- CISE has supported numerous community workshops
in support of GENI
- CISE is supporting ongoing planning efforts,
including needs assessment and requirements for
the GENI Facility.
- CISE will hold town meetings and continue to
support future workshops to broaden community
participation.
- CISE will work with industry, other US agencies,
and international groups to broaden participation
in GENI beyond NSF and the US government.