AnHai Doan - PowerPoint PPT Presentation

1
Research Overview
  • AnHai Doan
  • Dept. of Computer Science
  • Univ. of Illinois
  • Fall 2004

2
My Research Interests
  • Databases
  • hard-core databases, XML query optimizations,
    database theory
  • AI-ish databases (growing, becoming increasingly
    critical)
  • next-generation info systems, semantic
    integration, data integration, bridging text
    and databases
  • AI
  • Web, IR

3
Semantic Integration
  • Schema matching
  • decide if column "location" in one database
    matches "address" in another database
  • Tuple matching
  • do (Mike Smith, 8, Champaign-IL) and (M. Smith, 7,
    Illinois) refer to the same person?
  • Fundamental problems
  • crucial for integrating and sharing across data
    sources, the Web, enterprises, government
    agencies
  • extremely difficult ("AI-complete")
  • Strong area at Illinois
  • with Dan Roth, Kevin Chang, Chengxiang Zhai,
    Jiawei Han
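
The two matching problems on this slide can be illustrated with a minimal sketch. It is not the method used in this research: the token-overlap measure, the name and place rules, the abbreviation table, and the sample column values are all assumptions made for illustration. The middle field of each tuple is ignored because the slide does not say what it means.

```python
# Illustrative sketch only: rules, lookup table, and sample data are assumptions.

STATE_ABBREVIATIONS = {"il": "illinois"}  # hypothetical, deliberately tiny lookup


def tokens(values):
    """Lowercased word tokens drawn from a column's sample values."""
    return {tok for v in values for tok in v.lower().replace(",", " ").split()}


def column_similarity(col_a, col_b):
    """Schema matching by instance overlap: Jaccard similarity of token sets."""
    a, b = tokens(col_a), tokens(col_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def names_compatible(n1, n2):
    """Last names must agree; one first name must be a prefix of the other."""
    p1 = n1.replace(".", "").lower().split()
    p2 = n2.replace(".", "").lower().split()
    if not p1 or not p2:
        return False
    return p1[-1] == p2[-1] and (p1[0].startswith(p2[0]) or p2[0].startswith(p1[0]))


def place_tokens(place):
    """Split a place string and expand known state abbreviations."""
    raw = place.lower().replace("-", " ").replace(",", " ").split()
    return {STATE_ABBREVIATIONS.get(tok, tok) for tok in raw}


def tuples_may_match(t1, t2):
    """Tuple matching: compatible names and at least one shared place token.

    The middle field is ignored because the slide does not explain it.
    """
    same_name = names_compatible(t1[0], t2[0])
    shared_place = bool(place_tokens(t1[2]) & place_tokens(t2[2]))
    return same_name and shared_place


# Hypothetical sample values for the "location" and "address" columns.
location = ["Champaign, IL", "Urbana, IL"]
address = ["Urbana, IL", "Chicago, IL"]

print(column_similarity(location, address))
print(tuples_may_match(("Mike Smith", 8, "Champaign-IL"),
                       ("M. Smith", 7, "Illinois")))
```

Hand-written rules like these break quickly on real data (nicknames, misspellings, unknown abbreviations), which is one reason the slide calls the problem extremely difficult.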

4
Data Integration
  • How to integrate databases, text, Web pages so
    that
  • users can pose structured queries, can interact
    with data as if with a single giant database, can
    quickly find desired information
  • A long-standing challenge for both database and
    AI fields
  • many annual workshops in both fields
  • Samples of my research
  • entity retrieval/integration: find all
    information about a particular David Smith on the
    Web (with C. Zhai)
  • online community information integration: can
    monitor person movements, products, etc.
  • One of the top data integration groups in the
    database field
  • with Kevin Chang
  • Chengxiang Zhai and Dan Roth also perform work in
    this area
  • work funded by NSF CAREER Awards

5
Bridging Text and Databases
  • In most domains
  • there are both structured data and vast amounts of text
  • Combining them is an emerging hot direction for
    both database and IR communities
  • My current focus
  • connect them at semantic level (e.g., linking
    mentions) with Dan Roth
  • efficient structured querying over text
  • instead of "find all potential enemies of
    Microsoft"
  • ask Q(x) :- collaborate(Microsoft, y),
    competitor(y, x)
  • Leveraging top-notch expertise in text processing
    at Illinois
  • Dan Roth, Chengxiang Zhai
  • Combining it with database technologies
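
The structured query on this slide can be read as a rule with head Q(x) and body collaborate(Microsoft, y), competitor(y, x): x is an answer if some y collaborates with Microsoft and x is a competitor of y. A minimal sketch of evaluating it as a join follows. The facts are invented placeholders standing in for relations extracted from text, and the function name `potential_enemies` is hypothetical; the extraction step itself, which is the hard part, is not shown.

```python
# Hypothetical facts, as if already extracted from text; not real data.
collaborate = {("Microsoft", "CompanyA"), ("Microsoft", "CompanyB")}
competitor = {
    ("CompanyA", "CompanyC"),
    ("CompanyB", "CompanyD"),
    ("CompanyE", "CompanyF"),
}


def potential_enemies(company):
    """Evaluate Q(x) for the given company as a join on the shared variable y."""
    # Bind y: everything the company collaborates with.
    partners = {y for (c, y) in collaborate if c == company}
    # Bind x: every competitor of some partner y.
    return {x for (y, x) in competitor if y in partners}


print(sorted(potential_enemies("Microsoft")))
```

With these placeholder facts the answer set is CompanyC and CompanyD; CompanyF is excluded because CompanyE does not collaborate with Microsoft.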

6
Summary
  • My focus
  • semantic integration
  • data integration
  • bridging text and databases
  • Fundamental problems in managing distributed,
    heterogeneous, semi-structured, and dynamic data
  • of interest to database, AI, IR, Web communities
  • Collaborating closely with the strong AI and IR
    groups here