An Integrated Music Database

1
An Integrated Music Database
  • Advanced Database Management Systems (CS511)
  • Instructor: ChengXiang Zhai
  • Xiao Hu (xiaohu)
  • Jian Wang (jwang32)
  • Joshua Lintz (lintz2) (Flushing, NY)

2
Goals
  • A DB integrating information from influential
    music websites
  • Existing music websites are good at one or two
    services:
    • Content: lyrics, ...
    • Musical metadata: title, artist, genre, ...
    • Sales: prices, deals, ...
    • User-input information: tags, reviews
  • Our database:
    • Presents integrated info
    • Grows with queries
    • Extends to new functions

3
Components
[Architecture diagram. Labels: User Interface, Query, DB, Integrated Info, Data Integration, Extracted Info, Wrapper (x2), Information Extractor, Adapted Query, Focus Crawler, Unstructured data, Semi-structured data, Source site (x2), Web]

4
Source Web Sites (1)
  • Title
  • Artist
  • Rating
  • Prices
  • Album Picture
  • Editorial Review

5
Source Web Sites (2)
  • Title
  • Artist
  • Rating
  • Prices and sellers
  • Album Picture
  • Song List

6
Source Web Sites (3)
  • Artist Picture (tags)
  • Title
  • Album Tags
  • Similar Artists (info)
  • Song List (and tags)

7
Source Web Sites (4)
  • Song Title
  • Artists
  • Album Title
  • Song Lyrics

8
Challenge
  • Integrate overlapping information
    • Amazon.com vs. epinions.com
  • Goals of integration
    • More info
    • More accurate info

9
Work Flow
10
Search result combination
[Diagram. Labels: Query, Parse, UPC code, search by UPC, Parse, EID (x3), Compare]
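
One possible reading of the combination step on this slide: an item found on one source is looked up on another by its UPC code, and records that share a UPC are merged so the result carries more fields than either source alone. The sketch below assumes that reading. The record fields, the sample values, the source names, and the `merge_by_upc` helper are all hypothetical and not taken from the project; the slide does not define EID, so the sketch does not rely on it.

```python
from collections import defaultdict


def merge_by_upc(*sources):
    """Merge item records from several sources, keyed by UPC code.

    Each source is a (name, records) pair, where records is a list of
    dicts that all carry a "upc" key. For overlapping fields the first
    non-empty value wins; every contributing source name is recorded
    under "sources".
    """
    merged = defaultdict(lambda: {"sources": []})
    for source_name, records in sources:
        for record in records:
            item = merged[record["upc"]]
            item["sources"].append(source_name)
            for field, value in record.items():
                # Fill a field only if it is still missing or empty.
                if value and not item.get(field):
                    item[field] = value
    return dict(merged)


# Hypothetical records for the same album from two sources.
site_a = [{"upc": "000000000001", "title": "Example Album",
           "price": "9.99", "rating": ""}]
site_b = [{"upc": "000000000001", "title": "Example Album",
           "rating": "4.5", "sellers": "3"}]

combined = merge_by_upc(("site_a", site_a), ("site_b", site_b))
print(combined["000000000001"])
```

Keying on UPC avoids fuzzy matching on titles, at the cost of missing items that one source lists without a UPC.
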
11
Wrappers
[Diagram. Labels: Query, keywords, Wrappers, URLs, HTML, Wrappers, Extracted Info., XML]
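
The labels on this slide suggest that a wrapper takes an HTML page from a source site and emits the extracted fields as XML. A minimal sketch of that idea using only the Python standard library follows. The field names, the assumption that each field sits in an element whose `class` attribute equals the field name, and the `FieldExtractor` and `wrap` names are hypothetical; a real source site would need its own extraction rules.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class FieldExtractor(HTMLParser):
    """Collect the text of elements whose class attribute names a wanted field."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = set(wanted)
        self.current = None  # field whose text we are waiting for, if any
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if css_class in self.wanted:
            self.current = css_class

    def handle_data(self, data):
        # Take the first non-blank text after a matching start tag.
        if self.current and data.strip():
            self.fields[self.current] = data.strip()
            self.current = None


def wrap(html, wanted=("title", "artist", "price")):
    """Extract the wanted fields from an HTML page and return them as XML."""
    parser = FieldExtractor(wanted)
    parser.feed(html)
    parser.close()
    item = ET.Element("item")
    for name in wanted:
        if name in parser.fields:
            ET.SubElement(item, name).text = parser.fields[name]
    return ET.tostring(item, encoding="unicode")


# Hypothetical page fragment.
page = ('<div><span class="title">Example Album</span>'
        '<span class="artist">Example Artist</span>'
        '<b class="price">9.99</b></div>')
xml_out = wrap(page)
print(xml_out)
```

Emitting one common XML shape per source keeps the data integration step independent of each site's HTML layout.
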
12
DB schema
13
Demo
  • http://csil-projects.cs.uiuc.edu/jwang32/cgi-bin/start.pl

14
Challenges / Future Work
  • Efficiency
    • Multithreading
  • Dynamically changing websites
    • Content change: periodically update our DB
    • Schema change: learning patterns
  • Similar item calculation
    • Based on tags and similarity among artists
  • Ranking items
    • Now: borrowed from source sites
    • Ideally: have our own
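
The slide lists similar item calculation as future work and does not say how it would be done. As one illustration only, the sketch below scores items by Jaccard similarity of their tag sets, which is a technique chosen here and not the authors' method. It covers tags only, not similarity among artists. The catalog contents and the `jaccard` and `most_similar` names are hypothetical.

```python
def jaccard(tags_a, tags_b):
    """Jaccard similarity of two tag collections: |A ∩ B| / |A ∪ B|."""
    a, b = set(tags_a), set(tags_b)
    if not a and not b:
        return 0.0  # two untagged items are treated as unrelated
    return len(a & b) / len(a | b)


def most_similar(target, catalog, top_n=3):
    """Rank the other items in the catalog by tag similarity to the target."""
    scores = [
        (other, jaccard(catalog[target], tags))
        for other, tags in catalog.items()
        if other != target
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_n]


# Hypothetical catalog mapping item names to user tags.
catalog = {
    "album_a": ["rock", "indie", "90s"],
    "album_b": ["rock", "indie"],
    "album_c": ["jazz"],
}
ranked = most_similar("album_a", catalog)
print(ranked)
```
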

15
Questions
  • Thank you!

16
Conclusion
  • The Internet has a plethora of information
    stored in many different databases. We showed
    how a program can integrate a variety of similar
    databases to give the user the desired search
    results.

17
Components
[Architecture diagram. Labels: User Interface, Query, DB, Integrated Info, Data Integration, Extracted Info, Information Extractor 1, Information Extractor 2, Unstructured data, Semi-structured data, Focus Crawler (x2), Source Site (x2), Web]