The Harmony of Music and Computing - PowerPoint PPT Presentation


Slides: 38
Provided by: jrtra

Transcript and Presenter's Notes

Title: The Harmony of Music and Computing


1
The Harmony of Music and Computing
Expanding a Domain-Specific Database
  • Jantine Trapman

2
Overview
  • Components
    • LT4eL
    • Cornetto
  • Creation / expansion of Music Ontology
    • Automatic Creation
    • Watson
    • Prompt
  • Mapping
    • Music Ontology
    • Cornetto

3
Components
  • LT4eL
  • Cornetto

4
Components: LT4eL
  • Language Technology for eLearning
  • www.lt4el.eu
  • Development of search and management facilities
    in the LMS
  • Keyword Extractor
  • Glossary Candidate Finder
  • Semantic Search

5
Semantic Search
  • Based on
    • (multilingual) documents (learning objects, LOs) in eight languages
    • semantic annotation of LOs
    • ontology
    • lexicon for each language involved
  • Corpus and ontology are restricted to the Computing domain

6
Computing Ontology (1)
  • Creation
    • Manually annotated keywords in eight languages, extracted from LOs
    • Translated into (English) concepts
    • Definitions collected on the WWW and added to concepts
  • Extension with additional concepts from
    • Restrictions on existing concepts
    • Superconcepts of existing concepts
    • Missing subconcepts
  • Annotation of LOs

7
Computing Ontology (2)
  • Domain ontology
    • Domain: Computing
    • Manually created
    • 1406 concepts
    • 50 from DOLCE
    • 250 intermediate concepts from OntoWordNet
  • Use
    • Lexicon development for 8 languages
    • Semantic annotation of LOs
    • LO indexing

8
Computing Ontology Part
9
Computing Lexicon
  • Concepts were translated into all languages
  • Each entry contains three types of information
    • Concept (and superconcept): CDDrive (is-a Drive)
    • Definition: a drive that reads a compact disc and that is connected to an audio system
    • Set of terms in a given language: CD-speler, CD drive
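
The three-part entry on this slide can be pictured as a small record. The following is a minimal sketch, not the LT4eL format itself: the field names (`concept`, `superconcept`, `definition`, `terms`) and the language code `"nl"` are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class LexiconEntry:
    """One lexicon entry: concept, definition, and terms per language."""
    concept: str
    superconcept: str
    definition: str
    # Maps a language code (assumed here) to the terms expressing the concept.
    terms: Dict[str, List[str]] = field(default_factory=dict)


# Values taken from the slide; only the "nl" key is an assumption.
entry = LexiconEntry(
    concept="CDDrive",
    superconcept="Drive",
    definition=("a drive that reads a compact disc and that is "
                "connected to an audio system"),
    terms={"nl": ["CD-speler", "CD drive"]},
)

print(entry.concept, "is-a", entry.superconcept)
```

Keeping the terms in a per-language mapping lets one concept carry a separate term set for each of the eight languages.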

10
Expansion of the LT4eL KB
  • Future: more domains needed
  • Task
    • Expansion of ontology and lexicons
    • Preferably semi-automatic
  • Three options
    • Top-down
    • Bottom-up
    • Both; ingredients:
      • Cornetto, WordNet
      • Music ontology
      • Watson, Prompt

11
Cornetto
  • Combinatorial and Relational Network as Toolkit for Dutch Language Technology
  • Referentie Bestand Nederlands (RBN) → lexical units
  • Dutch part of EuroWordNet: Dutch WordNet (DWN) → synsets
  • SUMO/MILO plus extensions → terms and axioms
  • Core table of Cornetto Identifiers (CIDs)

http://www.let.vu.nl/onderzoek/projectsites/cornetto/index.html
12
Example Lexical Entry Cornetto (1)
noun: zanger
Sense                                          CID
Iemand die zingt (someone who sings)           c_n-42316
Vogel die zingt (bird that sings)              c_n-42317
(Poëtisch voor) dichter ((poetic for) poet)    c_n-42318

13
noun zanger1 (c_n-42316)
  • Morphology
    • type: derivation; structure: zingener; plurforms: zangers
  • Syntax
    • gender: m/f; article: de
  • Semantics
    • reference: common; countability: count; type: human; subclass: beroepsnaam/beoefenaar; resume: iemand die zingt
  • Pragmatics
    • domain: muz

14
Example Lexical Entry Cornetto (2)
  • Combinatorics (zanger1)
    • De redacteur van het woordenboek was ook een zanger (The editor of the dictionary was also a singer)
    • De zanger van de band (The singer of the band)
  • SUMO (, , hasSkill)
  • Synonyms: zanger, zangeres
  • HAS_HYPERONYM: musicus, musicienne, muzikant
  • HAS_HYPONYM: baszanger, sopraan, blueszanger, charmezanger, ...
  • Equivalence relations: EQ_SYNONYM singer, vocalist, vocalizer, vocaliser /ENG20-09908715-n → link with WordNet 2.0!
  • WordNet Domains: music

15
Goal
16
Tasks
  • Extract music related terms from Cornetto
  • Create a domain ontology for Music
  • Map between terms from lexicon and concepts in
    ontology
  • Map music ontology to OntoWN and DOLCE
  • Adjust Cornetto data to LT4eL format
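
The first task, extracting music-related terms, could start from the domain labels visible in the example entry (Pragmatics domain `muz`, WordNet Domains `music`). The sketch below works under that assumption. The dictionary layout is a hypothetical simplification of the Cornetto XML, and the second entry is invented purely as a non-music contrast.

```python
# Hypothetical, flattened view of Cornetto entries (the real data is XML).
entries = [
    # From the slides: zanger1 with domain "muz" and WordNet Domains "music".
    {"form": "zanger", "cid": "c_n-42316", "domain": "muz",
     "wn_domains": ["music"]},
    # Invented entry, for contrast only.
    {"form": "fiets", "cid": "c_n-example", "domain": None,
     "wn_domains": []},
]


def is_music_entry(entry):
    """True if either domain label marks the entry as music-related."""
    return (entry.get("domain") == "muz"
            or "music" in entry.get("wn_domains", []))


music_terms = [e["form"] for e in entries if is_music_entry(e)]
print(music_terms)
```

Checking both labels is a design choice: an entry may carry only one of them, so a filter on a single label would probably miss terms.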

17
Questions (1)
  1. How, and to what extent, can we automate the process of ontology building?
  2. How can we profit from existing resources on the Semantic Web to enrich ontologies?
  3. To what extent do Watson and PROMPT support the reuse of existing resources?

18
Music Ontology
  • Automatic Creation
  • Expansion with
  • Watson
  • Prompt

19
Automatic Creation (1)
  • (Basili et al. 2007): automatic ontology extraction from an open-domain corpus (BNC)
  • Designed for three tasks
    • lexical ambiguity resolution within a specific domain
    • restricting a set of terms to a subset relevant for an ontology to be constructed
    • expanding this new ontology with other, novel and relevant concepts, relations and instances

20
Automatic Creation (2)
  • Preprocessing
    • Corpus split into 40-sentence text segments
    • PoS tagging
    • Filtering of noun phrases
  • General steps
    • Term extraction through Latent Semantic Analysis (Deerwester et al. 1990)
    • Ontology extraction from WordNet based on Conceptual Density (Agirre and Rigau 1996)
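
The first preprocessing step, cutting the corpus into 40-sentence segments, can be sketched as follows. This is an illustration only: it assumes the corpus is already split into sentences, and keeping a shorter final segment is an assumption, since the slides do not say how leftovers were handled.

```python
def split_into_segments(sentences, segment_size=40):
    """Group a list of sentences into consecutive fixed-size segments."""
    return [sentences[i:i + segment_size]
            for i in range(0, len(sentences), segment_size)]


# Placeholder sentences standing in for a sentence-split corpus.
sentences = [f"Sentence {n}." for n in range(1, 101)]
segments = split_into_segments(sentences)

print([len(segment) for segment in segments])  # [40, 40, 20]
```

Each segment then acts as one "document" when building the term-by-document matrix that Latent Semantic Analysis operates on.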

21
Music Ontology Part
22
Music Ontology (Basili et al. 2007)
  • 46 primitive classes
  • Leaf concepts have a synset ID from WordNet
  • No properties, only super-/subconcept relation
  • So: a rather small and shallow ontology
  • Expansion by exploiting Semantic Web techniques

23
(No Transcript)
24
Watson (1)
  • http://watson.kmi.open.ac.uk/WatsonWUI/
  • Every URI is clickable: all resources are available
  • Information about
    • Size
    • Representation language
    • Number of classes, properties, individuals etc.
    • Review rating
  • Interface for SPARQL queries
  • Possibility of (upwards) navigation

25
Watson (2)
  • Also available as
  • Protégé plug-in (under development)
  • API
  • New concepts can be added
  • Manually
  • One by one
  • Much human action required
  • Faster than creation from scratch, but still a
    tedious exercise

26
Watson (3)
  • Watson provides
    • a list of URIs of available semantic databases
    • a list of candidate concepts
  • What is still lacking
    • a (semi-)automatic way to merge or align new concepts or ontologies with an existing one
  • Possible solution: Prompt

27
PROMPT (1)
  • http://protege.stanford.edu/plugins/prompt/prompt.html
  • Protégé plug-in
  • Functionalities
    • Comparison
    • Inclusion
    • Merging
    • Alignment
  • Requirement: ontologies for merge etc. must be available offline
  • Prompt goes beyond purely syntactic matching
  • Evaluation shows that experts followed 90% of Prompt's suggestions

28
Prompt (2)
  • Saves time and effort
    • linguistically similar classes are found quickly
    • inherited properties and subclasses can be added automatically
    • similar structures are automatically detected
    • automatic consistency check
  • Resources must use exactly the same markup language
  • Merging
    • faster but more complex
    • requires good insight into the resources
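
To give a feel for what "linguistically similar classes" can mean, here is a small sketch based on string similarity of class names. It is a stand-in for illustration and not Prompt's actual algorithm: the class names and the 0.8 threshold are invented, and the comparison uses `difflib.SequenceMatcher` from Python's standard library.

```python
from difflib import SequenceMatcher


def similar_class_pairs(classes_a, classes_b, threshold=0.8):
    """Return (a, b, score) for class names whose similarity passes threshold."""
    pairs = []
    for a in classes_a:
        for b in classes_b:
            # Case-insensitive character-level similarity in [0, 1].
            score = SequenceMatcher(None, a.lower(), b.lower()).ratio()
            if score >= threshold:
                pairs.append((a, b, score))
    return pairs


# Invented class names from two hypothetical ontologies.
ontology_a = ["Singer", "Orchestra", "Instrument"]
ontology_b = ["singer", "Orchestras", "Composer"]

for a, b, score in similar_class_pairs(ontology_a, ontology_b):
    print(f"{a} ~ {b} ({score:.2f})")
```

Name similarity alone only yields merge candidates. A tool that goes beyond purely syntactic matching would also look at the structure around each class.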

29
Mapping
  • Music Ontology
  • Cornetto

30
Resources
  • Music Ontology
  • Some nodes have a WordNet ID (from the automatic process)
  • Many haven't, especially those added with Watson
  • Cornetto entries
  • have synset ID from Dutch WN
  • have mapping to WordNet entry through equivalence
    or near-equivalence e.g.

31
Questions (2)
  • To what extent does WordNet support a mapping between
    • the Cornetto lexicon and a newly created ontology partly based on WordNet
    • the existing ontology and lexicon from LT4eL, and the Cornetto ontology

32
Procedures
  • A concept either has or does not have a WN synset ID
  • Mapping via WordNet synset ID
    • Look up synset ID in Cornetto
    • Establish related DWN synset(s)
    • Results so far are without problems, although near-equivalence relations are expected to give mismatches
  • Mapping without synset ID
    • Syntactic matching of concept name with terms from WordNet synsets
    • Compare definitions and glosses
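
The mapping via synset ID amounts to an index lookup. The sketch below assumes Cornetto's equivalence relations have been flattened into (DWN synset, relation, WordNet 2.0 synset) triples. That triple layout is hypothetical, while the IDs are the ones shown on the example slide.

```python
from collections import defaultdict

# Hypothetical flat form of Cornetto equivalence relations:
# (DWN synset ID, relation name, WordNet 2.0 synset ID).
equivalences = [
    ("d_n-20810", "EQ_SYNONYM", "ENG20-09908715-n"),
    ("d_n-14287", "EQ_NEAR_SYNONYM", "ENG20-09931035"),
    ("d_n-19905", "EQ_NEAR_SYNONYM", "ENG20-09931035"),
]


def build_index(triples):
    """Index DWN synsets by the WordNet synset ID they are linked to."""
    index = defaultdict(list)
    for dwn_id, relation, wn_id in triples:
        index[wn_id].append((dwn_id, relation))
    return index


def map_concept(wn_id, index):
    """Return linked DWN synsets, exact equivalences before near ones."""
    matches = index.get(wn_id, [])
    return sorted(matches, key=lambda match: match[1] != "EQ_SYNONYM")


index = build_index(equivalences)
print(map_concept("ENG20-09908715-n", index))
```

Keeping the relation name next to each hit makes it easy to flag near-equivalence links for manual review, since those are the ones expected to give mismatches.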

33
Examples: easy match
  • zanger1 d_n-20810 (iemand die zingt, "someone who sings") is EQ_SYNONYM of singer, vocalist, vocalizer, vocaliser /ENG20-09908715-n (a person who sings)
  • strijkkwartet1 d_n-14287 (ensemble van vier strijkers, "ensemble of four string players") and strijkkwartet2 d_n-19905 (ensemble voor vier strijkers, "ensemble for four string players") are EQ_NEAR_SYNONYM of soloist1/ENG20-09931035
  • Note: Cornetto contains a mismatch between WN and DWN

34
Matching without ID (1)
  • For each owl:Class in the Music ontology
    • try to match with the target attribute in the relation element of the Cornetto XML structure, where attribute relation_name is (EQ_)NEAR_SYNONYM
    • Add synset ID to concept (for mapping to OntoWordNet)
  • E.g.:
    <owl:Class rdf:about="http:///myOntos/music.owl#orchestra"/>
    <relation relation_name="EQ_NEAR_SYNONYM" target20-previewtext="symphony orchestra1, symphony2" version="pwn_1_6" target20="ENG20-07750308-n" target="ENG16-06123240-n">
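
One possible reading of this step is to compare the class name with the terms in `target20-previewtext` and, on a hit, take over the `target20` synset ID. The sketch below follows that reading. The enclosing `<synsets>`/`<synset>` elements and the `id` value are assumptions, and only the `relation` attributes come from the slide.

```python
import re
import xml.etree.ElementTree as ET

# Fragment modelled on the slide; wrapper elements are assumed for illustration.
CORNETTO_XML = """
<synsets>
  <synset id="d_n-example">
    <relation relation_name="EQ_NEAR_SYNONYM"
              target20-previewtext="symphony orchestra1, symphony2"
              version="pwn_1_6"
              target20="ENG20-07750308-n"
              target="ENG16-06123240-n"/>
  </synset>
</synsets>
"""


def preview_terms(text):
    """Split the preview text into terms and drop trailing sense numbers."""
    return [re.sub(r"\d+$", "", term.strip()) for term in text.split(",")]


def find_synset_ids(class_name, root):
    """Collect target20 IDs whose preview terms contain the class name."""
    wanted = class_name.replace("_", " ").lower()
    hits = []
    for relation in root.iter("relation"):
        # Accept both NEAR_SYNONYM and EQ_NEAR_SYNONYM.
        if not relation.get("relation_name", "").endswith("NEAR_SYNONYM"):
            continue
        terms = preview_terms(relation.get("target20-previewtext", ""))
        if any(wanted == t.lower() or wanted in t.lower().split()
               for t in terms):
            hits.append(relation.get("target20"))
    return hits


root = ET.fromstring(CORNETTO_XML)
print(find_synset_ids("orchestra", root))
```

Matching on single words inside a multiword term is deliberately loose. It suits near-synonym links such as orchestra and symphony orchestra, but the hits still need a manual check.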

35
Matching without ID (2)
  • Compare definitions and glosses
    • many ontology classes have a definition
    • each WN synset has a gloss
    • preprocess: stemming and filtering nouns
  • Consider the percentage of nouns in a concept definition that match a certain gloss
  • Evaluate results
  • Note: some definitions are equal to WN glosses
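
The overlap measure can be sketched as below. Two simplifications are assumed: a small stop-word list stands in for real noun filtering, which would need a PoS tagger, and the suffix stripper is a crude substitute for a proper stemmer. The definition "someone who sings" is an illustrative example, and the gloss is the one quoted on the example slide.

```python
import re

# Stand-in for noun filtering; a PoS tagger would be used in practice.
STOP_WORDS = {"a", "an", "the", "of", "who", "that", "and", "is",
              "for", "to", "in", "or", "by"}


def stem(word):
    """Very crude suffix stripping; a real stemmer would replace this."""
    for suffix in ("ers", "ing", "er", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[:-len(suffix)]
    return word


def content_stems(text):
    """Lower-case, tokenize, drop stop words, and stem what remains."""
    words = re.findall(r"[a-z]+", text.lower())
    return {stem(word) for word in words if word not in STOP_WORDS}


def overlap_percentage(definition, gloss):
    """Percentage of definition stems that also occur in the gloss."""
    definition_stems = content_stems(definition)
    if not definition_stems:
        return 0.0
    shared = definition_stems & content_stems(gloss)
    return 100.0 * len(shared) / len(definition_stems)


print(overlap_percentage("someone who sings", "a person who sings"))  # 50.0
```

A threshold on this percentage would then decide which class and synset pairs go forward to manual evaluation.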

36
Current work
  • Matching without ID on class name and definitions/glosses
  • Manually check results for precision and recall
  • Problem: MWEs (multiword expressions), e.g. class Brass_Instrument
    • has no precise WN counterpart, but
    • Brass does exist, but
    • it has multiple senses → how can we disambiguate?
  • Question: an ID allows an easy and reliable match, but can we do the task without?
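
The manual check for precision and recall can be made concrete as follows. This is a sketch with invented data: the class names and the `syn-*` identifiers are placeholders, and the gold set stands for whatever the manual inspection judges correct.

```python
def precision_recall(proposed, gold):
    """Compare proposed (class, synset) pairs with a manually checked gold set."""
    proposed, gold = set(proposed), set(gold)
    correct = len(proposed & gold)
    precision = correct / len(proposed) if proposed else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Placeholder pairs; the synset IDs are invented for illustration.
proposed = [("Orchestra", "syn-1"), ("Singer", "syn-2"), ("Brass", "syn-3")]
gold = [("Orchestra", "syn-1"), ("Singer", "syn-2"),
        ("Opera", "syn-4"), ("Choir", "syn-5")]

precision, recall = precision_recall(proposed, gold)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Precision shows how many proposed matches are right, and recall shows how many of the correct matches were found. An ambiguous match such as Brass would lower precision if the wrong sense is chosen.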

37
Remaining and Future work
  • Adapting the lexicon format to the LT4eL format
  • Mapping to OntoWordNet (semi-automatic)
  • Mapping to DOLCE (manual task)
  • Ontology evaluation
  • Experiments with WordNets from different languages
  • Involve additional lexical info to improve the LT4eL search engine, e.g. use morphological info about plural forms
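
The last item, using plural forms in search, could look like simple query expansion. The sketch below is one hypothetical way to do it and not the LT4eL search engine's behaviour. The mapping mirrors the `plurforms` value of the zanger entry shown earlier in the slides.

```python
# Hypothetical lexicon fragment: lemma -> plural forms (cf. plurforms: zangers).
PLURAL_FORMS = {"zanger": ["zangers"]}


def expand_query(term, plural_forms=PLURAL_FORMS):
    """Return the term plus any known plural forms as search variants."""
    return [term] + plural_forms.get(term, [])


print(expand_query("zanger"))  # ['zanger', 'zangers']
```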