Title: GPODS
SEMEF: A Taxonomy-Based Discovery of Experts,
Expertise and Collaboration Networks
Advisor: I. Budak Arpinar
Committee: Prashant Doshi, Robert J. Woods
11/27/2007
Delroy Cameron
Master's Thesis, Computer Science, University of
Georgia
OUTLINE
- Background
- Expertise Profiles
- Ranking Experts
- Collaboration Networks Expansion
- Results and Evaluation
- Conclusion
- Demo
BACKGROUND
- Semantic Web
- What?
- Extension of current Web
- Attach Meaning to Data
- Why?
- Underutilization of Current Web
- HTML Limitations
- Goal
- Enhance Information Exchange
- Automatic Information Discovery
- Interoperability of Services
BACKGROUND
- Semantic Web
- Technologies
- XML
- RDF/RDFS/OWL
- URI
- Ontology
David Billington is a Professor of Mathematics

<course name="Mathematics">
  <lecturer>David Billington</lecturer>
</course>

<lecturer name="David Billington">
  <teaches>Mathematics</teaches>
</lecturer>

<teachingOffering>
  <lecturer>David Billington</lecturer>
  <course>Mathematics</course>
</teachingOffering>

<rdf:Description rdf:ID="mynamespace:Professor_2">
  <rdf:has_name>David Billington</rdf:has_name>
  <rdf:teaches rdf:resource="Mathematics"/>
</rdf:Description>
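The example above shows one fact written in three different XML layouts, each of which needs its own extraction logic, while an RDF-style statement reduces to a single subject/predicate/object triple. A minimal Python sketch of that contrast follows. The `to_triple` helper and the `teaches` predicate name are illustrative assumptions, not part of the slides.

```python
import xml.etree.ElementTree as ET

# The three XML layouts from the slide, all stating the same fact.
ENCODINGS = [
    '<course name="Mathematics"><lecturer>David Billington</lecturer></course>',
    '<lecturer name="David Billington"><teaches>Mathematics</teaches></lecturer>',
    '<teachingOffering><lecturer>David Billington</lecturer>'
    '<course>Mathematics</course></teachingOffering>',
]


def to_triple(xml_text):
    """Map one layout to a (subject, predicate, object) triple.

    Each layout needs its own branch, which is the interoperability
    problem that a shared triple model is meant to remove.
    """
    root = ET.fromstring(xml_text)
    if root.tag == "course":
        return (root.findtext("lecturer"), "teaches", root.get("name"))
    if root.tag == "lecturer":
        return (root.get("name"), "teaches", root.findtext("teaches"))
    if root.tag == "teachingOffering":
        return (root.findtext("lecturer"), "teaches", root.findtext("course"))
    raise ValueError(f"unknown layout: {root.tag}")


# All three layouts collapse to one triple.
triples = {to_triple(x) for x in ENCODINGS}
print(triples)
```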
BACKGROUND
- Semantic Web
- Common Challenges
- Entity Disambiguation
- Ontology Mapping/Alignment
- Trust/Provenance
- Semantic Association Discovery
- Application
- Social Networks
- Bio-Informatics
- National Security
- GPS Data Mining
BACKGROUND
- Social Networks
- What?
- Connected through Social Relationships
- Characteristics
- Clustering Coefficient (connectedness to neighbors)
- Centrality (average shortest path length)
- Geodesic (shortest path length)
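The three characteristics above can be computed on a small graph as a rough illustration. This is a sketch only: the `GRAPH` data and function names are hypothetical, and centrality follows the slide's definition (average shortest path length from a node to the others).

```python
from collections import deque

# Hypothetical undirected social network stored as adjacency sets.
GRAPH = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A", "E"},
    "E": {"D"},
}


def clustering_coefficient(graph, node):
    """Fraction of pairs of a node's neighbors that are themselves linked."""
    nbrs = list(graph[node])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if nbrs[j] in graph[nbrs[i]]
    )
    return 2.0 * links / (k * (k - 1))


def geodesics_from(graph, source):
    """Shortest path length from source to every reachable node (BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def average_path_length(graph, node):
    """Average geodesic from node to all other reachable nodes."""
    dist = geodesics_from(graph, node)
    others = [d for n, d in dist.items() if n != node]
    return sum(others) / len(others)


print(clustering_coefficient(GRAPH, "A"))
print(geodesics_from(GRAPH, "A")["E"])
print(average_path_length(GRAPH, "A"))
```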
BACKGROUND
- Peer-Review Process
- What?
- Review scholarly manuscripts
- Challenges
- Slow
- Conflict of Interest
- Finding Suitable Reviewers
- Arbitrary Knowledge Approach
- Research Diversification
- Emerging Fields
CONTRIBUTIONS
- Applicability of Semantics
- Finding Expertise
- Fine Levels of Granularity
- Finding Experts
- Taxonomy
- Collaboration Networks
- Discovery of Unknown Experts
SEMEF
- SEMantic Expert Finder
- Finding Expertise (Expertise Profiles)
- Collecting Expertise
- Quantifying Expertise
- Finding (Ranking) Experts
- w/ and w/o taxonomy
- Collaboration Networks
- Geodesic
- C-Nets
EXPERTISE PROFILES
- Collecting Expertise
- Collect all publications
- Map papers to topics
- Quantify all papers
- Publications Dataset
- DBLP: 473,296 papers (conference/session names, Nov. 2007)
- ACM, IEEE, Science Direct: 29,454 papers (abstracts/index terms)
- Combined: 476,299 papers
EXPERTISE PROFILES
- Collecting Expertise
- Papers-to-Topics Dataset
- Combined (476,299)
- Topics (320)
- Relationships (676,569)
- Expertise Profiles (560,792)
EXPERTISE PROFILES
- Quantifying Expertise
- Map each paper to a distinct value
- Publication Impact
- Hector Garcia-Molina (248 papers, 2003)
- E. F. Codd (49 papers, 2003)
- Citeseer Impact Statistics (1221 venues)
- DBLP URIs
EXPERTISE PROFILES
Figure 1 Expertise Profile (author_A with topic1 4.50, topic2 1.86, topic3 3.08; six papers, paper1 to paper6, with impact values 1.54, 1.54, 1.10, 1.86, 1.54, 1.86)
RANKING EXPERTS
- Taxonomy of Topics
- Session names
- Conference Names
- OCoMMA
- Paper Abstracts
- Index Terms
Figure 2 Taxonomy of Topics (counts shown in the figure: 216, 50, 192, 60, 128, 320)
RANKING EXPERTS
- Case 1
- Single Topic without Taxonomy
- Traverse all Expertise Profiles
- Sum impact of papers mapped to the topic
- Case 2
- Single Topic with Taxonomy
- Traverse all Expertise Profiles
- Sum impact of papers mapped to the topic or its subtopics
- Prevent expertise overestimation: map papers to leaf nodes only
RANKING EXPERTS
- Case 3
- Array of Topics without Taxonomy
- Same as Case 2
- Case 4
- Array of Topics with Taxonomy
- Filter input topics
- Sum impact of papers mapped to the topics or their subtopics
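The taxonomy cases rely on two steps: filtering the input topics and expanding each remaining topic to its subtopics. The sketch below is one plausible reading, not the thesis implementation. It assumes that "filter input topics" means dropping any input topic already covered by another input topic, and the `TAXONOMY` data and function names are hypothetical.

```python
# Hypothetical taxonomy: parent topic -> direct subtopics.
TAXONOMY = {
    "Web": ["Semantic Web", "Search"],
    "Semantic Web": ["Ontologies", "RDF"],
    "Search": [],
    "Ontologies": [],
    "RDF": [],
}


def descendants(topic, taxonomy):
    """Return the topic followed by all of its subtopics, without repeats."""
    seen = []
    stack = [topic]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.append(current)
            stack.extend(taxonomy.get(current, []))
    return seen


def filter_topics(topics, taxonomy):
    """Assumed filtering rule: drop topics covered by another input topic."""
    kept = []
    for topic in topics:
        covered = any(
            topic != other and topic in descendants(other, taxonomy)
            for other in topics
        )
        if not covered and topic not in kept:
            kept.append(topic)
    return kept


def expand(topics, taxonomy):
    """Filtered input topics plus every subtopic beneath them."""
    result = set()
    for topic in filter_topics(topics, taxonomy):
        result.update(descendants(topic, taxonomy))
    return result


print(filter_topics(["Semantic Web", "RDF", "Search"], TAXONOMY))
print(sorted(expand(["Semantic Web", "RDF", "Search"], TAXONOMY)))
```

Filtering first means a paper under "RDF" is reached only through "Semantic Web", so it is not counted once per overlapping input topic.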
COLLABORATION NETWORKS EXPANSION
Figure 3 Geodesic Relationships (panels: STRONG, MEDIUM, WEAK, UNKNOWN; each panel connects author_A and author_B through opus:author and opus:isIncludedIn edges over opus:Article_in_Proceedings and opus:Proceedings nodes)
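A geodesic relationship can be classified by checking which connecting pattern exists between two authors. The sketch below is illustrative only. The mapping of patterns to labels (co-authorship as STRONG, a shared co-author as MEDIUM, the same proceedings as WEAK, anything else as UNKNOWN) is an assumption made for this example, and the `PAPERS` data and helper names are hypothetical.

```python
# Hypothetical data: paper -> (proceedings, set of authors).
PAPERS = {
    "paper_1": ("proc_X", {"author_A", "author_B"}),
    "paper_2": ("proc_X", {"author_C"}),
    "paper_3": ("proc_Y", {"author_B", "author_D"}),
    "paper_4": ("proc_Z", {"author_E"}),
}


def coauthors(author, papers):
    """Everyone who shares at least one paper with the author."""
    result = set()
    for _proc, authors in papers.values():
        if author in authors:
            result |= authors
    result.discard(author)
    return result


def venues(author, papers):
    """Proceedings in which the author has published."""
    return {proc for proc, authors in papers.values() if author in authors}


def relationship(a, b, papers):
    """Assumed label order: the closest pattern found wins."""
    if b in coauthors(a, papers):
        return "STRONG"   # assumption: direct co-authors
    if coauthors(a, papers) & coauthors(b, papers):
        return "MEDIUM"   # assumption: share a co-author
    if venues(a, papers) & venues(b, papers):
        return "WEAK"     # assumption: same proceedings
    return "UNKNOWN"      # none of the patterns above


for other in ["author_B", "author_D", "author_C", "author_E"]:
    print(other, relationship("author_A", other, PAPERS))
```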
COLLABORATION NETWORKS EXPANSION
- C-Net
- Ordering Cluster of Experts
- Collaboration Strength
- Super Node: 14.80
- coauthor_1: 0.73, 0.5
- coauthor_2: 1.81, 1.0
- coauthor_3: 0.73, 0.5
- coauthor_4: 0.73, 0.5
- coauthor_5: 1.54, 1.0
- coauthor_n: 1.1, 0.8
Figure 3 Geodesic Relationships
Newman, M. E. J. Coauthorship Networks and
Patterns of Scientific Collaboration. Proceedings of the
National Academy of Sciences of the United States of
America, 101(suppl. 1): 5200-5205 (2004).
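Collaboration strength can be sketched with the weighting Newman proposed for coauthorship networks, where each paper with n authors adds 1/(n - 1) to the tie between every pair of its authors. Whether SEMEF applies exactly this formula is an assumption here. The paper lists and the `c_net` helper are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical author lists, one per paper.
PAPERS = [
    ["super_node", "coauthor_1"],
    ["super_node", "coauthor_1", "coauthor_2"],
    ["super_node", "coauthor_2", "coauthor_3", "coauthor_4"],
]


def collaboration_strength(papers):
    """Newman-style weights: each paper adds 1/(n - 1) per author pair."""
    weights = defaultdict(float)
    for authors in papers:
        unique = sorted(set(authors))
        n = len(unique)
        if n < 2:
            continue  # single-author papers create no ties
        for a, b in combinations(unique, 2):
            weights[(a, b)] += 1.0 / (n - 1)
    return dict(weights)


def c_net(center, weights):
    """Order the center's coauthors by descending collaboration strength."""
    ties = {}
    for (a, b), w in weights.items():
        if a == center:
            ties[b] = w
        elif b == center:
            ties[a] = w
    return sorted(ties.items(), key=lambda kv: (-kv[1], kv[0]))


ordered = c_net("super_node", collaboration_strength(PAPERS))
for name, strength in ordered:
    print(name, round(strength, 2))
```

Dividing by n - 1 gives a two-author paper more weight per tie than a paper with many authors, on the reasoning that authors of small papers work more closely together.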
RESULTS AND EVALUATION
- Evaluation
- WWW Search Track (2005/6/7)
- Input Topics: Call for Papers
- SWETO-DBLP Subset (67,366 authors)
- DBLP (560,792)
- Validation
- Collaboration Networks Expansion
RESULTS AND EVALUATION
Table 1 Past PC Lists comparison with SEMEF
RESULTS AND EVALUATION
Figure 4 Average Number of PC in SEMEF List
RESULTS AND EVALUATION
Figure 5 Average PC Distribution in SEMEF List
RESULTS AND EVALUATION
- Collaboration Networks Expansion
Table 3 PC Chair PC Member Geodesic
Relationships
Table 4 PC Chair SEMEF List Geodesic
Relationships
CONCLUSION
- Expertise Profiles
- Publication Data
- Publication Impact Statistics
- Papers-to-Topics Relationships
- Ranking Experts
- w/ and w/o Taxonomy
- Single and Array of Topics
- Collaboration Networks Expansion
- Semantic Association Discovery
- Geodesic
- C-Nets
DEMO
- Web Application
- Apache Tomcat 6.0
- Java Server Pages
- Ubuntu 7.10
RELATED WORK
- Particle Swarm Algorithm
- ExpertiseNets
- Expertise Browser
- Experience Atoms
- Expertise Recommender
- Change history
- Tech Support Heuristics
- Profiling, Identification, Supervisor
RELATED WORK
- Web-Based Communities
- Expert Rank
- Formal Probabilistic Models
- Candidate Models
- Document Models
- RDF-Matcher
EXPERTISE PROFILE ALGORITHM
Algorithm findExpertiseProfile(researcherURI, list of publications)
  create empty expertise profile
  foreach paper of researcher do
    get topics list of paper (using papers-to-topics dataset)
    get publication impact
    if publication impact is null do
      publication impact ← default weight
    foreach topic in topics list do
      if expertise profile contains topic do
        weight ← publication impact + existing weight from expertise profile
        update expertise profile with <topic, weight>
      else
        add <topic, publication impact> pair to expertise profile
  end
  return expertise profile
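The algorithm above can be sketched in Python as follows. This is a minimal illustration rather than the thesis code: `DEFAULT_WEIGHT`, the dictionary-based datasets, and the pairing of papers with topics are assumptions. The impact values are taken from Figure 1, with a pairing chosen so that the topic totals match the figure.

```python
DEFAULT_WEIGHT = 1.0  # assumed fallback when no impact statistic exists


def find_expertise_profile(publications, papers_to_topics, impact):
    """Accumulate publication impact per topic for one researcher."""
    profile = {}
    for paper in publications:
        weight = impact.get(paper)
        if weight is None:
            weight = DEFAULT_WEIGHT
        for topic in papers_to_topics.get(paper, []):
            # Add to the existing weight, or start the topic at this weight.
            profile[topic] = profile.get(topic, 0.0) + weight
    return profile


# Hypothetical pairing of the Figure 1 papers with topics.
PAPERS_TO_TOPICS = {
    "paper1": ["topic1"], "paper2": ["topic1"], "paper3": ["topic1"],
    "paper4": ["topic2"],
    "paper5": ["topic3"], "paper6": ["topic3"],
}
IMPACT = {
    "paper1": 1.54, "paper2": 1.10, "paper3": 1.86,
    "paper4": 1.86, "paper5": 1.54, "paper6": 1.54,
}

profile = find_expertise_profile(list(PAPERS_TO_TOPICS), PAPERS_TO_TOPICS, IMPACT)
print({topic: round(value, 2) for topic, value in profile.items()})
```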
RANKING EXPERTS ALGORITHM
Algorithm rankValue(researcherURI, list of topics)
  set expertRankValue to zero
  create temp expertise profile
  filter topics
  foreach topic in filtered topics list do
    get papers for this topic (using papers-to-topics dataset)
    foreach paper in papers list do
      if researcher is author do
        get publication impact as weight
        expertRankValue ← expertRankValue + publication impact
        add <topic, weight> pair to temporary expertise profile
      end if
    end
  end
  return expertRankValue
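A Python sketch of the ranking step follows, under stated assumptions: the topic list is taken as already filtered, a missing impact value falls back to a default weight of 1.0, and the dictionaries and researcher names are hypothetical. As in the pseudocode, a paper mapped to two listed topics is counted once per topic.

```python
def rank_value(researcher, topics, topic_to_papers, authors_of, impact,
               default_weight=1.0):
    """Return (rank value, temporary expertise profile) for one researcher."""
    expert_rank = 0.0
    temp_profile = {}
    for topic in topics:  # assumed to be filtered already
        for paper in topic_to_papers.get(topic, []):
            if researcher in authors_of.get(paper, ()):
                weight = impact.get(paper)
                if weight is None:
                    weight = default_weight
                expert_rank += weight
                temp_profile[topic] = temp_profile.get(topic, 0.0) + weight
    return expert_rank, temp_profile


# Hypothetical datasets.
TOPIC_TO_PAPERS = {"RDF": ["p1", "p2"], "Ontologies": ["p3"]}
AUTHORS_OF = {"p1": {"alice"}, "p2": {"alice", "bob"}, "p3": {"bob"}}
IMPACT = {"p1": 1.5, "p2": 2.0}  # p3 has no impact statistic

TOPICS = ["RDF", "Ontologies"]
scores = {
    name: rank_value(name, TOPICS, TOPIC_TO_PAPERS, AUTHORS_OF, IMPACT)[0]
    for name in ["alice", "bob"]
}
ranking = sorted(scores, key=lambda name: -scores[name])
print(scores, ranking)
```

Sorting researchers by the returned value gives the ranked list of experts for the input topics.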