1
TREC Conference Blog Track
  • Mark Merrill

2
Tracks 2006
  • Opinion Track
  • Focuses on the opinionated nature of blogs
  • Open Track
  • Goal: to determine a suitable track for 2007

3
Test Collection Statistics
  • Quantity                            Value
  • Number of Unique Blogs              100,649
  • RSS                                 62%
  • Atom                                38%
  • First Feed Crawl                    06/12/2005
  • Last Feed Crawl                     21/02/2006
  • Number of Feed Fetches              753,681
  • Number of Permalinks                3,215,171
  • Number of Homepages                 324,880
  • Total Compressed Size               25 GB
  • Total Uncompressed Size             148 GB
  • Feeds (Uncompressed)                38.6 GB
  • Permalinks (Uncompressed)           88.8 GB
  • Homepages (Uncompressed)            20.8 GB

4
Goal of Opinion Track
  • Locate posts that express an opinion about a given target
  • The target does not have to be the same as the topic of the post
  • The opinion can appear in the post itself or in comments to the post

5
Topics
  • 50 topics
  • Follow the title, description, narrative format
  • Created from queries sent to commercial blog search engines
  • The title is the formal query
  • The description and narrative fields are the assessor's interpretations

6
Sample Topic
  • Number: 871
  • Title: cindy sheehan
  • Description: What has been the reaction to Cindy Sheehan and the demonstrations she has been involved in?
  • Narrative: Any favorable or unfavorable opinions of Cindy Sheehan are relevant. Reactions to the anti-war demonstrations she has organized or participated in are also relevant.

7
Assessment Procedure
  • Participants
  • Could create queries manually or automatically
  • Could submit up to 5 runs
  • A title-only run was compulsory
  • Each run consisted of the top 1,000 opinionated docs
  • Were encouraged to submit manual runs
  • Were asked to prioritize their runs

8
Formation of Runs
  • Standard retrieval units: docs from the permalinks component of the collection
  • But participants were free to use other components
  • Pools formed
  • The 2 highest-priority runs were pooled to depth 100
  • All other runs were pooled to depth 10
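
A minimal sketch of how such a pool could be assembled, assuming each run is a mapping from topic ID to a ranked list of document IDs (the data layout and function name are illustrative, not the official TREC tooling):

```python
# Merge submitted runs into a per-topic judgment pool: top-100 documents
# from the two highest-priority runs, top-10 from every other run.
def form_pool(runs, high_priority_runs, deep=100, shallow=10):
    """runs: {run_id: {topic_id: [doc_id, ...] in rank order}}."""
    pool = {}  # topic_id -> set of doc IDs sent to the assessors
    for run_id, topics in runs.items():
        depth = deep if run_id in high_priority_runs else shallow
        for topic_id, ranked in topics.items():
            pool.setdefault(topic_id, set()).update(ranked[:depth])
    return pool
```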

9
Ranking System
  • -1  Not Judged (offensive URL or header)
  • 0   Not Relevant
  • 1   Relevant, but no opinion expressed
  • 2   Contains an explicitly expressed negative opinion
  • 3   Contains both positive and negative opinions
  • 4   Contains an explicitly expressed positive opinion
  • Ambiguous opinions are assigned label 3
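
For scoring, the graded labels are usually collapsed into two binary views: topical relevance (label 1 or higher) and opinionatedness (label 2 or higher). A small sketch of that mapping, with illustrative helper names:

```python
# Collapse the graded judgment scale into the two binary notions used when
# computing topic-relevance and opinion-finding scores.
def is_relevant(label: int) -> bool:
    return label >= 1      # 1 = relevant; 2-4 = opinionated and therefore also relevant

def is_opinionated(label: int) -> bool:
    return label >= 2      # 2 = negative, 3 = mixed, 4 = positive
```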

10
Overview of Results
  • 14 groups took part
  • 57 submitted runs
  • 53 automatic
  • 4 manual
  • Metrics used
  • MAP (Mean Average Precision)
  • R-Precision
  • Binary Preference
  • Precision at 10
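
For reference, a compact sketch of two of these measures for a single topic; the official TREC scores are computed with the trec_eval tool, so this is only an illustration:

```python
# Precision at 10 and per-topic average precision for one ranked result list.
def precision_at_k(ranked_docs, relevant, k=10):
    return sum(1 for d in ranked_docs[:k] if d in relevant) / k

def average_precision(ranked_docs, relevant):
    hits, score = 0, 0.0
    for rank, d in enumerate(ranked_docs, start=1):
        if d in relevant:
            hits += 1
            score += hits / rank      # precision at the rank of each relevant doc
    return score / len(relevant) if relevant else 0.0

# MAP is the mean of average_precision over all 50 topics.
```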

11
Results of Automatic Retrieval, Title-Only Runs
  • Group                               MAP      R-prec   bPref    P@10
  • Univ. of Illinois at Chicago        0.1885   0.2771   0.2693   0.5120
  • Indiana Univ.                       0.1872   0.2562   0.2606   0.4340
  • Tsinghua Univ.                      0.1798   0.2647   0.2563   0.3600
  • Univ. of Amsterdam                  0.1795   0.2771   0.2625   0.4640
  • CMU (Callan)                        0.1576   0.2455   0.2458   0.3580
  • Univ. of California, Santa Cruz     0.1549   0.2355   0.2264   0.4380
  • Univ. of Maryland                   0.1547   0.2106   0.2256   0.3360
  • Univ. of Maryland B.C.              0.0764   0.1307   0.1202   0.2140
  • Univ. of Arkansas at Little Rock    0.0715   0.1393   0.1357   0.3320
  • Univ. of Pisa                       0.0700   0.1502   0.1535   0.2880
  • Chinese Academy of Sciences         0.0621   0.1134   0.1553   0.2000
  • National Institute of Informatics   0.0466   0.1030   0.0851   0.3140
  • Robert Gordon Univ.                 0.0000   0.0004   0.0003   0.0000

12
Results of All Runs
  • Group                               Fields   MAP      R-prec   bPref    P@10
  • Indiana Univ.                       TDN      0.2052   0.2881   0.2934   0.4680
  • Indiana Univ.                       TDN      0.2019   0.2934   0.2824   0.4500
  • Univ. of Maryland                   TD       0.1887   0.2421   0.2573   0.3780
  • Univ. of Illinois at Chicago        T        0.1885   0.2771   0.2693   0.5120
  • Tsinghua Univ.                      T        0.1798   0.2647   0.2563   0.3600
  • Univ. of Amsterdam                  T        0.1795   0.2771   0.2625   0.4640
  • CMU (Callan)                        T        0.1576   0.2455   0.2458   0.3580
  • Univ. of California, Santa Cruz     T        0.1549   0.2355   0.2264   0.4380
  • Fudan Univ.                         TDN      0.1179   0.1860   0.1920   0.2940
  • Univ. of Pisa                       TD       0.0873   0.1765   0.1620   0.3400
  • Univ. of Maryland B.C.              T        0.0764   0.1307   0.1202   0.2140
  • Univ. of Arkansas at Little Rock    T        0.0715   0.1393   0.1357   0.3320
  • Chinese Academy of Sciences         T        0.0621   0.1134   0.1553   0.2000
  • National Institute of Informatics   T        0.0466   0.1030   0.0851   0.3140
  • Robert Gordon Univ.                 T        0.0000   0.0004   0.0003   0.0000

13
Opinion vs. Topic MAPs
14
MAPs vs. Runs
15
Relevance Assessments
  • Relevance        Label   Docs     % of Docs
  • Not Judged       -1      0        0
  • Not Relevant     0       47,491   70.5
  • Relevant         1       8,361    12.4
  • Negative         2       3,707    5.5
  • Mixed            3       3,664    5.4
  • Positive         4       4,159    6.2
  • (Total)          -       67,382   100

16
Impact of Spam
  • Attempted to determine the degree of spam infiltration
  • 17,958 splog feeds in the Blog06 collection
  • These generated 509,137 posts
  • Less than 2% of splog posts were pooled
  • Most of the assessed splog docs were not opinionated

17
Occurrence of Splogs
  • Relevance        Splog Docs
  • Not Judged       0
  • Not Relevant     8,348
  • Relevant         1,004
  • Negative         191
  • Mixed            160
  • Positive         290
  • (Total)          9,993

18
Impact of Spam
  • Topic                  Avg. Retrievals
  • 899 Cholesterol        564
  • 893 Zyrtec             292
  • 854 Ann Coulter        34
  • 871 Cindy Sheehan      43

19
Impact of Spam
  • Many splogs are related to medical topics
  • Very few splogs are related to people

20
Topics vs. Spam
21
Rank (by 50) vs. Spam Docs
22
Impact of Spam
  • Average number of spam docs ranked in the top 10: 1.3
  • Spam has little effect on the retrieval systems

23
Polarity
  •           Positive Recall   Negative Recall
  • Best      0.7814            0.7754
  • Median    0.3951            0.4177
  • The best runs perform about equally on positive and negative opinions
  • Median runs are slightly more likely to find negative opinions

24
Topic Analysis
  • Median number of relevant docs per topic: 329
  • Median number of opinionated docs per topic: 182
  • Opinionated / Relevant: 67%

25
Topic Analysis
  • High scoring median performance for named
    entities
  • Ann Coulter, Heineken, Netflix
  • Low scoring median performance for high-level
    concepts
  • Cholesterol, Business Intelligence Resources

26
Approaches
  • Most implementations used a 2-part system
  • Ranked based on relevance
  • Reweighted for opinions
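
A minimal sketch of this two-stage idea, assuming a relevance score from the first-pass retrieval and an opinion score in [0, 1] from any of the strategies on the following slides (the linear combination and its weight are illustrative; participants used a variety of schemes):

```python
# Generic two-stage reranking: retrieve by topical relevance first, then
# boost documents that an opinion scorer considers opinionated.
def rerank_for_opinion(ranked_docs, relevance_score, opinion_score, alpha=0.7):
    """ranked_docs: doc IDs from the relevance ranking; scores are dicts keyed by doc ID."""
    combined = {d: alpha * relevance_score[d] + (1 - alpha) * opinion_score[d]
                for d in ranked_docs}
    return sorted(ranked_docs, key=lambda d: combined[d], reverse=True)
```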

27
Opinion Reweighting
  • 3 Strategies
  • Dictionary-based
  • Text classification
  • Shallow linguistic approaches

28
Dictionary-based
  • A list of terms with semantic-orientation values is used to rank docs
  • Examples: like, love, hate, decent, awful
  • Scoring is based on the frequency of these keywords
  • Success varied
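
A minimal sketch of a dictionary-based opinion score, assuming a hand-built lexicon of opinion-bearing terms (the word list, normalization, and names are illustrative only):

```python
# Score a document by the frequency of opinion-bearing keywords, normalized
# by length so that long posts are not favoured simply for being long.
OPINION_LEXICON = {"like", "love", "hate", "decent", "awful"}  # illustrative subset

def dictionary_opinion_score(text: str) -> float:
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    return sum(1 for t in tokens if t in OPINION_LEXICON) / len(tokens)
```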

29
Text Classification
  • Used training data from sources known to be opinionated and from unbiased sources
  • Opinionated: product reviews
  • Unbiased: online encyclopedias
  • A classifier was trained and used to estimate the degree of opinionated content
  • Most groups favored Support Vector Machines
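
A minimal sketch of that idea with scikit-learn, using a few review-like sentences as opinionated examples and encyclopedia-like sentences as neutral ones; the training data, features, and pipeline are assumptions for illustration rather than any participant's actual system:

```python
# Train a linear SVM to estimate how opinionated a piece of text is, using
# opinionated (e.g. product reviews) vs. neutral (e.g. encyclopedia) training text.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

opinionated = [
    "I absolutely love this phone, best purchase I have made in years.",
    "The service was awful and I would never go back.",
]
neutral = [
    "A telephone is a device that permits two or more users to converse.",
    "The restaurant industry comprises establishments that prepare and serve food.",
]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
clf.fit(opinionated + neutral, [1] * len(opinionated) + [0] * len(neutral))

# Larger decision values suggest more opinionated text; such scores can feed
# the opinion-reweighting stage of the two-part systems described earlier.
scores = clf.decision_function(["This movie was dreadful and boring."])
```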

30
Shallow Linguistic Approach
  • Used frequency of pronouns or adjectives as
    indicators of opinionated content
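
A minimal sketch of such a surface cue, counting personal pronouns as a proxy for subjective, first-person writing (the pronoun list is illustrative; adjectives could be counted the same way with a part-of-speech tagger):

```python
# Density of personal pronouns as a cheap indicator of opinionated content.
PRONOUNS = {"i", "me", "my", "mine", "we", "us", "our", "you", "your"}

def pronoun_density(text: str) -> float:
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    return sum(1 for t in tokens if t in PRONOUNS) / len(tokens)
```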

31
Baseline Systems
  • Organizers tested standard off-the-shelf IR systems
  • Purpose: to compare against opinion-searching models

32
MAPs of Various Systems
  • System                  Fields   Opinion MAP   Relevance MAP
  • Prise v3                TD       0.1858        0.2908
  • Terrier v1.0.2          T        0.1696        0.2703
  • Terrier v1.0.2          TD       0.2115        0.3151
  • Terrier v1.0.2          TDN      0.1992        0.2892
  • Terrier v1.0.2          TN       0.1655        0.2402
  • Indiana Univ.           -        0.2052        0.2983
  • Robert Gordon Univ.     -        0.0000        0.0001

33
Baseline Systems
  • The MAPs suggest that
  • The description field is useful for retrieval
  • The narrative field is not useful

34
Open Task
  • Four proposals
  • Identify spam blogs in collection
  • Identification of emerging trends
  • Bloggers' sense-of-self and how it changes over time, analyzed from blog posts
  • Identify whether two blog posts discuss the same
    topic

35
Identify Splogs
  • NEC Laboratories America
  • 2 tasks
  • Identification of splogs with fixed training and
    test sets
  • Adaptive splog identification task

36
Identify Splogs
  • University of Maryland Baltimore County and Johns Hopkins University
  • Collection is split in time, first part for
    training and second for testing
  • Identify splogs, suggest type of spamming method
  • Evaluate contribution of spam detection and
    removal

37
Emerging Trends
  • Robert Gordon University
  • Given a set of topics and training intervals,
    predict hot topics during a test interval
  • Potential indications of hot topic
  • Volume of discussion (number or length of posts)
  • Relation between topic and other topics

38
Sense-of-Self
  • CSIRO ICT Centre
  • Identify bloggers who display substantial changes
    in sense-of-self over time
  • Identify posts which contribute most to tracking
    these changes

39
Post Comparison
  • National Institute of Informatics, Japan
  • Identify whether two blog posts discuss the same
    topic
  • Given a set of pairs of blog posts, return a decision on whether the two posts are linked
  • Applications: summarization, related-post suggestion

40
Blog Track 2007
  • Opinion Track remains
  • Blog distillation task
  • Subtask: determining polarity

41
Blog Track 2008
  • 4 Tasks
  • Baseline adhoc retrieval
  • Opinion Finding
  • Polarised Opinion Finding
  • Blog Finding (Distillation) task