Spatial Variation in Search Engine Queries - PowerPoint PPT Presentation

About This Presentation

Title:

Spatial Variation in Search Engine Queries

Description:

Information is becoming increasingly geographic as it becomes ... Star Tribune (Minneapolis) 1.289576. Houston Chronicle. 0.719161. Washington Post. 0.601810 ... – PowerPoint PPT presentation

Slides: 28

Provided by: lb591

Transcript and Presenter's Notes

Title: Spatial Variation in Search Engine Queries

1
Spatial Variation in Search Engine Queries

Lars Backstrom, Jon Kleinberg, Ravi Kumar and
Jasmine Novak

2
Introduction

Information is becoming increasingly geographic
as it becomes easier to geotag all forms of data.
What sorts of questions can we answer with this
geographic data?
Query logs as case study here
Data is noisy. Is there enough signal? How can
we extract it?
Simple methods aren't quite good enough; we need
a model of the data.

3
Introduction

Many topics have geographic focus
Sports, airlines, utility companies, attractions
Our goal is to identify and characterize these
topics
Find the center of geographic focus for a topic
Determine if a topic is tightly concentrated or
spread diffusely geographically
Use Yahoo! query logs to do this
Geolocation of queries based on IP address

4
Red Sox
5
Bell South
6
Comcast.com
7
Grand Canyon National Park
8
Outline

Probabilistic, generative model of queries
Results and evaluation
Adding temporal information to the model
Modeling more complex geographic query patterns
Extracting the most distinctive queries from a
location

9
Probabilistic Model

Consider some query term t
e.g. red sox
For each location x, a query coming from x has
probability $p_x$ of containing t
Our basic model focuses on terms with a central
hot-spot cell z.
Probability highest at z
$p_x$ is a decreasing function of $\|x - z\|$
We pick a simple family of functions
A query coming from x at a distance d from the
term's center has probability $p_x = C d^{-\alpha}$
Ranges from non-local ($\alpha = 0$) to extremely local
(large $\alpha$)
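
The power-law family above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name, the `min_distance` clamp (which keeps the value finite at the center itself) and the cap at 1 are assumptions of this sketch.

```python
def query_probability(distance, C, alpha, min_distance=1.0):
    """Probability that a query issued `distance` away from the term's center
    contains the term, under the model p = C * d**(-alpha)."""
    # Assumption of this sketch: clamp small distances so the power law
    # does not blow up when the query comes from the center cell.
    d = max(distance, min_distance)
    # Cap at 1 so the result is always a valid probability.
    return min(1.0, C * d ** (-alpha))
```

With `alpha = 0` the probability is the constant `C` everywhere (a non-local term); larger values of `alpha` make it fall off faster with distance.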

10
Algorithm

Maximum likelihood approach allows us to evaluate
a choice of center, C and $\alpha$
Simple algorithm finds parameters which maximize
likelihood
For a given center, likelihood is unimodal and
simple search algorithms find optimal C and $\alpha$
Consider all centers on a coarse mesh, optimize
C and $\alpha$ for each center
Find best center, consider finer mesh
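
A rough sketch of the coarse-to-fine search follows. It assumes the data arrives as per-cell counts `(x, y, total_queries, queries_with_term)`, and it replaces the "simple search" over C and alpha with an exhaustive scan of caller-supplied candidate values; both choices, and all function names, are assumptions of this sketch rather than details from the slides.

```python
import math

def log_likelihood(cells, center, C, alpha, min_d=1.0):
    """cells: iterable of (x, y, total_queries, queries_with_term)."""
    cx, cy = center
    ll = 0.0
    for x, y, t, s in cells:
        d = max(math.hypot(x - cx, y - cy), min_d)  # clamp: assumption of this sketch
        p = min(max(C * d ** (-alpha), 1e-12), 1 - 1e-12)
        # s queries contain the term, t - s do not.
        ll += s * math.log(p) + (t - s) * math.log(1 - p)
    return ll

def fit_for_center(cells, center, Cs, alphas):
    """Best (log-likelihood, C, alpha) for one fixed center, by exhaustive scan."""
    return max((log_likelihood(cells, center, C, a), C, a)
               for C in Cs for a in alphas)

def fit_center(cells, bounds, Cs, alphas, steps=6, rounds=3):
    """Coarse-to-fine mesh search; returns (log-likelihood, (cx, cy), C, alpha)."""
    x0, x1, y0, y1 = bounds
    best = None
    for _ in range(rounds):
        for i in range(steps + 1):
            for j in range(steps + 1):
                center = (x0 + (x1 - x0) * i / steps,
                          y0 + (y1 - y0) * j / steps)
                ll, C, a = fit_for_center(cells, center, Cs, alphas)
                if best is None or ll > best[0]:
                    best = (ll, center, C, a)
        # Shrink the window to one mesh spacing around the best center so far.
        bx, by = best[1]
        w, h = (x1 - x0) / steps, (y1 - y0) / steps
        x0, x1, y0, y1 = bx - w, bx + w, by - h, by + h
    return best
```

Because the slides state that the likelihood is unimodal for a fixed center, the exhaustive scan in `fit_for_center` could be swapped for a faster one-dimensional search without changing the outer mesh loop.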

11
$\alpha = 1.257$
12
$\alpha = 0.931$
13
$\alpha = 0.690$
14
Comcast.com: $\alpha = 0.24$
15
More Results (newspapers)

Term centers land correctly
Small $\alpha$ indicates nationwide appeal
Large $\alpha$ indicates a local paper

16
More Results
17
Evaluation

Consider terms with natural correct centers
Baseball teams
Large US Cities
We compare with three other ways to find center
Center of gravity
Median
Most likely grid cell
Compute baseline rate for all queries
Compute likelihood of observations at
each 0.1 x 0.1 grid cell
Pick cell with lowest likelihood of being from
baseline model
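
The first two comparison methods can be sketched as below. The input format `(x, y, weight)`, with weight being the number of queries containing the term, is an assumption, as is taking the median separately along each coordinate; the slides do not say which notion of median was used.

```python
def center_of_gravity(points):
    """points: list of (x, y, weight); returns the weighted mean location."""
    total = sum(w for _, _, w in points)
    return (sum(x * w for x, _, w in points) / total,
            sum(y * w for _, y, w in points) / total)

def weighted_median(values_and_weights):
    """Smallest value at which the running weight reaches half the total."""
    items = sorted(values_and_weights)
    half = sum(w for _, w in items) / 2
    running = 0
    for value, weight in items:
        running += weight
        if running >= half:
            return value

def median_center(points):
    """Coordinate-wise weighted median (an assumption of this sketch)."""
    return (weighted_median([(x, w) for x, _, w in points]),
            weighted_median([(y, w) for _, y, w in points]))
```

A handful of far-away queries pulls the center of gravity away from the main cluster, while the median stays put, which is one reason to compare both.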

18
Baseball Teams and Cities

Our algorithm outperforms mean and median
Simpler likelihood method does better on baseball
teams
Our model must fit all nationwide data
Makes it less exact for short distances

19
Temporal Extension

We observe that the locality of some queries
changes over time
Query centers may move
Query dispersion may change (usually becoming
less local)
We examine a sequence of 24-hour time slices,
offset by one hour from each other
24 hours gives us enough data
Mitigates diurnal variation, as each slice
contains all 24 hours
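
One way to build such overlapping slices is sketched here. The function name, the choice to align slice starts to the hour, and the rule for where the last slice ends are assumptions of this sketch.

```python
from datetime import datetime, timedelta

def sliding_slices(timestamps, window=timedelta(hours=24), step=timedelta(hours=1)):
    """Return a list of (slice_start, timestamps falling in [start, start + window))."""
    ts = sorted(timestamps)
    if not ts:
        return []
    # Align the first slice to the top of the hour.
    start = ts[0].replace(minute=0, second=0, microsecond=0)
    slices = []
    # Stop once a full window would extend more than one step past the data.
    while start + window <= ts[-1] + step:
        end = start + window
        slices.append((start, [t for t in ts if start <= t < end]))
        start += step
    return slices
```

Since every slice spans a full day, each one covers every hour of the day once, so day/night differences in query volume affect all slices equally.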

20
Hurricane Dean

Biggest hurricane of 2007
Computed optimal parameters for each time slice
Added smoothing term
Cost of moving from A to B in consecutive time
slices is proportional to $|A-B|^2$
Center tracks hurricane, $\alpha$ decreases as storm
hits nationwide news
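
The slides do not say how the smoothed sequence of centers is computed. One plausible approach, shown purely as an assumption, is dynamic programming over a set of candidate centers per slice: maximize the summed per-slice log-likelihood minus `weight` times the squared distance between consecutive centers. The names `candidates`, `scores` and `weight` are hypothetical.

```python
def smooth_path(candidates, scores, weight):
    """candidates[t]: list of (x, y) centers for slice t.
    scores[t][i]: log-likelihood of candidate i in slice t.
    Returns one center per slice maximizing
    sum(scores) - weight * sum(squared jumps between consecutive centers)."""
    best = [list(scores[0])]   # best[t][j]: best total ending at candidate j
    back = []                  # back[t-1][j]: predecessor index for candidate j
    for t in range(1, len(candidates)):
        row, ptr = [], []
        for j, (xj, yj) in enumerate(candidates[t]):
            options = [
                best[t - 1][i] - weight * ((xi - xj) ** 2 + (yi - yj) ** 2)
                for i, (xi, yi) in enumerate(candidates[t - 1])
            ]
            i_best = max(range(len(options)), key=options.__getitem__)
            row.append(options[i_best] + scores[t][j])
            ptr.append(i_best)
        best.append(row)
        back.append(ptr)
    # Trace the best final candidate back to the first slice.
    j = max(range(len(best[-1])), key=best[-1].__getitem__)
    path = [j]
    for ptr in reversed(back):
        j = ptr[j]
        path.append(j)
    path.reverse()
    return [candidates[t][i] for t, i in enumerate(path)]
```

With `weight = 0` each slice simply picks its own best center; a larger weight suppresses jumps that are supported by only a small likelihood gain.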

21
Multiple Centers

Not all queries fit the one-center model
Washington may mean the city or the state
Cardinals might mean the football team, the
baseball team, or the bird
Airlines have multiple hubs

We extend our algorithm to locate multiple
centers, each with its own C and $\alpha$
Locations use the highest probability from any
center
To optimize
Start with K random centers, optimize with
1-center algorithm
Assign each point to the center giving it highest
probability
Re-optimize each center for only the points
assigned to it
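
The assign-then-refit loop resembles k-means and can be sketched as follows. The one-center fit is passed in as a callable; `toy_fit` below is only a stand-in that moves the center to the weighted mean of its cells and keeps C and alpha fixed, so it is not the maximum-likelihood fit from the slides. All names and the cell format `(x, y, total_queries, queries_with_term)` are assumptions.

```python
import math

def power_law(model, x, y, min_d=1.0):
    """model: (cx, cy, C, alpha); probability of the term at location (x, y)."""
    cx, cy, C, alpha = model
    d = max(math.hypot(x - cx, y - cy), min_d)
    return min(1.0, C * d ** (-alpha))

def fit_multiple(cells, models, fit_one, rounds=10):
    """Alternate between assigning cells to centers and refitting each center."""
    for _ in range(rounds):
        groups = [[] for _ in models]
        for cell in cells:
            x, y = cell[0], cell[1]
            # Each location uses the center that gives it the highest probability.
            k = max(range(len(models)), key=lambda i: power_law(models[i], x, y))
            groups[k].append(cell)
        # Refit each center on its own cells; keep the old model if it got none.
        new_models = [fit_one(g, m) if g else m for g, m in zip(groups, models)]
        if new_models == models:
            break
        models = new_models
    return models

def toy_fit(group, old):
    """Stand-in for the one-center fit: weighted mean location, same C and alpha."""
    w = sum(s for _, _, _, s in group) or 1
    cx = sum(x * s for x, _, _, s in group) / w
    cy = sum(y * s for _, y, _, s in group) / w
    return (cx, cy, old[2], old[3])
```

As with k-means, the result depends on the starting centers, so running from several random starts and keeping the best-scoring result is a reasonable precaution.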

22
United Airlines
23
Spheres of influence
24
Spheres of Influence

Each baseball team assigned a color
A team with N queries in a cell gets NC votes
for its color
Map generated by taking weighted average of colors
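
The color blending for one cell can be sketched as a weighted average of RGB values. How the vote weight is derived from the query count N is not recoverable from the transcript, so this sketch takes the votes as given; the function name and the RGB representation are assumptions.

```python
def blend_colors(votes, colors):
    """votes: {team: vote weight in this cell}; colors: {team: (r, g, b)}.
    Returns the vote-weighted average color, or None if there are no votes."""
    total = sum(votes.values())
    if total == 0:
        return None
    return tuple(
        round(sum(votes[team] * colors[team][ch] for team in votes) / total)
        for ch in range(3)
    )
```

A cell dominated by one team comes out close to that team's color, while contested cells show a visible mix.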

25
Distinctive Queries

For each term and location
Find baseline rate p of term over entire map
Location has t total queries, s of them with term
Probability given baseline rate is $p^s (1-p)^{t-s}$
For each location, we find the highest deviation
from the baseline rate, as measured by the
baseline probability
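
The scoring step can be sketched as below, working in log space to avoid underflow for large t. The function names and the dictionary inputs are assumptions, and the sketch uses the expression from the slide as written, without a binomial coefficient.

```python
import math

def baseline_log_prob(s, t, p):
    """log of p**s * (1 - p)**(t - s): the probability of seeing s term queries
    among t total queries if the location followed the baseline rate p."""
    return s * math.log(p) + (t - s) * math.log(1 - p)

def most_distinctive(local_counts, t, baseline):
    """local_counts: {term: s at this location}; baseline: {term: map-wide rate p}.
    Returns the term whose local count is least probable under the baseline."""
    return min(local_counts,
               key=lambda term: baseline_log_prob(local_counts[term], t, baseline[term]))
```

For example, with 1000 queries at a location, a term with baseline rate 0.01 that appears 100 times scores far lower than a term with rate 0.05 that appears 50 times, so the first is reported as more distinctive.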

26
(No Transcript)
27
Conclusions and Future Work

Large-scale query log data, combined with IP
location, contains a wealth of geo-information
Combining geographic with temporal
Spread of ideas
Long-term trends
Using spatial data to learn more about regions
e.g. urban vs. rural
