Title: Yury Lifshits Yahoo! Research http://yury.name
1. Yury Lifshits, Yahoo! Research, http://yury.name
St. Petersburg CS Club December 2008
Future of Search
2. Outline
- Structured Search
- Yahoo! Work in Search
- SearchMonkey
- BOSS
- Research Agenda
3. Structured Search (work in progress)
4. Structured Search
Bring structured data to search users.
M.K. Bergman. The Deep Web: Surfacing Hidden Value. 2001.
5. Value Proposition
- Coverage
- Real-time data
- Semi-private data
- Structured queries
- Ordering and filtering results
- Straight-to-answers
6. User Interface: Query
- Search assist: Yahoo!
- Selector: LinkedIn, VKontakte.ru
- Multiple search buttons: Gmail
- Search tabs: Yahoo! / Google
7. User Interface: Results
- Federated page
- Facets
- Search transfer / search form
K.P. Yee, K. Swearingen, K. Li, M. Hearst. Faceted metadata for image search and browsing. CHI 2003.
Fernando Diaz. Aggregation of News Content Into Web Results. WSDM 2009.
http://glue.yahoo.com
http://au.alpha.yahoo.com
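The "Facets" item above can be illustrated with a minimal sketch: count how many results fall under each value of a field, then narrow the result list when the user picks a value. The records, the field names, and the helpers `facet_counts` and `filter_by_facet` are hypothetical and do not come from the talk.

```python
from collections import Counter

# Hypothetical result records; the field names are illustrative only.
RESULTS = [
    {"title": "Canon A590", "brand": "Canon", "type": "compact"},
    {"title": "Canon 450D", "brand": "Canon", "type": "dslr"},
    {"title": "Nikon D60", "brand": "Nikon", "type": "dslr"},
]


def facet_counts(results, field):
    """Count how many results carry each value of one facet field."""
    return Counter(r[field] for r in results if field in r)


def filter_by_facet(results, field, value):
    """Keep only the results whose facet field equals the selected value."""
    return [r for r in results if r.get(field) == value]


if __name__ == "__main__":
    print(facet_counts(RESULTS, "brand"))
    print([r["title"] for r in filter_by_facet(RESULTS, "type", "dslr")])
```

A results page would typically show the counts next to each facet value and re-run the filter on every click.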
10. Data Supply Chain
- Atomic fact
  - Flight, Event, Patent
- Data aggregator
  - US Patents, Amadeus/Sabre flights, Upcoming.com
- Domain search
  - Expedia, Spock
- General purpose search
  - Yahoo!, Google, Yandex, Baidu
11. Getting structured data
- Entity extraction
- Markup
- Feeds
- Search API (OpenSearch)
- OR
- Do a search transfer
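For the "Search API (OpenSearch)" and "search transfer" items, a rough sketch of how a general search engine might hand a query over to a data provider: fill the provider's OpenSearch-style URL template and send the user (or a request) to the resulting address. The template URL and the helper `fill_template` are assumptions for illustration; the sketch handles only plain `{name}` and optional `{name?}` placeholders, not namespaced parameters.

```python
import re
from urllib.parse import quote


def fill_template(template, **params):
    """Substitute {name} and {name?} placeholders in an OpenSearch-style URL template."""

    def replace(match):
        name = match.group(1)
        optional = match.group(2) == "?"
        if name in params:
            # Percent-encode the value so it is safe inside a URL.
            return quote(str(params[name]), safe="")
        if optional:
            return ""  # optional parameters may be left empty
        raise KeyError(f"missing required parameter: {name}")

    return re.sub(r"\{(\w+)(\??)\}", replace, template)


# Hypothetical provider template; example.com is a placeholder domain.
TEMPLATE = "http://example.com/search?q={searchTerms}&page={startPage?}"

if __name__ == "__main__":
    print(fill_template(TEMPLATE, searchTerms="flights to Boston"))
```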
12. Give Us Your Data For
- Traffic via search transfer
  - Firefox search box
- Better presentation in search
  - SearchMonkey
- Hosted search
  - BOSS Custom
- Showing your ads
  - Yahoo Local ATT
13. Yahoo! Work in Search
14. Slides by Paul Tarjan, Chief Technical Monkey (ptarjan@yahoo-inc.com)
Full version: http://www.slideshare.net/ptarjan/searchmonkey-presentation
15. What is SearchMonkey?
An open platform for using structured data to build more useful and relevant search results.
Before
After
16. Enhanced Result: Zagat
Key/Value Pairs or Abstract
Links
Image
17. Infobar: Wikipedia Preview
Summary
Blob
18. Creating an Infobar
- Infobar advantages
- Annotate someone else's site
- Use links and images from other domains
- Mash up info from multiple sites
- Affiliate / coupon links? Hmmm
- Can act on *, all websites
- But these apps can be annoying if poorly designed
- Key design principles
- Put something useful in the summary
- Be creative with the HTML
19. How to get data to SearchMonkey?
- Humans see
  - name
  - picture of a person
  - current job
  - industry, ...
- Computers see
  - an undifferentiated blob of HTML
- Can we make computers smarter?
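One way to picture the gap between what humans and computers see: if the page author marks up the fields, a program can pull key/value pairs out of the HTML instead of treating it as a blob. The sketch below is only an illustration under assumptions: the hCard-style class names (`fn`, `title`, `org`), the sample page, and the `ClassFieldExtractor` class are chosen for the example and are not SearchMonkey's actual extraction pipeline.

```python
from html.parser import HTMLParser

# hCard-style class names, used here only as an assumed vocabulary.
FIELDS = {"fn": "name", "title": "current job", "org": "company"}


class ClassFieldExtractor(HTMLParser):
    """Collect the text of elements whose class attribute names a known field."""

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._current = None  # field waiting for its text, if any

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        for cls in classes:
            if cls in FIELDS:
                self._current = FIELDS[cls]

    def handle_data(self, data):
        # Store the first non-empty text that follows a marked-up start tag.
        if self._current and data.strip():
            self.fields[self._current] = data.strip()
            self._current = None


# Made-up page fragment with the fields annotated by the author.
PAGE = """
<div class="vcard">
  <span class="fn">Jane Doe</span>
  <span class="title">Engineer</span> at
  <span class="org">Example Corp</span>
</div>
"""

if __name__ == "__main__":
    parser = ClassFieldExtractor()
    parser.feed(PAGE)
    print(parser.fields)
```

Without the class attributes the same parser would find nothing, which is the "undifferentiated blob" case from the slide.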
20. How does it work?
21. SearchMonkey Resources
- Main
  - http://developer.yahoo.com/searchmonkey
- Lists and forums
  - searchmonkey-developers@yahoogroups.com
  - http://suggestions.yahoo.com/searchmonkey
22. Vik Singh (Architect), Graham Mudd (Senior PMM)
23. What
BOSS: Build your Own Search Service
Open Yahoo!'s core search features via web services to let 3rd parties revolutionize search.
Unrestricted
24. What
- Unrestricted
- Unlimited queries
- Blend, re-order, discard
- Full presentation control
- Non-search apps OK
- Monetization: free, CPM, or ads
25. Why
- Barriers to entry are massive
- 300M, top talent, a prayer to get to basic parity
- No monopoly over great ideas
- Search anywhere
- Improve vertical quality w/ Web comprehensiveness
- Fragment the market, foster more players, choice, competition
- Yahoo! extends advertising reach; 3rd parties share revenue
26. Why
BOSS Distribution
Traditional Search Distribution
27. Tracks
API: A self-service, web services model for developers and start-ups to quickly build and deploy new search experiences.
CUSTOM: Working with 3rd parties to build a more relevant, brand/site-specific web search experience. This option is jointly built by Yahoo! and select partners.
ACADEMIC: Working with the following universities to allow for wide-scale research in the search field:
- UIUC
- CMU
- Stanford
- Purdue
- IIT Bombay
- MIT
- UMass
Interested in Custom? Email us: bosscustom@yahoo-inc.com
28. BOSS API v1
http://boss.yahooapis.com/ysearch/{vert}/v1/{q}
vert: web, news, images, spelling
@required: appid
@optional (Y!OS compliant): start, count, lang, region, format, callback, sites
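To make the URL pattern on this slide concrete, here is a small sketch that assembles a request address from a vertical, a query, and an application ID. It assumes that `appid` and the optional parameters travel as ordinary query-string arguments, which the slide does not state. The helper `build_boss_url` and the `YOUR_APP_ID` value are hypothetical. The code only builds a string and sends no request.

```python
from urllib.parse import quote, urlencode

BASE = "http://boss.yahooapis.com/ysearch"
VERTICALS = {"web", "news", "images", "spelling"}  # as listed on the slide


def build_boss_url(vert, query, appid, **optional):
    """Assemble a BOSS v1-style URL; assumes parameters go in the query string."""
    if vert not in VERTICALS:
        raise ValueError(f"unknown vertical: {vert}")
    params = {"appid": appid, **optional}
    # The query is part of the path, so it is percent-encoded as a path segment.
    return f"{BASE}/{vert}/v1/{quote(query, safe='')}?{urlencode(params)}"


if __name__ == "__main__":
    # YOUR_APP_ID is a placeholder, not a real key.
    print(build_boss_url("web", "future of search", appid="YOUR_APP_ID",
                         count=10, format="json"))
```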
29. BOSS Mashup Framework
Python (v2.5) library
BOSS Search SDK plus SQL for remixing arbitrary XML/JSON sources
Loosely functional programming paradigm
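The idea of "SQL for remixing" JSON sources can be sketched without the framework itself. The example below does not use the BOSS Mashup Framework API; it substitutes Python's standard `sqlite3` module to join two made-up JSON feeds, discard results with no match, and re-order the rest. The data, the table layout, and the `remix` helper are all assumptions for illustration.

```python
import json
import sqlite3

# Two made-up JSON sources standing in for search results and an external feed.
SEARCH_JSON = """[
  {"url": "http://a.example", "title": "A", "rank": 1},
  {"url": "http://b.example", "title": "B", "rank": 2},
  {"url": "http://c.example", "title": "C", "rank": 3}
]"""
VOTES_JSON = """[
  {"url": "http://b.example", "votes": 40},
  {"url": "http://c.example", "votes": 7}
]"""


def remix(search_json, votes_json):
    """Join two JSON sources in an in-memory SQLite database and re-rank them."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE search (url TEXT, title TEXT, rank INTEGER)")
    db.execute("CREATE TABLE votes (url TEXT, votes INTEGER)")
    db.executemany("INSERT INTO search VALUES (:url, :title, :rank)",
                   json.loads(search_json))
    db.executemany("INSERT INTO votes VALUES (:url, :votes)",
                   json.loads(votes_json))
    # The inner join discards results without votes; ORDER BY re-orders the rest.
    rows = db.execute(
        "SELECT s.title, v.votes FROM search s "
        "JOIN votes v ON s.url = v.url "
        "ORDER BY v.votes DESC, s.rank"
    ).fetchall()
    db.close()
    return rows


if __name__ == "__main__":
    print(remix(SEARCH_JSON, VOTES_JSON))
```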
30. BMF on Google App Engine
Ported enhanced version of BMF to GAE platform
http://zooie.wordpress.com/2008/08/04/yahoo-boss-google-app-engine-integrated/
Easiest way to deploy a BOSS application online
31. Examples
http://www.4hoursearch.com
http://123people.com
Mashable! contest for BOSS search engines: http://mashable.com/boss/
32. BOSS Custom for TechCrunch
33. TechCrunch Network Search
- CrunchBase, Posts, Web
- Sort by time / relevance
- Enhanced results
- Domain-specific facets
- Yahoo! sponsored search
- Real-time indexing
- Special results
34. Research Agenda
35. Structured Search
- Analysis of search demand
  - Intent classification
  - General search vs. vertical
- Incentives in data supply
  - Push real-time indexing
- Search user interface
  - One box vs. multi-box
  - General vs. vertical
- Deciding search transfer
  - When?
  - To whom?
36. Key Scientific Challenges (draft)
http://research.yahoo.com/ksc
- Search intent
- Quality metrics
- Web mining
- Multilingual IR
- Nextgen search
- Synthesized result pages
- World knowledge
A.Z. Broder. Taxonomy of web search. SIGIR 2002.
37. More Problems
- Discovery search
- Web search vs. asking people
- Event search
38. Thanks for your attention!
- Yury Lifshits
- http://yury.name
- yury@yury.name