Title: An Introduction to Search Engine Algorithms
1An Introduction to Search Engine Algorithms
2The Search Engines Started By Looking Primarily
at Keywords
3So... We Spammed Keywords
Digital cameras, Digital cameras, Digital
cameras, Digital cameras, Digital cameras,
Digital cameras, Digital cameras, Digital
cameras, Digital cameras, Digital cameras,
Digital cameras, Digital cameras, Digital
cameras, Digital cameras, Digital cameras,
Digital cameras, Digital cameras, Digital
cameras, Digital cameras, Digital cameras,
Digital cameras, Digital cameras, Digital
cameras, Digital cameras, Digital cameras,
Digital cameras, Digital cameras, Digital
cameras, Digital cameras, Digital cameras,
Digital cameras, Digital cameras, Digital
cameras, Digital cameras, Digital cameras,
Digital cameras, Digital cameras, Digital
cameras, Digital cameras, Digital cameras,
Digital cameras, Digital cameras, Digital
cameras, Digital cameras, Digital cameras,
Digital cameras, Digital cameras
4Then They Used Specific Metrics for Keyword Usage
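One plausible example of such a metric is keyword density: the share of a page's words taken up by the target phrase. The sketch below is an illustration only; the function name `keyword_density`, the tokenization rule, and the sample strings are assumptions, not a formula any search engine published.

```python
import re

def keyword_density(text: str, phrase: str) -> float:
    """Fraction of the words in `text` covered by occurrences of `phrase`.

    Hypothetical metric for illustration only.
    """
    words = re.findall(r"[a-z0-9]+", text.lower())
    target = re.findall(r"[a-z0-9]+", phrase.lower())
    if not words or not target:
        return 0.0
    n = len(target)
    hits, i = 0, 0
    while i <= len(words) - n:
        if words[i:i + n] == target:
            hits += 1
            i += n  # skip past the match so occurrences never overlap
        else:
            i += 1
    return hits * n / len(words)

# Assumed sample pages: one stuffed, one written naturally.
stuffed = "Digital cameras, Digital cameras, Digital cameras"
natural = "We review digital cameras and lenses for travel photographers"
stuffed_score = keyword_density(stuffed, "digital cameras")
natural_score = keyword_density(natural, "digital cameras")
```

A metric like this rewards the stuffed page over the natural one, which is why it was so easy to spam.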
5And We Spammed Those
6In 1998, Google PageRank Arrived on the Scene
7PageRank is a Rough Measure of the Authority and
Weight a Web Page Carries Based on All of the
Links Pointing to It
8The Algorithm Uses Markov Chains to
Help Calculate Scoring & Rankings and is
Unique Because it's the First to Measure Links
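The Markov chain view can be sketched as a "random surfer" who follows a link with probability d and jumps to a random page otherwise; a page's score is the long-run share of time the surfer spends on it. The code below is a minimal sketch of that idea using the normalized textbook form, not Google's production algorithm. The function name `pagerank`, the four-page `toy_web`, and the handling of pages without outbound links are assumptions made for illustration.

```python
def pagerank(links: dict[str, list[str]], d: float = 0.85, iters: int = 50) -> dict[str, float]:
    """Power iteration over the random-surfer Markov chain (illustrative sketch)."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iters):
        # Every page receives the random-jump share (1 - d) / n.
        new = {p: (1.0 - d) / n for p in pages}
        for p in pages:
            targets = links.get(p, [])
            if targets:
                # A page splits its damped score evenly across its outbound links.
                share = d * rank[p] / len(targets)
                for t in targets:
                    new[t] += share
            else:
                # Assumption: a page with no links spreads its score over all pages.
                for t in pages:
                    new[t] += d * rank[p] / n
        rank = new
    return rank

# Hypothetical four-page web in which most pages link to "hub".
toy_web = {"a": ["hub"], "b": ["hub"], "c": ["hub", "a"], "hub": ["a"]}
ranks = pagerank(toy_web)
best_page = max(ranks, key=ranks.get)
```

In this toy graph the page with the most inbound links ends up with the highest score, and the scores sum to 1 because they form a probability distribution.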
9And... Of Course... We Spam the Hell Out Of It
10We Buy Links from High PageRank Domains
11We Build Link Networks of Useless, Junk Pages to
Create Fake PageRank Systems to Point at the
Domains We Control
12We Trade Links, Use 3-Way Link Networks & Send
Spam Email Telling Webmasters to Boost their
PageRank (By Linking to Us)
13We Cloak by User-Agent, IP Address & Cookie
Acceptance to Hide Our Junky Spam
14So Google Innovates...
15They Add Algo Pieces Like Anchor Text
I am linking to the best SEO Site - SEOmoz
Anchor Text
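To make the term concrete: anchor text is the clickable text inside a link, and a crawler can pair it with the link's target. The sketch below uses Python's standard `html.parser` to collect those pairs. The class name `AnchorTextParser`, the `example.com` URL, and the choice of which words are linked in the sample sentence are assumptions for illustration.

```python
from html.parser import HTMLParser

class AnchorTextParser(HTMLParser):
    """Collects (href, anchor text) pairs from an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.links = []    # finished (href, text) pairs
        self._href = None  # href of the <a> currently open, if any
        self._text = []    # text pieces seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears while a link with an href is open.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None

# Hypothetical markup; which words carry the link is an assumption.
html = 'I am linking to the <a href="https://example.com/">best SEO Site</a> - SEOmoz'
parser = AnchorTextParser()
parser.feed(html)
anchor_pairs = parser.links
```

An engine that uses anchor text as a signal would treat the words "best SEO Site" as a description of the linked page, not of the page containing the link.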
16And Beef Up Their Spam Fighting Teams to Shut
Down Link Networks
17But Those Crafty SEOs... They Keep On Spamming
18We Build Robots that Scan the Web for Unprotected
Pages to Place Links
19We Use Crafty CSS Techniques to Hide Our Links
& Text
20So... The Search Engines Get Smarter
21They Assemble Teams of Human Quality Raters to
Help ID Spam
22They Launch Algorithms Specifically Targeted
Towards Individual Verticals
23And At Long Last, SEOs Begin to Relent
24What's Inside Modern Search Algorithms?
25Relevance Detection Systems
26Keyword Usage
- Title Tags
- Headlines (H1, H2, H(x))
- Text on the Page
- Link Anchor Text
- URLs
- Alt & Meta Tags
27Term Vector Models
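In a term vector model, a query and a document are each represented as a vector of term counts, and relevance is estimated from the angle between them. The sketch below shows the basic cosine-similarity version with raw term frequencies. The function names and sample texts are assumptions, and real engines add weighting (such as TF-IDF) that is left out here.

```python
import math
import re
from collections import Counter

def term_vector(text: str) -> Counter:
    """Bag-of-words term-frequency vector (lowercased alphanumeric tokens)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Assumed sample query and documents.
query = term_vector("digital cameras")
doc_cameras = term_vector("Compare digital cameras and camera lenses")
doc_recipes = term_vector("Easy weeknight pasta recipes")

score_cameras = cosine_similarity(query, doc_cameras)
score_recipes = cosine_similarity(query, doc_recipes)
```

The camera page shares terms with the query and scores above zero, while the recipe page shares none and scores exactly zero.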
28Intent Matching
- Navigational Queries
- Research Queries
- Commercial Queries
- Vertical Queries (news, images, travel, blogs,
etc.)
29Trust & Popularity Detection Systems
30Link Popularity
- Popularity & Trust in the Domain
- External Link Popularity of a Page
- Internal Link Popularity of a Page
- Quality & Trust of Inbound Links
31Temporal & User Data
- Search Query Data
- Timelines
- Toolbar Data
- Clickstream Data
- Google Analytics Data
- Feedburner Data
- Free Wifi Access Data
- Google's Q.D.F. (Query Deserves Freshness) Formula
32Spam Detection Systems
- Link Spam Detection
- Content Re-Purposing Detection
- Content Quality Detection
- Human Quality Control
33And Now, Some Lessons From Search Algos
34It Is Better to Be Big & Popular Than Small &
Niche
35Those With the Best (or Most Relevant) Content
Don't Always Win
36Those With the Most Accessible (or
Search-Targeted) Content Don't Always Win
37Those with the Most Link-Worthy,
Viral-Worthy Content (that is also Search
Friendly) Are Most Likely to Win
38Q&A: Tell Me More (about Algos)?
39You've Selected The Bonus!!! More Algo Goodness
40When 37 Top SEOs Were Asked "What factors
matter most at Google?", What Did They Say?
41 #10 Most Important Factor
42 #9 Most Important Factor
43 #8 Most Important Factor
44 #7 Most Important Factor
45 #6 Most Important Factor
46 #5 Most Important Factor
47 #4 Most Important Factor
48 #3 Most Important Factor
49 #2 Most Important Factor
50 #1 Most Important Factor
51Q&A: Take Two?
52Wow! You Must Love This Stuff. Seriously? This
Rand Guy is a Total Geek.
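53Original PageRank
\[ PR(a) = (1 - d) + d \sum_{i} \frac{PR(t_i)}{C(t_i)} \]
where d is the damping factor, t_i are the pages linking to page a, and C(t_i) is the number of outbound links on t_i.
Worked example with d = 0.85 and three linking pages that each have a PageRank of 0.41, the third of which has 5 outbound links:
\[ PR(a) = (1 - 0.85) + 0.85 \times (0.41 + 0.41 + 0.41 / 5) \]
\[ PR(a) = 0.15 + 0.85 \times (0.82 + 0.082) \approx 0.15 + 0.77 = 0.92 \]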
54Local Authority vs. Global Authority
Teoma H.I.T.S.
Google PageRank
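The contrast can be sketched in code. PageRank assigns one global score per page across the whole link graph, while HITS (Hyperlink-Induced Topic Search) is run on the small set of pages related to a query and gives each page two local scores: a hub score (it links to good authorities) and an authority score (good hubs link to it). The sketch below follows the textbook HITS iteration; how Teoma actually implemented it is not public, and the function name `hits` and the toy graph are assumptions.

```python
import math

def hits(links: dict[str, list[str]], iters: int = 20) -> tuple[dict[str, float], dict[str, float]]:
    """Textbook HITS iteration; returns (hub scores, authority scores)."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iters):
        # Authority score: sum of the hub scores of pages linking in.
        auth = {p: 0.0 for p in pages}
        for src, targets in links.items():
            for t in targets:
                auth[t] += hub[src]
        # Hub score: sum of the authority scores of pages linked to.
        hub = {p: sum(auth[t] for t in links.get(p, [])) for p in pages}
        # Normalize both vectors to unit length so the scores stay bounded.
        for scores in (auth, hub):
            norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
            for p in scores:
                scores[p] /= norm
    return hub, auth

# Hypothetical query-specific subgraph: two directories linking to three shops.
topic_graph = {
    "dir1": ["shopA", "shopB"],
    "dir2": ["shopA", "shopB", "shopC"],
    "shopA": [],
}
hub_scores, auth_scores = hits(topic_graph)
top_hub = max(hub_scores, key=hub_scores.get)
```

Here the directory that links to the most shops becomes the top hub, and the shops listed by both directories become the strongest authorities.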
55The HillTop Algorithm
Google Switches from Raw Link Juice to Topical
Focus
56Information Retrieval Based on Historical Data
Google Recruits the Pink Panther
57TrustRank - Yahoo! Finds A Few Good Sites
We first select a small set of seed pages to be
evaluated by an expert. Once we manually identify
the reputable seed pages, we use the link
structure of the web to discover other pages that
are likely to be good. ...Our results show that
we can effectively filter out spam from a
significant fraction of the web, based on a good
seed set of less than 200 sites.
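The idea in the quote can be sketched as a PageRank-style propagation in which the "random jump" only ever lands on the hand-picked seed pages, so trust flows outward from them along links. This is a simplified sketch under stated assumptions, not the published implementation: the function name `trustrank`, the damping value, the toy graph, and the choice to let trust leak away at pages without outbound links are all illustrative.

```python
def trustrank(links: dict[str, list[str]], seeds: set[str],
              d: float = 0.85, iters: int = 50) -> dict[str, float]:
    """Seed-biased trust propagation (illustrative sketch; `seeds` must be non-empty)."""
    pages = sorted(set(links) | seeds | {t for targets in links.values() for t in targets})
    # All initial and "jump" trust is concentrated on the manually vetted seeds.
    seed_share = {p: (1.0 / len(seeds) if p in seeds else 0.0) for p in pages}
    trust = dict(seed_share)
    for _ in range(iters):
        new = {p: (1.0 - d) * seed_share[p] for p in pages}
        for p in pages:
            targets = links.get(p, [])
            # A page passes its damped trust evenly to the pages it links to.
            for t in targets:
                new[t] += d * trust[p] / len(targets)
        trust = new
    return trust

# Hypothetical web: one vetted seed, two pages reachable from it,
# and two spam pages that only link to each other.
toy_web = {
    "news": ["blog"],
    "blog": ["news", "shop"],
    "spam1": ["spam2"],
    "spam2": ["spam1"],
}
trust_scores = trustrank(toy_web, seeds={"news"})
```

Pages reachable from the seed receive trust that fades with distance, while the isolated spam pair receives none because no trusted page links to it.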
58Personalization - What's Good for the Goose may
not be Good for the Gander
Just cause they want worms doesn't mean I do.
59Q&A: This Time We're Actually Finished?