PageRank Algorithm and HITS Algorithm

Provided by: naka55
1
PageRank Algorithm and HITS Algorithm
2
Web?????????
  • Web?????????
  • ??DB??????????????????????????
  • tfidf????????????????????????????
  • Web??????????????????????????????
  • ????(???????????)?Web(??????)??????
  • ???????????????????????????????PageRank?????????

3
PageRank algorithm ?????
  • ???????????????????????? ????????????
  • ?????????????????????????????(????????????????????
    ???)
  • ?
  • ??????????????
  • ???????????????????????????
  • ???????????????????????????????????

4
?????????
[Figure: diagram with numeric labels 5, 10, 15 and 30; image not preserved]
5
???????
  • ?????Pi????Pi???????????????????????????
  • ????????Pi??????????Pi???????????????????????????
  • ?
  • ???????????????????????????? R ???????????????????
    ????

6
?????????
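  • Define the matrix $A = (a_{ij})$:
  • if page $i$ links to page $j$, then $a_{ij} = 1/N_i$
  • ($N_i$ is the number of links out of page $i$)
  • otherwise $a_{ij} = 0$
  • Treating the ranks as a vector $R$ over the pages,
  • $R = cAR$ holds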
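
As a minimal sketch of the matrix definition on this slide, the following Python builds A from a small link table and applies one propagation step. The page names and links are hypothetical, chosen only for illustration. Because a_ij is indexed by the source page i, the rank arriving at page j is the sum over i of a_ij R_i, so the sketch sums down columns; the constant c is left out here.

```python
# Hypothetical link table: page -> pages it links to (an assumption for illustration).
links = {"a": ["b", "c"], "b": ["c", "d"], "c": ["a"], "d": ["a"]}
pages = sorted(links)
index = {p: k for k, p in enumerate(pages)}
n = len(pages)

def build_matrix(links):
    """Return A with a_ij = 1/N_i if page i links to page j, else 0."""
    A = [[0.0] * n for _ in range(n)]
    for src, targets in links.items():
        for dst in targets:
            A[index[src]][index[dst]] = 1.0 / len(targets)
    return A

def propagate(A, R):
    """One step: page j receives sum_i a_ij * R_i."""
    return [sum(A[i][j] * R[i] for i in range(n)) for j in range(n)]

A = build_matrix(links)
R = [1.0 / n] * n          # start from equal ranks
R_next = propagate(A, R)
```

Every row of A sums to 1 in this example because every page has at least one outgoing link, so the step neither creates nor loses rank.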

7
??????1
  • 2?????????????????????????????????????????????????
    ??
  • ????????????????????????????????
  • ?????????????????????????????????
  • ?????(15)?????????????????
  • ?A??????????????????
  • $R = (cA + (1-c)E)\,R$
  • $E$ is the matrix whose entries are all 1; $c = 0.85$ is used

8
??????2
  • ?????????????????????????
  • ?????????????????????????? ? ????
  • ?????????????
  • PageRank?????????????Web??????????????????????????
    ???????????????????????????????

9
??????????
  • $R = (cA + (1-c)E)\,R$
  • $E$ is the matrix whose entries are all 1; $c = 0.85$ is used
  • ?????????????? R????????????????????????
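  1. $R^{(0)} \leftarrow$ initial vector
  2. $R^{(i+1)} \leftarrow (cA + (1-c)E)\,R^{(i)}$
  3. $d \leftarrow \lVert R^{(i)} \rVert - \lVert R^{(i+1)} \rVert$
  4. $R^{(i+1)} \leftarrow R^{(i+1)} + dE$
  5. $\delta \leftarrow \lVert R^{(i+1)} - R^{(i)} \rVert$
  6. if $\delta > \varepsilon$ then goto 2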
  • ?? x?????????2????v
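
The iteration on this slide can be sketched in Python as follows. This is a hedged illustration, not the presenter's code: it assumes the L1 norm, gives each page a uniform share (1-c)/n for the E term so that R stays a probability distribution, and spreads the lost rank d uniformly. The function name `pagerank`, the `max_iter` guard, and the example graph are hypothetical.

```python
def pagerank(links, c=0.85, eps=0.001, max_iter=1000):
    """Power iteration with damping; `links` maps every page to its out-links."""
    pages = sorted(links)
    n = len(pages)
    idx = {p: k for k, p in enumerate(pages)}
    R = [1.0 / n] * n                          # initial vector
    for _ in range(max_iter):
        new = [(1.0 - c) / n] * n              # (1-c)E term, uniform share (assumption)
        for src, targets in links.items():
            for dst in targets:                # cA term: each link carries c * R_i / N_i
                new[idx[dst]] += c * R[idx[src]] / len(targets)
        d = sum(R) - sum(new)                  # rank lost through pages with no out-links
        new = [x + d / n for x in new]         # add the lost rank back, spread uniformly
        delta = sum(abs(x - y) for x, y in zip(new, R))
        R = new
        if delta <= eps:                       # stop once the change is small enough
            break
    return dict(zip(pages, R))

# Hypothetical four-page graph; page "d" has no outgoing links.
graph = {"a": ["b", "c"], "b": ["c", "d"], "c": ["a"], "d": []}
ranks = pagerank(graph, eps=1e-10)
```

Redistributing d each round is what keeps the total rank constant when some pages have no outgoing links.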

10
???
  • ??????????????($\varepsilon = 0.001$)
  • R? ????????????

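[Figure: link graph on four pages with rank values a = 0.365, b = 0.204, c = 0.321, d = 0.110. Link labels: 1/2 and 0.18 (two links), 1/2 and 0.102 (two links), 1 and 0.321 (one link).]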
11
??
  1. a,b,c,d ????????????????????
  2. ????????????????0.15??????????????????
  3. ?????c??????????a???????
  4. ????????????????c???????
  5. ???????????????b????????????d???????
  6. ?????????????????????????

12
???????
  • ???????????????????????????????????????
  • Google?10??????100????????????????????????????????
    ??
  • ???????????????????????????????????
  • Google????????????????
  • PC???????????
  • ?????????????????????????????????

13
?????(????????????) Lawrence Page, Sergey Brin,
Rajeev Motwani, Terry Winograd, 'The PageRank
Citation Ranking: Bringing Order to the Web', 1998
[Figure: convergence plot. Vertical axis (log scale) with ticks 10^5, 10^6, 10^7; horizontal axis with ticks 0, 15, 30, 45; two curves, labeled 161,000,000 and 322,000,000.]
14
HITS algorithm?????
  • ??????Web?????????????????????
  • TOYOTA?HONDA??????????????????????????
  • ???yahoo??????????????????????
  • ?????????Web?????????(PageRank?????)
  • HITS??????????authorities????authorities???????
    hub ?????????

15
Focused Subgraph -1
  • ?????????????S???
  • S????????
  • ??????
  • ?????????????
  • ???????(???????)authority page ???
  • ????Q???????????????????????????????????????????

16
Focused Subgraph -2
  • ???????????20????????
  • java???????????????????????????15links
  • censorship???????????????????????????28links
  • ??link?20019939800????????
  • ????

17
Focused Subgraph -3
  • ??????????R???????
  • R is the top t results for query Q
  • S = R
  • Pages that pages in R link to (outgoing links) are added to S
  • Pages that link to pages in R (incoming links) are added to S, up to d
    per page
  • In Kleinberg's paper, t = 200 and d = 50, giving S of about 1000 to 5000 pages
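
A minimal sketch of growing the root set R into the base set S, following the procedure in Kleinberg's paper. The function name, the dictionary-based link tables, and the toy data are hypothetical; a real system would obtain R from a text search engine and the links from a crawl index. Taking the first d in-linking pages stands in for "any d of them".

```python
def focused_subgraph(root, out_links, in_links, d=50):
    """Expand the root set R into the base set S."""
    S = set(root)                                   # S = R
    for page in root:
        S.update(out_links.get(page, []))           # pages that `page` points to
        S.update(in_links.get(page, [])[:d])        # at most d pages pointing to `page`
    return S

# Toy data (assumptions for illustration only).
root = ["r1", "r2"]
out_links = {"r1": ["x"], "r2": ["y", "r1"]}
in_links = {"r1": ["p1", "p2", "p3"], "r2": []}
S = focused_subgraph(root, out_links, in_links, d=2)
```

The cap d keeps a single heavily linked page from flooding S with thousands of in-linking pages.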

18
Focused Subgraph -4
S
R
19
Hub ? Authority ???
  • S????????Q???????????????????????????
  • ????????????????????????????????????????
  • ?? authority ? hub ????????
  • Good hub pages point to many good authority
    pages; good authority pages are pointed to by
    many good hubs.
  • ??domain???????????????(????????????)

20
weights
  • Authority weight of page
  • Hub weight of page
  • Link set
  • Normalization
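
The four items above correspond to the standard HITS definitions. As a hedged restatement (the formulas themselves are not in this transcript, and the notation is assumed from Kleinberg's paper: x_p is the authority weight of page p, y_p its hub weight, and E the set of links among pages in S):

```latex
x_p = \sum_{q \,:\, (q,p) \in E} y_q, \qquad
y_p = \sum_{q \,:\, (p,q) \in E} x_q, \qquad
\sum_{p \in S} x_p^2 = 1, \quad \sum_{p \in S} y_p^2 = 1
```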

21
iteration
  1. ???????k?????(k20??????)
  2. ???
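
A minimal sketch of the iteration, assuming the standard HITS updates from Kleinberg's paper: authority weights are recomputed from hub weights, hub weights from the new authority weights, and both are rescaled to unit length each round. The function names and the four-page graph are hypothetical.

```python
import math

def normalize(w):
    """Scale the weights so that their squares sum to 1."""
    norm = math.sqrt(sum(v * v for v in w.values())) or 1.0
    return {p: v / norm for p, v in w.items()}

def hits(links, k=20):
    """Run k rounds of the HITS updates; return (authority, hub) dicts."""
    pages = sorted(set(links) | {q for ts in links.values() for q in ts})
    auth = {p: 1.0 for p in pages}
    hub = {p: 1.0 for p in pages}
    for _ in range(k):
        # Authority of p: sum of hub weights of the pages linking to p.
        new_auth = {p: 0.0 for p in pages}
        for src, targets in links.items():
            for dst in targets:
                new_auth[dst] += hub[src]
        # Hub of p: sum of the updated authority weights of the pages p links to.
        new_hub = {p: sum(new_auth[q] for q in links.get(p, [])) for p in pages}
        auth = normalize(new_auth)
        hub = normalize(new_hub)
    return auth, hub

# Hypothetical four-page graph (an assumption for illustration).
graph = {"a": ["b", "c"], "b": ["c", "d"], "c": ["a"], "d": []}
auth, hub = hits(graph)
```

Normalizing every round keeps the weights bounded; only their relative sizes matter for ranking.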

22
???
[Figure: link graph on pages a, b, c, d]
  • Authority: a = 3.46 × 10^-6, b = 0.408,
  • c = 0.816, d = 0.408
  • Hub: a = 0.707, b = 0.707,
  • c = 5.98 × 10^-6, d = 0

23
?????-1
  • ??????????????????????
  • ????
  • ???????

24
?????-2
  • ?????
  • ???

25
From Jon M. Kleinberg, 'Authoritative Sources in a
Hyperlinked Environment', JACM 46(5)
  • (java) Authorities
  • 0.328 http://www.gamelan.com
  • 0.251 http://java.sun.com/
  • 0.190 http://www.digitalfocus.com/digitalfocus/faq/howdoi.html
  • 0.190 http://lightyear.ncsa.uiuc.edu/srp/java/javabooks.html