Sunday, May 22, 2011

THE $25,000,000,000 EIGENVECTOR THE LINEAR ALGEBRA BEHIND GOOGLE



Google’s success derives in large part from its PageRank algorithm, which ranks the importance of webpages according to an eigenvector of a weighted link matrix. Analysis of the PageRank formula provides a wonderful applied topic for a linear algebra course.


PageRank algorithm, which quantitatively rates the importance of each page on the web.

Google calculates web page rankings using standard linear algebra application. It does three basic things:

1. Crawl the web and locate all web pages with public access.

2. Index the data from step 1, so that it can be searched efficiently for relevant keywords or phrases.

3. Rate the importance of each page the subset of pages in the database been found, the most important pages can be presented first.

The importance score for any web page will always be a non-negative real number.

An arrow from page A to page B indicates alink from page A to page B.

xk denotes the importance score of page k in the web

x1 = 2, x2 = 1, x3 = 3, and x4 = 2, so that page 3 would be the most important,

pages 1 and 4 tie for second, and page 2 is least important. A link to page k becomes a vote for

page k’s importance.

This approach ignores an important feature ranking algorithm that a link to page k from an important page should boost page k’s importance score more than a link from an unimportant page.


we will boost page k’s score by xj/nj , rather than by xj .

xk = ∑ xj/ nk

j ϵ Lk ( a link from a page to itself will not be counted.)
Let’s apply this approach :

x1 = x3/1 +

x4/2, since pages 3 and 4 are backlinks for page 1 and page 3 contains only one link, while page 4 contains two links (splitting its vote in half). Similarly, x2 = x1/3, x3 = x1/3 + x2/2 + x4/2, andx4 = x1/3 + x2/2. These linear equations can be written Ax = x, where x = [x1 x2 x3 x4]T

This transforms the web ranking problem into the “standard” problem of finding an eigenvector for a square matrix! (Recall : eigenvalues _ and eigenvectors x of a matrix A satisfy the equation Ax = λx, x != 0 by definition.)

x1 ~=0.387, x2~ = 0.129,x3 ~= 0.290, and x4 ~=.194.

Note that this ranking differs from that generated by simply

counting backlinks. It might seem surprising that page 3, linked to by all other pages, is not the most important. To understand this, note that page 3 links only to page 1 and so casts its entire vote for page 1. This, with the vote of page 2, results in page 1 getting the highest importance score.


[THE $25,000,000,000 EIGENVECTOR THE LINEAR ALGEBRA BEHIND GOOGLE
KURT BRYAN† AND TANYA LEISE
[ SIAM Review Volume 48 Issue 3, 2006
Society for Industrial and Applied Mathematics Philadelphia, PA, USA]]

1 comment:

  1. I thought linear algebra is just an abstract thoughts that has nothing to do with real life application! but after searching and seeing all the posts linear algebra make more sense. We are learning in a very fun way.
    Thanks Miss Zeina for introducing us to the blogging community :)

    ReplyDelete