THE $25,000,000,000∗ EIGENVECTOR
THE LINEAR ALGEBRA BEHIND GOOGLE

KURT BRYAN† AND TANYA LEISE‡

Abstract. Google's success derives in large part from its PageRank algorithm, which ranks the importance of webpages according to an eigenvector of a weighted link matrix. Analysis of the PageRank formula provides a wonderful applied topic for a linear algebra course. Instructors may assign this article as a project to more advanced students, or spend one or two lectures presenting the material with assigned homework from the exercises. This material also complements the discussion of Markov chains in matrix algebra. Maple and Mathematica files supporting this material can be found at www.rose-hulman.edu/~bryan.

Key words. linear algebra, PageRank, eigenvector, stochastic matrix

AMS subject classifications. 15-01, 15A18, 15A51

1. Introduction. When Google went online in the late 1990s, one thing that set it apart from other search engines was that its search result listings always seemed to deliver the "good stuff" up front. With other search engines you often had to wade through screen after screen of links to irrelevant web pages that just happened to match the search text. Part of the magic behind Google is its PageRank algorithm, which quantitatively rates the importance of each page on the web, allowing Google to rank the pages and thereby present to the user the more important (and typically most relevant and helpful) pages first.

Understanding how to calculate PageRank is essential for anyone designing a web page that they want people to access frequently, since getting listed first in a Google search leads to many people looking at your page. Indeed, due to Google's prominence as a search engine, its ranking system has had a deep influence on the development and structure of the internet, and on what kinds of information and services get accessed most frequently.
Our goal in this paper is to explain one of the core ideas behind how Google calculates web page rankings. This turns out to be a delightful application of standard linear algebra.

Search engines such as Google have to do three basic things:
1. Crawl the web and locate all web pages with public access.
2. Index the data from step 1, so that it can be searched efficiently for relevant keywords or phrases.
3. Rate the importance of each page in the database, so that when a user does a search and the subset of pages in the database with the desired information has been found, the more important pages can be presented first.

This paper will focus on step 3. In an interconnected web of pages, how can one meaningfully define and quantify the "importance" of any given page?

The rated importance of web pages is not the only factor in how links are presented, but it is a significant one. There are also successful ranking algorithms other than PageRank. The interested reader will find a wealth of information about ranking algorithms and search engines, and we list just a few references for getting started (see the extensive bibliography in , for example, for a more complete list). For a brief overview of how Google handles the entire process see , and for an in-depth treatment of PageRank see  and a companion article . Another article with good concrete examples is . For more background on PageRank and explanations of essential principles of web design to maximize a website's PageRank, go to the websites [4, 11, 14]. To find out more about search engine principles in general and other ranking algorithms, see  and . Finally, for an account of some newer approaches to searching the web, see  and .

2.
Developing a formula to rank pages.

∗ The approximate market value of Google when the company went public in 2004.
† Department of Mathematics, Rose-Hulman Institute of Technology, Terre Haute, IN 47803; email: [email protected]; phone: (812) 877-8485; fax: (812) 877-8883.
‡ Mathematics and Computer Science Department, Amherst College, Amherst, MA 01002; email: [email protected]; phone: (413) 542-5411; fax: (413) 542-2550.

Fig. 2.1. An example of a web with only four pages. An arrow from page A to page B indicates a link from page A to page B.

2.1. The basic idea. In what follows we will use the phrase "importance score" or just "score" for any quantitative rating of a web page's importance. The importance score for any web page will always be a non-negative real number. A core idea in assigning a score to any given web page is that the page's score is derived from the links made to that page from other web pages. The links to a given page are called the backlinks for that page. The web thus becomes a democracy where pages vote for the importance of other pages by linking to them.

Suppose the web of interest contains n pages, each page indexed by an integer k, $1 \le k \le n$. A typical example is illustrated in Figure 2.1, in which an arrow from page A to page B indicates a link from page A to page B. Such a web is an example of a directed graph.¹ We'll use $x_k$ to denote the importance score of page k in the web. The $x_k$ is non-negative and $x_j > x_k$ indicates that page j is more important than page k (so $x_j = 0$ indicates page j has the least possible importance score).

A very simple approach is to take $x_k$ as the number of backlinks for page k. In the example in Figure 2.1, we have $x_1 = 2$, $x_2 = 1$, $x_3 = 3$, and $x_4 = 2$, so that page 3 would be the most important, pages 1 and 4 tie for second, and page 2 is least important.
A link to page k becomes a vote for page k's importance.

This approach ignores an important feature one would expect a ranking algorithm to have, namely, that a link to page k from an important page should boost page k's importance score more than a link from an unimportant page. For example, a link to your homepage directly from Yahoo ought to boost your page's score much more than a link from, say, www.kurtbryan.com (no relation to the author). In the web of Figure 2.1, pages 1 and 4 both have two backlinks: each links to the other, but page 1's second backlink is from the seemingly important page 3, while page 4's second backlink is from the relatively unimportant page 1. As such, perhaps we should rate page 1's importance higher than that of page 4.

As a first attempt at incorporating this idea let's compute the score of page j as the sum of the scores of all pages linking to page j. For example, consider the web of Figure 2.1. The score of page 1 would be determined by the relation $x_1 = x_3 + x_4$. Since $x_3$ and $x_4$ will depend on $x_1$ this scheme seems strangely self-referential, but it is the approach we will use, with one more modification. Just as in elections, we don't want a single individual to gain influence merely by casting multiple votes. In the same vein, we seek a scheme in which a web page doesn't gain extra influence simply by linking to lots of other pages. If page j contains $n_j$ links, one of which links to page k, then we will boost page k's score by $x_j/n_j$, rather than by $x_j$. In this scheme each web page gets a total of one vote, weighted by that web page's score, that is evenly divided up among all of its outgoing links. To quantify this for a web of n pages, let $L_k \subset \{1, 2, \ldots, n\}$ denote the set of pages with a link to page k, that is, $L_k$ is the set of page k's backlinks.
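For the four-page web of Figure 2.1, the sets $L_k$ and counts $n_j$ can be tabulated directly. The following small Python sketch is our own illustration (the paper's supporting files are in Maple and Mathematica); it builds both from the figure's links:

```python
# Outgoing links in the four-page web of Figure 2.1:
# page 1 links to pages 2, 3, 4; page 2 to 3, 4; page 3 to 1; page 4 to 1, 3.
links = {1: [2, 3, 4], 2: [3, 4], 3: [1], 4: [1, 3]}

# n_j: the number of outgoing links from page j.
n_j = {j: len(targets) for j, targets in links.items()}
assert n_j == {1: 3, 2: 2, 3: 1, 4: 2}

# L_k: the set of backlinks of page k, i.e. the pages that link to k.
L = {k: {j for j, targets in links.items() if k in targets} for k in links}
assert L == {1: {3, 4}, 2: {1}, 3: {1, 2, 4}, 4: {1, 2}}
```

Counting backlinks here recovers the naive scores above: $x_1 = 2$, $x_2 = 1$, $x_3 = 3$, $x_4 = 2$.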
For each k we require
$$x_k = \sum_{j \in L_k} \frac{x_j}{n_j}, \qquad (2.1)$$
where $n_j$ is the number of outgoing links from page j (which must be positive, since if $j \in L_k$ then page j links to at least page k!). We will assume that a link from a page to itself will not be counted. In this "democracy of the web" you don't get to vote for yourself!

¹A graph consists of a set of vertices (in this context, the web pages) and a set of edges. Each edge joins a pair of vertices. The graph is undirected if the edges have no direction. The graph is directed if each edge (in the web context, the links) has a direction, that is, a starting and ending vertex.

Fig. 2.2. A web of five pages, consisting of two disconnected "subwebs" $W_1$ (pages 1 and 2) and $W_2$ (pages 3, 4, 5).

Let's apply this approach to the four-page web of Figure 2.1. For page 1 we have $x_1 = x_3/1 + x_4/2$, since pages 3 and 4 are backlinks for page 1 and page 3 contains only one link, while page 4 contains two links (splitting its vote in half). Similarly, $x_2 = x_1/3$, $x_3 = x_1/3 + x_2/2 + x_4/2$, and $x_4 = x_1/3 + x_2/2$. These linear equations can be written $A\mathbf{x} = \mathbf{x}$, where $\mathbf{x} = [x_1\ x_2\ x_3\ x_4]^T$ and
$$A = \begin{bmatrix} 0 & 0 & 1 & \frac12 \\ \frac13 & 0 & 0 & 0 \\ \frac13 & \frac12 & 0 & \frac12 \\ \frac13 & \frac12 & 0 & 0 \end{bmatrix}. \qquad (2.2)$$
This transforms the web ranking problem into the "standard" problem of finding an eigenvector for a square matrix! (Recall that the eigenvalues $\lambda$ and eigenvectors $\mathbf{x}$ of a matrix A satisfy the equation $A\mathbf{x} = \lambda\mathbf{x}$, $\mathbf{x} \neq \mathbf{0}$, by definition.) We thus seek an eigenvector $\mathbf{x}$ with eigenvalue 1 for the matrix A. We will refer to A as the "link matrix" for the given web.

It turns out that the link matrix A in equation (2.2) does indeed have eigenvectors with eigenvalue 1, namely, all multiples of the vector $[12\ 4\ 9\ 6]^T$ (recall that any non-zero multiple of an eigenvector is again an eigenvector).
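Both claims are easy to confirm numerically. The short Python check below is our own sketch, not part of the paper; it verifies that each column of A in (2.2) sums to 1 and that $[12\ 4\ 9\ 6]^T$ is fixed by A:

```python
# The link matrix A from equation (2.2) for the four-page web of Figure 2.1.
A = [
    [0,   0,   1, 1/2],
    [1/3, 0,   0, 0],
    [1/3, 1/2, 0, 1/2],
    [1/3, 1/2, 0, 0],
]

def matvec(A, x):
    """Return the matrix-vector product A x."""
    return [sum(row[j] * x[j] for j in range(len(x))) for row in A]

# Each column of A sums to 1 (every page's vote is split over its outlinks).
for j in range(4):
    assert abs(sum(A[i][j] for i in range(4)) - 1) < 1e-9

# v = [12, 4, 9, 6] satisfies A v = v: an eigenvector with eigenvalue 1.
v = [12, 4, 9, 6]
Av = matvec(A, v)
assert all(abs(Av[i] - v[i]) < 1e-9 for i in range(4))
```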
Let's agree to scale these "importance score eigenvectors" so that the components sum to 1. In this case we obtain $x_1 = \frac{12}{31} \approx 0.387$, $x_2 = \frac{4}{31} \approx 0.129$, $x_3 = \frac{9}{31} \approx 0.290$, and $x_4 = \frac{6}{31} \approx 0.194$. Note that this ranking differs from that generated by simply counting backlinks. It might seem surprising that page 3, linked to by all other pages, is not the most important. To understand this, note that page 3 links only to page 1 and so casts its entire vote for page 1. This, with the vote of page 2, results in page 1 getting the highest importance score.

More generally, the matrix A for any web must have 1 as an eigenvalue if the web in question has no dangling nodes (pages with no outgoing links). To see this, first note that for a general web of n pages formula (2.1) gives rise to a matrix A with $A_{ij} = 1/n_j$ if page j links to page i, and $A_{ij} = 0$ otherwise. The jth column of A then contains $n_j$ non-zero entries, each equal to $1/n_j$, and the column thus sums to 1. This motivates the following definition, used in the study of Markov chains:

Definition 2.1. A square matrix is called a column-stochastic matrix if all of its entries are nonnegative and the entries in each column sum to one.

The matrix A for a web with no dangling nodes is column-stochastic. We now prove

Proposition 1. Every column-stochastic matrix has 1 as an eigenvalue.

Proof. Let A be an $n \times n$ column-stochastic matrix and let $\mathbf{e}$ denote an n-dimensional column vector with all entries equal to 1. Recall that A and its transpose $A^T$ have the same eigenvalues. Since A is column-stochastic it is easy to see that $A^T\mathbf{e} = \mathbf{e}$, so that 1 is an eigenvalue for $A^T$ and hence for A.

In what follows we use $V_1(A)$ to denote the eigenspace for eigenvalue 1 of a column-stochastic matrix A.

2.2. Shortcomings. Several difficulties arise with using formula (2.1) to rank websites. In this section we discuss two issues: webs with non-unique rankings and webs with dangling nodes.

2.2.1.
Non-Unique Rankings. For our rankings it is desirable that the dimension of $V_1(A)$ equal one, so that there is a unique eigenvector $\mathbf{x}$ with $\sum_i x_i = 1$ that we can use for importance scores. This is true in the web of Figure 2.1, and more generally is always true for the special case of a strongly connected web (that is, you can get from any page to any other page in a finite number of steps); see Exercise 10 below.

Unfortunately, it is not always true that the link matrix A will yield a unique ranking for all webs. Consider the web in Figure 2.2, for which the link matrix is
$$A = \begin{bmatrix} 0 & 1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & \frac12 \\ 0 & 0 & 1 & 0 & \frac12 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}.$$
We find here that $V_1(A)$ is two-dimensional; one possible pair of basis vectors is $\mathbf{x} = [1/2, 1/2, 0, 0, 0]^T$ and $\mathbf{y} = [0, 0, 1/2, 1/2, 0]^T$. But note that any linear combination of these two vectors yields another vector in $V_1(A)$, e.g., $\frac34\mathbf{x} + \frac14\mathbf{y} = [3/8, 3/8, 1/8, 1/8, 0]^T$. It is not clear which, if any, of these eigenvectors we should use for the rankings!

It is no coincidence that for the web of Figure 2.2 we find that $\dim(V_1(A)) > 1$. It is a consequence of the fact that if a web W, considered as an undirected graph (ignoring which direction each arrow points), consists of r disconnected subwebs $W_1, \ldots, W_r$, then $\dim(V_1(A)) \ge r$, and hence there is no unique importance score vector $\mathbf{x} \in V_1(A)$ with $\sum_i x_i = 1$. This makes intuitive sense: if a web W consists of r disconnected subwebs $W_1, \ldots, W_r$ then one would expect difficulty in finding a common reference frame for comparing the scores of pages in one subweb with those in another subweb.

Indeed, it is not hard to see why a web W consisting of r disconnected subwebs forces $\dim(V_1(A)) \ge r$. Suppose a web W has n pages and r component subwebs $W_1, \ldots, W_r$.
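(As an aside, the two-dimensional eigenspace claimed above for Figure 2.2 is easy to confirm numerically. The Python sketch below is ours, not part of the paper; it checks that both basis vectors, and a mixture of them, are fixed by A.)

```python
# Link matrix for the disconnected five-page web of Figure 2.2.
A = [
    [0, 1, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [0, 0, 0, 1, 1/2],
    [0, 0, 1, 0, 1/2],
    [0, 0, 0, 0, 0],
]

def matvec(A, x):
    return [sum(row[j] * x[j] for j in range(len(x))) for row in A]

x = [1/2, 1/2, 0, 0, 0]   # supported on subweb W1
y = [0, 0, 1/2, 1/2, 0]   # supported on subweb W2
z = [3/4 * xi + 1/4 * yi for xi, yi in zip(x, y)]   # a mixture of the two

# All three are fixed by A and have components summing to 1, so formula (2.1)
# alone cannot single out one of them as "the" ranking.
for v in (x, y, z):
    assert matvec(A, v) == v
    assert abs(sum(v) - 1) < 1e-9
```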
Let $n_i$ denote the number of pages in $W_i$. Index the pages in $W_1$ with indices 1 through $n_1$, the pages in $W_2$ with indices $n_1 + 1$ through $n_1 + n_2$, the pages in $W_3$ with $n_1 + n_2 + 1$ through $n_1 + n_2 + n_3$, etc. In general, let $N_i = \sum_{j=1}^{i} n_j$ for $i \ge 1$, with $N_0 = 0$, so $W_i$ contains pages $N_{i-1} + 1$ through $N_i$. For example, in the web of Figure 2.2 we can take $N_1 = 2$ and $N_2 = 5$, so $W_1$ contains pages 1 and 2, and $W_2$ contains pages 3, 4, and 5. The web in Figure 2.2 is a particular example of the general case, in which the matrix A assumes a block diagonal structure
$$A = \begin{bmatrix} A_1 & 0 & \cdots & 0 \\ 0 & A_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & A_r \end{bmatrix},$$
where $A_i$ denotes the link matrix for $W_i$. In fact, $W_i$ can be considered as a web in its own right. Each $n_i \times n_i$ matrix $A_i$ is column-stochastic, and hence possesses some eigenvector $\mathbf{v}^i \in \mathbb{R}^{n_i}$ with eigenvalue 1. For each i between 1 and r construct a vector $\mathbf{w}^i \in \mathbb{R}^n$ which has 0 components for all elements corresponding to blocks other than block i. For example,
$$\mathbf{w}^1 = \begin{pmatrix} \mathbf{v}^1 \\ \mathbf{0} \\ \vdots \\ \mathbf{0} \end{pmatrix}, \quad \mathbf{w}^2 = \begin{pmatrix} \mathbf{0} \\ \mathbf{v}^2 \\ \vdots \\ \mathbf{0} \end{pmatrix}, \quad \ldots$$
Then it is easy to see that the vectors $\mathbf{w}^i$, $1 \le i \le r$, are linearly independent eigenvectors for A with eigenvalue 1, because
$$A\mathbf{w}^i = \begin{pmatrix} \mathbf{0} \\ \vdots \\ A_i\mathbf{v}^i \\ \vdots \\ \mathbf{0} \end{pmatrix} = \mathbf{w}^i.$$
Thus $V_1(A)$ has dimension at least r.

2.2.2. Dangling Nodes. Another difficulty may arise when using the matrix A to generate rankings. A web with dangling nodes produces a matrix A which contains one or more columns of all zeros. In this case A is column-substochastic, that is, the column sums of A are all less than or equal to one. Such a matrix must have all eigenvalues less than or equal to 1 in magnitude, but 1 need not actually be an eigenvalue for A. Nevertheless, the pages in a web with dangling nodes can still be ranked using a similar technique.
The corresponding substochastic matrix must have a positive eigenvalue $\lambda \le 1$ and a corresponding eigenvector $\mathbf{x}$ with non-negative entries (called the Perron eigenvector) that can be used to rank the web pages. See Exercise 4 below. We will not further consider the problem of dangling nodes here, however.

Exercise 1. Suppose the people who own page 3 in the web of Figure 2.1 are infuriated by the fact that its importance score, computed using formula (2.1), is lower than the score of page 1. In an attempt to boost page 3's score, they create a page 5 that links to page 3; page 3 also links to page 5. Does this boost page 3's score above that of page 1?

Exercise 2. Construct a web consisting of three or more subwebs and verify that $\dim(V_1(A))$ equals (or exceeds) the number of the components in the web.

Exercise 3. Add a link from page 5 to page 1 in the web of Figure 2.2. The resulting web, considered as an undirected graph, is connected. What is the dimension of $V_1(A)$?

Exercise 4. In the web of Figure 2.1, remove the link from page 3 to page 1. In the resulting web page 3 is now a dangling node. Set up the corresponding substochastic matrix and find its largest positive (Perron) eigenvalue. Find a non-negative Perron eigenvector for this eigenvalue, and scale the vector so that its components sum to one. Does the resulting ranking seem reasonable?

Exercise 5. Prove that in any web the importance score of a page with no backlinks is zero.

Exercise 6. Implicit in our analysis up to this point is the assertion that the manner in which the pages of a web W are indexed has no effect on the importance score assigned to any given page. Prove this, as follows: Let W contain n pages, each page assigned an index 1 through n, and let A be the resulting link matrix. Suppose we then transpose the indices of pages i and j (so page i is now page j and vice-versa).
Let $\tilde{A}$ be the link matrix for the relabelled web.
• Argue that $\tilde{A} = PAP$, where P is the elementary matrix obtained by transposing rows i and j of the $n \times n$ identity matrix. Note that the operation $A \mapsto PA$ has the effect of swapping rows i and j of A, while $A \mapsto AP$ swaps columns i and j. Also, $P^2 = I$, the identity matrix.
• Suppose that $\mathbf{x}$ is an eigenvector for A, so $A\mathbf{x} = \lambda\mathbf{x}$ for some $\lambda$. Show that $\mathbf{y} = P\mathbf{x}$ is an eigenvector for $\tilde{A}$ with eigenvalue $\lambda$.
• Explain why this shows that transposing the indices of any two pages leaves the importance scores unchanged, and use this result to argue that any permutation of the page indices leaves the importance scores unchanged.

3. A remedy for $\dim(V_1(A)) > 1$. An enormous amount of computing resources are needed to determine an eigenvector for the link matrix corresponding to a web containing billions of pages. It is thus important to know that our algorithm will yield a unique set of sensible web rankings. The analysis above shows that our first attempt to rank web pages leads to difficulties if the web isn't connected. And the worldwide web, treated as an undirected graph, contains many disjoint components; see  for some interesting statistics concerning the structure of the web.

Below we present and analyze a modification of the above method that is guaranteed to overcome this shortcoming. The analysis that follows is basically a special case of the Perron-Frobenius theorem, and we only prove what we need for this application. For a full statement and proof of the Perron-Frobenius theorem, see chapter 8 in .

3.1. A modification to the link matrix A. For an n-page web with no dangling nodes we can generate unambiguous importance scores as follows, including cases of webs with multiple subwebs.

Let S denote an $n \times n$ matrix with all entries $1/n$. The matrix S is column-stochastic, and it is easy to check that $V_1(S)$ is one-dimensional.
We will replace the matrix A with the matrix
$$M = (1 - m)A + mS, \qquad (3.1)$$
where $0 \le m \le 1$. M is a weighted average of A and S. The value of m originally used by Google is reportedly 0.15 [9, 11]. For any $m \in [0, 1]$ the matrix M is column-stochastic and we show below that $V_1(M)$ is always one-dimensional if $m \in (0, 1]$. Thus M can be used to compute unambiguous importance scores. In the case when m = 0 we have the original problem, for then M = A. At the other extreme is m = 1, yielding M = S. This is the ultimately egalitarian case: the only normalized eigenvector $\mathbf{x}$ with eigenvalue 1 has $x_i = 1/n$ for all i and all web pages are rated equally important.

Using M in place of A gives a web page with no backlinks the importance score m/n (Exercise 9). If the web contains dangling nodes, then A, and hence M for any m < 1, is merely substochastic. Therefore the modified formula yields nonzero importance scores for pages with no backlinks (if m > 0) but does not resolve the issue of dangling nodes. In the remainder of this article, we only consider webs with no dangling nodes.

The equation $\mathbf{x} = M\mathbf{x}$ can also be cast as
$$\mathbf{x} = (1 - m)A\mathbf{x} + m\mathbf{s}, \qquad (3.2)$$
where $\mathbf{s}$ is a column vector with all entries $1/n$. Note that $S\mathbf{x} = \mathbf{s}$ if $\sum_i x_i = 1$.

We will prove below that $V_1(M)$ is always one-dimensional, but first let's look at a couple of examples.

Example 1: For the web of four pages in Figure 2.1 with matrix A given by (2.2), the new formula gives (with m = 0.15)
$$M = \begin{bmatrix} 0.0375 & 0.0375 & 0.8875 & 0.4625 \\ 0.3208\overline{3} & 0.0375 & 0.0375 & 0.0375 \\ 0.3208\overline{3} & 0.4625 & 0.0375 & 0.4625 \\ 0.3208\overline{3} & 0.4625 & 0.0375 & 0.0375 \end{bmatrix},$$
and yields importance scores $x_1 \approx 0.368$, $x_2 \approx 0.142$, $x_3 \approx 0.288$, and $x_4 \approx 0.202$.
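Example 1 is easy to reproduce. The Python sketch below is our own illustration (not the authors'): it builds M from A with m = 0.15 and recovers the scores by repeatedly multiplying a starting vector by M (each product preserves the property that the entries sum to 1).

```python
# Build M = (1 - m)A + mS for the four-page web of Figure 2.1 with m = 0.15,
# then approximate the eigenvector for eigenvalue 1 by repeated multiplication.
m, n = 0.15, 4
A = [
    [0,   0,   1, 1/2],
    [1/3, 0,   0, 0],
    [1/3, 1/2, 0, 1/2],
    [1/3, 1/2, 0, 0],
]
M = [[(1 - m) * A[i][j] + m / n for j in range(n)] for i in range(n)]

# Spot-check entries against the displayed matrix, e.g. 0.85*1 + 0.15/4 = 0.8875.
assert abs(M[0][2] - 0.8875) < 1e-12 and abs(M[0][0] - 0.0375) < 1e-12

x = [1 / n] * n                     # any positive vector with entries summing to 1
for _ in range(200):
    x = [sum(M[i][j] * x[j] for j in range(n)) for i in range(n)]

print([round(xi, 3) for xi in x])   # [0.368, 0.142, 0.288, 0.202]
```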
This yields the same ranking of pages as the earlier computation, but the scores are slightly different. Example 2 shows more explicitly the advantages of using M in place of A.

Example 2: As a second example, for the web of Figure 2.2 with m = 0.15 we obtain the matrix
$$M = \begin{bmatrix} 0.03 & 0.88 & 0.03 & 0.03 & 0.03 \\ 0.88 & 0.03 & 0.03 & 0.03 & 0.03 \\ 0.03 & 0.03 & 0.03 & 0.88 & 0.455 \\ 0.03 & 0.03 & 0.88 & 0.03 & 0.455 \\ 0.03 & 0.03 & 0.03 & 0.03 & 0.03 \end{bmatrix}. \qquad (3.3)$$
The space $V_1(M)$ is indeed one-dimensional, with normalized eigenvector components of $x_1 = 0.2$, $x_2 = 0.2$, $x_3 = 0.285$, $x_4 = 0.285$, and $x_5 = 0.03$. The modification, using M instead of A, allows us to compare pages in different subwebs.

Each entry $M_{ij}$ of M defined by equation (3.1) is strictly positive, which motivates the following definition.

Definition 3.1. A matrix M is positive if $M_{ij} > 0$ for all i and j.

This is the key property that guarantees $\dim(V_1(M)) = 1$, which we prove in the next section.

3.2. Analysis of the matrix M. Note that Proposition 1 shows that $V_1(M)$ is nonempty since M is column-stochastic. The goal of this section is to show that $V_1(M)$ is in fact one-dimensional. This is a consequence of the following two propositions.

Proposition 2. If M is positive and column-stochastic, then any eigenvector in $V_1(M)$ has all positive or all negative components.

Proof. We use proof by contradiction. First note that in the standard triangle inequality $|\sum_i y_i| \le \sum_i |y_i|$ (with all $y_i$ real) the inequality is strict when the $y_i$ are of mixed sign. Suppose $\mathbf{x} \in V_1(M)$ contains elements of mixed sign. From $\mathbf{x} = M\mathbf{x}$ we have $x_i = \sum_{j=1}^n M_{ij}x_j$ and the summands $M_{ij}x_j$ are of mixed sign (since $M_{ij} > 0$). As a result we have a strict inequality
$$|x_i| = \Big|\sum_{j=1}^n M_{ij}x_j\Big| < \sum_{j=1}^n M_{ij}|x_j|. \qquad (3.4)$$
Sum both sides of inequality (3.4) from i = 1 to i = n, and swap the i and j summations.
Then use the fact that M is column-stochastic ($\sum_i M_{ij} = 1$ for all j) to find
$$\sum_{i=1}^n |x_i| < \sum_{i=1}^n \sum_{j=1}^n M_{ij}|x_j| = \sum_{j=1}^n \left(\sum_{i=1}^n M_{ij}\right)|x_j| = \sum_{j=1}^n |x_j|,$$
a contradiction. Hence $\mathbf{x}$ cannot contain both positive and negative elements. If $x_i \ge 0$ for all i (and not all $x_i$ are zero) then $x_i > 0$ follows immediately from $x_i = \sum_{j=1}^n M_{ij}x_j$ and $M_{ij} > 0$. Similarly $x_i \le 0$ for all i implies that each $x_i < 0$.

The following proposition will also be useful for analyzing $\dim(V_1(M))$:

Proposition 3. Let $\mathbf{v}$ and $\mathbf{w}$ be linearly independent vectors in $\mathbb{R}^m$, $m \ge 2$. Then, for some values of s and t that are not both zero, the vector $\mathbf{x} = s\mathbf{v} + t\mathbf{w}$ has both positive and negative components.

Proof. Linear independence implies neither $\mathbf{v}$ nor $\mathbf{w}$ is zero. Let $d = \sum_i v_i$. If d = 0 then $\mathbf{v}$ must contain components of mixed sign, and taking s = 1 and t = 0 yields the conclusion. If $d \neq 0$ set $s = -\frac{\sum_i w_i}{d}$, t = 1, and $\mathbf{x} = s\mathbf{v} + t\mathbf{w}$. Since $\mathbf{v}$ and $\mathbf{w}$ are independent $\mathbf{x} \neq \mathbf{0}$. However, $\sum_i x_i = 0$. We conclude that $\mathbf{x}$ has both positive and negative components.

We can now prove that using M in place of A yields an unambiguous ranking for any web with no dangling nodes.

Lemma 3.2. If M is positive and column-stochastic then $V_1(M)$ has dimension 1.

Proof. We again use proof by contradiction. Suppose there are two linearly independent eigenvectors $\mathbf{v}$ and $\mathbf{w}$ in the subspace $V_1(M)$. For any real numbers s and t that are not both zero, the nonzero vector $\mathbf{x} = s\mathbf{v} + t\mathbf{w}$ must be in $V_1(M)$, and so have components that are all negative or all positive. But by Proposition 3, for some choice of s and t the vector $\mathbf{x}$ must contain components of mixed sign, a contradiction. We conclude that $V_1(M)$ cannot contain two linearly independent vectors, and so has dimension one.

Lemma 3.2 provides the "punchline" for our analysis of the ranking algorithm using the matrix M (for 0 < m < 1). The space $V_1(M)$ is one-dimensional, and moreover, the relevant eigenvectors have entirely positive or negative components.
We are thus guaranteed the existence of a unique eigenvector $\mathbf{x} \in V_1(M)$ with positive components such that $\sum_i x_i = 1$.

Exercise 7. Prove that if A is an $n \times n$ column-stochastic matrix and $0 \le m \le 1$, then $M = (1 - m)A + mS$ is also a column-stochastic matrix.

Exercise 8. Show that the product of two column-stochastic matrices is also column-stochastic.

Exercise 9. Show that a page with no backlinks is given importance score $\frac{m}{n}$ by formula (3.2).

Exercise 10. Suppose that A is the link matrix for a strongly connected web of n pages (any page can be reached from any other page by following a finite number of links). Show that $\dim(V_1(A)) = 1$ as follows. Let $(A^k)_{ij}$ denote the (i, j)-entry of $A^k$.
• Note that page i can be reached from page j in one step if and only if $A_{ij} > 0$ (since $A_{ij} > 0$ means there's a link from j to i!). Show that $(A^2)_{ij} > 0$ if and only if page i can be reached from page j in exactly two steps. Hint: $(A^2)_{ij} = \sum_k A_{ik}A_{kj}$; all $A_{ij}$ are non-negative, so $(A^2)_{ij} > 0$ implies that for some k both $A_{ik}$ and $A_{kj}$ are positive.
• Show more generally that $(A^p)_{ij} > 0$ if and only if page i can be reached from page j in exactly p steps.
• Argue that $(I + A + A^2 + \cdots + A^p)_{ij} > 0$ if and only if page i can be reached from page j in p or fewer steps (note that p = 0 is a legitimate choice: any page can be reached from itself in zero steps!)
• Explain why $I + A + A^2 + \cdots + A^{n-1}$ is a positive matrix if the web is strongly connected.
• Use the last part (and Exercise 8) to show that $B = \frac{1}{n}(I + A + A^2 + \cdots + A^{n-1})$ is positive and column-stochastic (and hence by Lemma 3.2, $\dim(V_1(B)) = 1$).
• Show that if $\mathbf{x} \in V_1(A)$ then $\mathbf{x} \in V_1(B)$. Why does this imply that $\dim(V_1(A)) = 1$?

Exercise 11. Consider again the web in Figure 2.1, with the addition of a page 5 that links to page 3, where page 3 also links to page 5.
Calculate the new ranking by finding the eigenvector of M (corresponding to $\lambda = 1$) that has positive components summing to one. Use m = 0.15.

Exercise 12. Add a sixth page that links to every page of the web in the previous exercise, but to which no other page links. Rank the pages using A, then using M with m = 0.15, and compare the results.

Exercise 13. Construct a web consisting of two or more subwebs and determine the ranking given by formula (3.1).

At present the web contains at least eight billion pages; how does one compute an eigenvector for an eight billion by eight billion matrix? One reasonable approach is an iterative procedure called the power method (along with modifications) that we will now examine for the special case at hand. It is worth noting that there is much additional analysis one can do, and many improved methods for the computation of PageRank. The reference  provides a typical example and additional references.

4. Computing the Importance Score Eigenvector. The rough idea behind the power method² for computing an eigenvector of a matrix M is this: one starts with a "typical" vector $\mathbf{x}_0$, then generates the sequence $\mathbf{x}_k = M\mathbf{x}_{k-1}$ (so $\mathbf{x}_k = M^k\mathbf{x}_0$) and lets k approach infinity. The vector $\mathbf{x}_k$ is, to good approximation, an eigenvector for the dominant (largest magnitude) eigenvalue of M. However, depending on the magnitude of this eigenvalue, the vector $\mathbf{x}_k$ may also grow without bound or decay to the zero vector. One thus typically rescales at each iteration, say by computing $\mathbf{x}_k = \frac{M\mathbf{x}_{k-1}}{\|M\mathbf{x}_{k-1}\|}$, where $\|\cdot\|$ can be any vector norm. The method generally requires that the corresponding eigenspace be one-dimensional, a condition that is satisfied in the case when M is defined by equation (3.1).

To use the power method on the matrices M that arise from the web ranking problem we would generally need to know that any other eigenvalues $\lambda$ of M satisfy $|\lambda| < 1$.
This assures that the power method will converge to the eigenvector we want. Actually, the following proposition provides what we need, with no reference to any other eigenvalues of M!

Definition 4.1. The 1-norm of a vector $\mathbf{v}$ is $\|\mathbf{v}\|_1 = \sum_i |v_i|$.

Proposition 4. Let M be a positive column-stochastic $n \times n$ matrix and let V denote the subspace of $\mathbb{R}^n$ consisting of vectors $\mathbf{v}$ such that $\sum_j v_j = 0$. Then $M\mathbf{v} \in V$ for any $\mathbf{v} \in V$, and
$$\|M\mathbf{v}\|_1 \le c\|\mathbf{v}\|_1$$
for any $\mathbf{v} \in V$, where $c = \max_{1 \le j \le n} |1 - 2\min_{1 \le i \le n} M_{ij}| < 1$.

Proof. To see that $M\mathbf{v} \in V$ is straightforward: let $\mathbf{w} = M\mathbf{v}$, so that $w_i = \sum_{j=1}^n M_{ij}v_j$ and
$$\sum_{i=1}^n w_i = \sum_{i=1}^n \sum_{j=1}^n M_{ij}v_j = \sum_{j=1}^n v_j \left(\sum_{i=1}^n M_{ij}\right) = \sum_{j=1}^n v_j = 0.$$
Hence $\mathbf{w} = M\mathbf{v} \in V$. To prove the bound in the proposition note that
$$\|\mathbf{w}\|_1 = \sum_{i=1}^n e_i w_i = \sum_{i=1}^n e_i \left(\sum_{j=1}^n M_{ij}v_j\right),$$
where $e_i = \mathrm{sgn}(w_i)$. Note that the $e_i$ are not all of one sign, since $\sum_i w_i = 0$ (unless $\mathbf{w} \equiv \mathbf{0}$, in which case the bound clearly holds). Reverse the double sum to obtain
$$\|\mathbf{w}\|_1 = \sum_{j=1}^n v_j \left(\sum_{i=1}^n e_i M_{ij}\right) = \sum_{j=1}^n a_j v_j, \qquad (4.1)$$
where $a_j = \sum_{i=1}^n e_i M_{ij}$. Since the $e_i$ are of mixed sign and $\sum_i M_{ij} = 1$ with $0 < M_{ij} < 1$, it is easy to see that
$$-1 < -1 + 2\min_{1 \le i \le n} M_{ij} \le a_j \le 1 - 2\min_{1 \le i \le n} M_{ij} < 1.$$
We can thus bound
$$|a_j| \le \Big|1 - 2\min_{1 \le i \le n} M_{ij}\Big| < 1.$$
Let $c = \max_{1 \le j \le n} |1 - 2\min_{1 \le i \le n} M_{ij}|$. Observe that c < 1 and $|a_j| \le c$ for all j. From equation (4.1) we have
$$\|\mathbf{w}\|_1 = \Big|\sum_{j=1}^n a_j v_j\Big| \le \sum_{j=1}^n |a_j||v_j| \le c\sum_{j=1}^n |v_j| = c\|\mathbf{v}\|_1,$$
which proves the proposition.

Proposition 4 sets the stage for the following proposition.

Proposition 5. Every positive column-stochastic matrix M has a unique vector $\mathbf{q}$ with positive components such that $M\mathbf{q} = \mathbf{q}$ with $\|\mathbf{q}\|_1 = 1$.

²See  for a general introduction to the power method and the use of spectral decomposition to find the rate of convergence of the vectors $\mathbf{x}_k = M^k\mathbf{x}_0$.
The vector $\mathbf{q}$ can be computed as $\mathbf{q} = \lim_{k \to \infty} M^k\mathbf{x}_0$ for any initial guess $\mathbf{x}_0$ with positive components such that $\|\mathbf{x}_0\|_1 = 1$.

Proof. From Proposition 1 the matrix M has 1 as an eigenvalue and by Lemma 3.2 the subspace $V_1(M)$ is one-dimensional. Also, all non-zero vectors in $V_1(M)$ have entirely positive or negative components. It is clear that there is a unique vector $\mathbf{q} \in V_1(M)$ with positive components such that $\sum_i q_i = 1$.

Let $\mathbf{x}_0$ be any vector in $\mathbb{R}^n$ with positive components such that $\|\mathbf{x}_0\|_1 = 1$. We can write $\mathbf{x}_0 = \mathbf{q} + \mathbf{v}$ where $\mathbf{v} \in V$ (V as in Proposition 4). We find that $M^k\mathbf{x}_0 = M^k\mathbf{q} + M^k\mathbf{v} = \mathbf{q} + M^k\mathbf{v}$. As a result
$$M^k\mathbf{x}_0 - \mathbf{q} = M^k\mathbf{v}. \qquad (4.2)$$
A straightforward induction and Proposition 4 show that $\|M^k\mathbf{v}\|_1 \le c^k\|\mathbf{v}\|_1$ for $0 \le c < 1$ (c as in Proposition 4) and so $\lim_{k \to \infty}\|M^k\mathbf{v}\|_1 = 0$. From equation (4.2) we conclude that $\lim_{k \to \infty} M^k\mathbf{x}_0 = \mathbf{q}$.

Example: Let M be the matrix defined by equation (3.3) for the web of Figure 2.2. We take $\mathbf{x}_0 = [0.24, 0.31, 0.08, 0.18, 0.19]^T$ as an initial guess; recall that we had $\mathbf{q} = [0.2, 0.2, 0.285, 0.285, 0.03]^T$. The table below shows the value of $\|M^k\mathbf{x}_0 - \mathbf{q}\|_1$ for several values of k, as well as the ratio $\|M^k\mathbf{x}_0 - \mathbf{q}\|_1 / \|M^{k-1}\mathbf{x}_0 - \mathbf{q}\|_1$. Compare this ratio to c from Proposition 4, which in this case is 0.94.

k     ‖M^k x0 − q‖1     ‖M^k x0 − q‖1 / ‖M^(k−1) x0 − q‖1
0     0.62
1     0.255             0.41
5     0.133             0.85
10    0.0591            0.85
50    8.87 × 10⁻⁵       0.85

It is clear that the bound $\|M^k\mathbf{x}_0 - \mathbf{q}\|_1 \le c^k\|\mathbf{x}_0 - \mathbf{q}\|_1$ is rather pessimistic (note 0.85 is the value 1 − m, and 0.85 turns out to be the second largest eigenvalue for M). One can show that in general the power method will converge asymptotically according to $\|M\mathbf{x}_k - \mathbf{q}\|_1 \approx |\lambda_2| \|\mathbf{x}_k - \mathbf{q}\|_1$, where $\lambda_2$
Moreover, for $M$ of the form $M = (1 - m)A + mS$ with $A$ column-stochastic and all $S_{ij} = 1/n$, it can be shown that $|\lambda_2| \le 1 - m$ (see, e.g., [ ], Theorem 5.10). As a result, the power method will converge much more rapidly than indicated by $c^k \|x_0 - q\|_1$. Nonetheless, the value of $c$ in Proposition 4 provides a very simple bound on the convergence of the power method here. It is easy to see that since all entries of $M$ are at least $m/n$, we will always have $c \le 1 - 2m/n$ in Proposition 4.

As a practical matter, note that the $n \times n$ positive matrix $M$ has no zero elements, so the multiplication $Mv$ for $v \in \mathbb{R}^n$ will typically take $O(n^2)$ multiplications and additions, a formidable computation if $n = 8{,}000{,}000{,}000$. But equation (3.2) shows that if $x$ is positive with $\|x\|_1 = 1$ then the multiplication $Mx$ is equivalent to $(1 - m)Ax + m\mathbf{s}$. This is a far more efficient computation, since $A$ can be expected to contain mostly zeros (most web pages link to only a few other pages). We've now proved our main theorem:

Theorem 4.2. The matrix $M$ defined by (3.1) for a web with no dangling nodes will always be a positive column-stochastic matrix and so has a unique $q$ with positive components such that $Mq = q$ and $\sum_i q_i = 1$. The vector $q$ may be computed as the limit of the iterations $x_k = (1 - m)A x_{k-1} + m\mathbf{s}$, where $x_0$ is any initial vector with positive components and $\|x_0\|_1 = 1$.

The eigenvector $x$ defined by equation (3.2) also has a probabilistic interpretation. Consider a web-surfer on a web of $n$ pages with no dangling nodes.
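As an aside on the computation in Theorem 4.2, the iteration $x_k = (1 - m)Ax_{k-1} + m\mathbf{s}$ can be sketched so that only the nonzero links of $A$ are ever touched. The tiny 4-page web and the choice $m = 0.15$ below are hypothetical illustrations.

```python
# Sketch of the iteration x_k = (1 - m) A x_{k-1} + m s from Theorem 4.2,
# storing only the nonzero links of A. The 4-page web is hypothetical;
# every page has at least one outgoing link (no dangling nodes).

m = 0.15                                             # damping parameter m
out_links = {0: [1, 2], 1: [2], 2: [0], 3: [0, 2]}   # page j -> pages j links to
n = len(out_links)

x = [1.0 / n] * n                          # positive start with ||x||_1 = 1
for _ in range(100):
    x_new = [m / n] * n                    # the m*s term, s = (1/n, ..., 1/n)
    for j, targets in out_links.items():
        share = (1 - m) * x[j] / len(targets)   # column j of A has entries 1/|targets|
        for i in targets:
            x_new[i] += share              # touch only the nonzero A_ij
    x = x_new

print(abs(sum(x) - 1.0) < 1e-9)            # the iteration preserves the 1-norm
```

Each sweep costs on the order of the number of links rather than $O(n^2)$, which is precisely the point of equation (3.2).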
The surfer begins at some web page (it doesn't matter where) and randomly moves from web page to web page according to the following procedure: if the surfer is currently at a page with $r$ outgoing links, he either randomly chooses any one of these links with uniform probability $\frac{1-m}{r}$, OR he jumps to any randomly selected page on the web, each with probability $\frac{m}{n}$ (note that $r \cdot \frac{1-m}{r} + n \cdot \frac{m}{n} = 1$, so this accounts for everything he can do). The surfer repeats this page-hopping procedure ad infinitum. The component $x_j$ of the normalized vector $x$ in equation (3.2) is the fraction of time that the surfer spends, in the long run, on page $j$ of the web. More important pages tend to be linked to by many other pages, and so the surfer hits those most often.

Exercise 14. For the web in Exercise 11, compute the values of $\|M^k x_0 - q\|_1$ and $\|M^k x_0 - q\|_1 / \|M^{k-1} x_0 - q\|_1$ for $k = 1, 5, 10, 50$, using an initial guess $x_0$ not too close to the actual eigenvector $q$ (so that you can watch the convergence). Determine $c = \max_{1 \le j \le n} |1 - 2\min_{1 \le i \le n} M_{ij}|$ and the absolute value of the second largest eigenvalue of $M$.

Exercise 15. To see why the second largest eigenvalue plays a role in bounding $\|M^k x_0 - q\|_1 / \|M^{k-1} x_0 - q\|_1$, consider an $n \times n$ positive column-stochastic matrix $M$ that is diagonalizable. Let $x_0$ be any vector with non-negative components that sum to one. Since $M$ is diagonalizable, we can create a basis of eigenvectors $\{q, v_1, \ldots, v_{n-1}\}$, where $q$ is the steady state vector, and then write $x_0 = aq + \sum_{k=1}^{n-1} b_k v_k$. Determine $M^k x_0$, and then show that $a = 1$ and the sum of the components of each $v_k$ must equal 0. Next apply Proposition 4 to prove that, except for the non-repeated eigenvalue $\lambda = 1$, the other eigenvalues are all strictly less than one in absolute value. Use this to evaluate $\lim_{k \to \infty} \|M^k x_0 - q\|_1 / \|M^{k-1} x_0 - q\|_1$.

Exercise 16.
Consider the link matrix
\[
A = \begin{bmatrix} 0 & \tfrac{1}{2} & \tfrac{1}{2} \\ 0 & 0 & \tfrac{1}{2} \\ 1 & \tfrac{1}{2} & 0 \end{bmatrix}.
\]
Show that $M = (1 - m)A + mS$ (all $S_{ij} = 1/3$) is not diagonalizable for $0 \le m < 1$.

Exercise 17. How should the value of $m$ be chosen? How does this choice affect the rankings and the computation time?

REFERENCES

A. Berman and R. Plemmons, Nonnegative Matrices in the Mathematical Sciences, Academic Press, New York, 1979.
M. W. Berry and M. Browne, Understanding Search Engines: Mathematical Modeling and Text Retrieval, Second Edition, SIAM, Philadelphia, 2005.
M. Bianchini, M. Gori, and F. Scarselli, Inside PageRank, ACM Trans. Internet Tech., 5 (2005), pp. 92-128.
S. Brin and L. Page, The anatomy of a large-scale hypertextual web search engine, http://www-db.stanford.edu/~backrub/google.html (accessed August 1, 2005).
A. Farahat, T. Lofaro, J. C. Miller, G. Rae, and L. A. Ward, Authority Rankings from HITS, PageRank, and SALSA: Existence, Uniqueness, and Effect of Initialization, SIAM J. Sci. Comput., 27 (2006), pp. 1181-1201.
A. Hill, Google Inside Out, Maximum PC, April 2004, pp. 44-48.
S. Kamvar, T. Haveliwala, and G. Golub, Adaptive methods for the computation of PageRank, Linear Algebra Appl., 386 (2004), pp. 51-65.
A. N. Langville and C. D. Meyer, A survey of eigenvector methods of web information retrieval, SIAM Review, 47 (2005), pp. 135-161.
A. N. Langville and C. D. Meyer, Deeper inside PageRank, Internet Math., 1 (2005), pp. 335-380.
C. D. Meyer, Matrix Analysis and Applied Linear Algebra, SIAM, Philadelphia, 2000.
C. Moler, The world's largest matrix computation, http://www.mathworks.com/company/newsletters/news_notes/clevescorner/oct02_cleve.html (accessed August 1, 2005).
J. Mostafa, Seeking better web searches, Sci. Amer., 292 (2005), pp. 66-73.
S. Robinson, The ongoing search for efficient web search algorithms, SIAM News, 37 (Nov 2004).
I. Rogers, The Google Pagerank algorithm and how it works, http://www.iprcom.com/papers/pagerank/ (accessed August 1, 2005).
W. J. Stewart, An Introduction to the Numerical Solution of Markov Chains, Princeton University Press, Princeton, 1994.`
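As a closing aside, the random-surfer interpretation of the eigenvector described above can be checked by direct simulation: follow a uniformly chosen outgoing link with probability $1 - m$, teleport to a uniformly chosen page with probability $m$, and record long-run visit frequencies. The 3-page web and $m = 0.15$ below are hypothetical illustrations, and the frequencies are only statistical estimates of the components $x_j$.

```python
import random

# Monte-Carlo sketch of the random-surfer model on a hypothetical 3-page web:
# with probability m teleport to a random page (m/n each), otherwise follow
# a uniformly chosen outgoing link ((1 - m)/r each).

random.seed(0)
m = 0.15
links = {0: [1, 2], 1: [2], 2: [0]}   # page j -> pages j links to
n = len(links)

steps = 200_000
visits = [0] * n
page = 0
for _ in range(steps):
    if random.random() < m:
        page = random.randrange(n)          # teleport step
    else:
        page = random.choice(links[page])   # follow a link
    visits[page] += 1

freq = [v / steps for v in visits]          # estimates the components x_j
print([round(f, 2) for f in freq])
```

Page 1, which only page 0 links to, should be visited least often; pages 0 and 2, which sit on the 2-cycle, should dominate.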