THE $25,000,000,000∗ EIGENVECTOR
THE LINEAR ALGEBRA BEHIND GOOGLE

KURT BRYAN† AND TANYA LEISE‡

Abstract. Google's success derives in large part from its PageRank algorithm, which ranks the importance of webpages according to an eigenvector of a weighted link matrix. Analysis of the PageRank formula provides a wonderful applied topic for a linear algebra course. Instructors may assign this article as a project to more advanced students, or spend one or two lectures presenting the material with assigned homework from the exercises. This material also complements the discussion of Markov chains in matrix algebra. Maple and Mathematica files supporting this material can be found at www.rose-hulman.edu/~bryan.

Key words. linear algebra, PageRank, eigenvector, stochastic matrix

AMS subject classifications. 15-01, 15A18, 15A51
1. Introduction. When Google went online in the late 1990s, one thing that set it apart from other search engines was that its search result listings always seemed to deliver the "good stuff" up front. With other search engines you often had to wade through screen after screen of links to irrelevant web pages that just happened to match the search text. Part of the magic behind Google is its PageRank algorithm, which quantitatively rates the importance of each page on the web, allowing Google to rank the pages and thereby present to the user the more important (and typically most relevant and helpful) pages first.

Understanding how to calculate PageRank is essential for anyone designing a web page that they want people to access frequently, since getting listed first in a Google search leads to many people looking at your page. Indeed, due to Google's prominence as a search engine, its ranking system has had a deep influence on the development and structure of the internet, and on what kinds of information and services get accessed most frequently. Our goal in this paper is to explain one of the core ideas behind how Google calculates web page rankings. This turns out to be a delightful application of standard linear algebra.
Search engines such as Google have to do three basic things:
1. Crawl the web and locate all web pages with public access.
2. Index the data from step 1, so that it can be searched efficiently for relevant keywords or phrases.
3. Rate the importance of each page in the database, so that when a user does a search and the subset of pages in the database with the desired information has been found, the more important pages can be presented first.

This paper will focus on step 3. In an interconnected web of pages, how can one meaningfully define and quantify the "importance" of any given page?
The rated importance of web pages is not the only factor in how links are presented, but it is a significant one. There are also successful ranking algorithms other than PageRank. The interested reader will find a wealth of information about ranking algorithms and search engines, and we list just a few references for getting started (see the extensive bibliography in [9], for example, for a more complete list). For a brief overview of how Google handles the entire process see [6], and for an in-depth treatment of PageRank see [3] and a companion article [9]. Another article with good concrete examples is [5]. For more background on PageRank and explanations of essential principles of web design to maximize a website's PageRank, go to the websites [4, 11, 14]. To find out more about search engine principles in general and other ranking algorithms, see [2] and [8]. Finally, for an account of some newer approaches to searching the web, see [12] and [13].
2. Developing a formula to rank pages.
∗The approximate market value of Google when the company went public in 2004.
†Department of Mathematics, Rose-Hulman Institute of Technology, Terre Haute, IN 47803; email: [email protected]; phone: (812) 877-8485; fax: (812) 877-8883.
‡Mathematics and Computer Science Department, Amherst College, Amherst, MA 01002; email: [email protected]; phone: (413) 542-5411; fax: (413) 542-2550.
[Figure 2.1: An example of a web with only four pages. An arrow from page A to page B indicates a link from page A to page B. In this web, page 1 links to pages 2, 3, and 4; page 2 links to pages 3 and 4; page 3 links to page 1; and page 4 links to pages 1 and 3.]
2.1. The basic idea. In what follows we will use the phrase "importance score" or just "score" for any quantitative rating of a web page's importance. The importance score for any web page will always be a non-negative real number. A core idea in assigning a score to any given web page is that the page's score is derived from the links made to that page from other web pages. The links to a given page are called the backlinks for that page. The web thus becomes a democracy where pages vote for the importance of other pages by linking to them.

Suppose the web of interest contains n pages, each page indexed by an integer k, 1 ≤ k ≤ n. A typical example is illustrated in Figure 2.1, in which an arrow from page A to page B indicates a link from page A to page B. Such a web is an example of a directed graph.¹ We'll use x_k to denote the importance score of page k in the web. The x_k is non-negative and x_j > x_k indicates that page j is more important than page k (so x_j = 0 indicates page j has the least possible importance score).

A very simple approach is to take x_k as the number of backlinks for page k. In the example in Figure 2.1, we have x_1 = 2, x_2 = 1, x_3 = 3, and x_4 = 2, so that page 3 would be the most important, pages 1 and 4 tie for second, and page 2 is least important. A link to page k becomes a vote for page k's importance.
This approach ignores an important feature one would expect a ranking algorithm to have, namely, that a link to page k from an important page should boost page k's importance score more than a link from an unimportant page. For example, a link to your homepage directly from Yahoo ought to boost your page's score much more than a link from, say, www.kurtbryan.com (no relation to the author). In the web of Figure 2.1, pages 1 and 4 both have two backlinks: each links to the other, but page 1's second backlink is from the seemingly important page 3, while page 4's second backlink is from the relatively unimportant page 1. As such, perhaps we should rate page 1's importance higher than that of page 4.
As a first attempt at incorporating this idea let's compute the score of page j as the sum of the scores of all pages linking to page j. For example, consider the web of Figure 2.1. The score of page 1 would be determined by the relation x_1 = x_3 + x_4. Since x_3 and x_4 will depend on x_1 this scheme seems strangely self-referential, but it is the approach we will use, with one more modification. Just as in elections, we don't want a single individual to gain influence merely by casting multiple votes. In the same vein, we seek a scheme in which a web page doesn't gain extra influence simply by linking to lots of other pages. If page j contains n_j links, one of which links to page k, then we will boost page k's score by x_j/n_j, rather than by x_j. In this scheme each web page gets a total of one vote, weighted by that web page's score, that is evenly divided up among all of its outgoing links. To quantify this for a web of n pages, let L_k ⊂ {1, 2, . . . , n} denote the set of pages with a link to page k, that is, L_k is the set of page k's backlinks. For each k we require

\[
x_k = \sum_{j \in L_k} \frac{x_j}{n_j}, \tag{2.1}
\]

where n_j is the number of outgoing links from page j (which must be positive since if j ∈ L_k then page j links to at least page k!). We will assume that a link from a page to itself will not be counted. In this "democracy of the web" you don't get to vote for yourself!

¹A graph consists of a set of vertices (in this context, the web pages) and a set of edges. Each edge joins a pair of vertices. The graph is undirected if the edges have no direction. The graph is directed if each edge (in the web context, the links) has a direction, that is, a starting and ending vertex.

[Figure 2.2: A web of five pages, consisting of two disconnected "subwebs" W_1 (pages 1 and 2) and W_2 (pages 3, 4, 5). Pages 1 and 2 link to each other; pages 3 and 4 link to each other; page 5 links to pages 3 and 4.]
Let's apply this approach to the four-page web of Figure 2.1. For page 1 we have x_1 = x_3/1 + x_4/2, since pages 3 and 4 are backlinks for page 1 and page 3 contains only one link, while page 4 contains two links (splitting its vote in half). Similarly, x_2 = x_1/3, x_3 = x_1/3 + x_2/2 + x_4/2, and x_4 = x_1/3 + x_2/2. These linear equations can be written Ax = x, where x = [x_1 x_2 x_3 x_4]^T and

\[
A = \begin{bmatrix}
0 & 0 & 1 & \frac{1}{2} \\
\frac{1}{3} & 0 & 0 & 0 \\
\frac{1}{3} & \frac{1}{2} & 0 & \frac{1}{2} \\
\frac{1}{3} & \frac{1}{2} & 0 & 0
\end{bmatrix}. \tag{2.2}
\]

This transforms the web ranking problem into the "standard" problem of finding an eigenvector for a square matrix! (Recall that the eigenvalues λ and eigenvectors x of a matrix A satisfy the equation Ax = λx, x ≠ 0 by definition.) We thus seek an eigenvector x with eigenvalue 1 for the matrix A. We will refer to A as the "link matrix" for the given web.
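To make this concrete, here is a brief sketch (ours, not part of the original article, which instead provides supporting Maple and Mathematica files) that builds the link matrix of formula (2.1) from a dictionary of outgoing links; the function name and the representation of the web are invented for illustration.

```python
import numpy as np

def link_matrix(links, n):
    """Build the n-by-n link matrix of formula (2.1):
    A[i, j] = 1/n_j if page j links to page i, else 0,
    where n_j is the number of outgoing links from page j.
    Pages are numbered 1..n; self-links are ignored."""
    A = np.zeros((n, n))
    for j, targets in links.items():
        targets = [k for k in targets if k != j]  # no voting for yourself
        for k in targets:
            A[k - 1, j - 1] = 1.0 / len(targets)
    return A

# The four-page web of Figure 2.1 reproduces the matrix in (2.2).
links = {1: [2, 3, 4], 2: [3, 4], 3: [1], 4: [1, 3]}
print(link_matrix(links, 4))
```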
It turns out that the link matrix A in equation (2.2) does indeed have eigenvectors with eigenvalue 1, namely, all multiples of the vector [12 4 9 6]^T (recall that any non-zero multiple of an eigenvector is again an eigenvector). Let's agree to scale these "importance score eigenvectors" so that the components sum to 1. In this case we obtain x_1 = 12/31 ≈ 0.387, x_2 = 4/31 ≈ 0.129, x_3 = 9/31 ≈ 0.290, and x_4 = 6/31 ≈ 0.194. Note that this ranking differs from that generated by simply counting backlinks. It might seem surprising that page 3, linked to by all other pages, is not the most important. To understand this, note that page 3 links only to page 1 and so casts its entire vote for page 1. This, with the vote of page 2, results in page 1 getting the highest importance score.
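A quick numerical check of this computation (our own sketch, not part of the article) is to ask a linear algebra library for an eigenvector of A with eigenvalue 1 and rescale it so its components sum to one:

```python
import numpy as np

A = np.array([[0,   0,   1, 1/2],
              [1/3, 0,   0, 0  ],
              [1/3, 1/2, 0, 1/2],
              [1/3, 1/2, 0, 0  ]])

vals, vecs = np.linalg.eig(A)
v = vecs[:, np.argmin(np.abs(vals - 1))].real  # eigenvector for eigenvalue 1
x = v / v.sum()                                # scale components to sum to 1
print(np.round(x, 3))  # [0.387 0.129 0.29  0.194], i.e. (12, 4, 9, 6)/31
```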
More generally, the matrix A for any web must have 1 as an eigenvalue if the web in question has no dangling nodes (pages with no outgoing links). To see this, first note that for a general web of n pages formula (2.1) gives rise to a matrix A with A_ij = 1/n_j if page j links to page i, A_ij = 0 otherwise. The jth column of A then contains n_j non-zero entries, each equal to 1/n_j, and the column thus sums to 1. This motivates the following definition, used in the study of Markov chains:

Definition 2.1. A square matrix is called a column-stochastic matrix if all of its entries are nonnegative and the entries in each column sum to one.

The matrix A for a web with no dangling nodes is column-stochastic. We now prove

Proposition 1. Every column-stochastic matrix has 1 as an eigenvalue.

Proof. Let A be an n × n column-stochastic matrix and let e denote an n-dimensional column vector with all entries equal to 1. Recall that A and its transpose A^T have the same eigenvalues. Since A is column-stochastic it is easy to see that A^T e = e, so that 1 is an eigenvalue for A^T and hence for A. □

In what follows we use V_1(A) to denote the eigenspace for eigenvalue 1 of a column-stochastic matrix A.
2.2. Shortcomings. Several difficulties arise with using formula (2.1) to rank websites. In this section we discuss two issues: webs with non-unique rankings and webs with dangling nodes.

2.2.1. Non-Unique Rankings. For our rankings it is desirable that the dimension of V_1(A) equal one, so that there is a unique eigenvector x with Σ_i x_i = 1 that we can use for importance scores. This is true in the web of Figure 2.1 and more generally is always true for the special case of a strongly connected web (that is, you can get from any page to any other page in a finite number of steps); see Exercise 10 below.

Unfortunately, it is not always true that the link matrix A will yield a unique ranking for all webs. Consider the web in Figure 2.2, for which the link matrix is

\[
A = \begin{bmatrix}
0 & 1 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & \frac{1}{2} \\
0 & 0 & 1 & 0 & \frac{1}{2} \\
0 & 0 & 0 & 0 & 0
\end{bmatrix}.
\]

We find here that V_1(A) is two-dimensional; one possible pair of basis vectors is x = [1/2, 1/2, 0, 0, 0]^T and y = [0, 0, 1/2, 1/2, 0]^T. But note that any linear combination of these two vectors yields another vector in V_1(A), e.g., (3/4)x + (1/4)y = [3/8, 3/8, 1/8, 1/8, 0]^T. It is not clear which, if any, of these eigenvectors we should use for the rankings!
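The dimension of V_1(A) is easy to confirm numerically: it is the dimension of the null space of A − I, which can be counted as the number of (near-)zero singular values of A − I. A small sketch of ours:

```python
import numpy as np

A = np.array([[0, 1, 0, 0, 0  ],
              [1, 0, 0, 0, 0  ],
              [0, 0, 0, 1, 1/2],
              [0, 0, 1, 0, 1/2],
              [0, 0, 0, 0, 0  ]])

# dim V_1(A) = dim null(A - I) = number of zero singular values of A - I.
s = np.linalg.svd(A - np.eye(5), compute_uv=False)
print(np.sum(np.isclose(s, 0)))  # 2
```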
It is no coincidence that for the web of Figure 2.2 we find that dim(V_1(A)) > 1. It is a consequence of the fact that if a web W, considered as an undirected graph (ignoring which direction each arrow points), consists of r disconnected subwebs W_1, . . . , W_r, then dim(V_1(A)) ≥ r, and hence there is no unique importance score vector x ∈ V_1(A) with Σ_i x_i = 1. This makes intuitive sense: if a web W consists of r disconnected subwebs W_1, . . . , W_r then one would expect difficulty in finding a common reference frame for comparing the scores of pages in one subweb with those in another subweb.

Indeed, it is not hard to see why a web W consisting of r disconnected subwebs forces dim(V_1(A)) ≥ r. Suppose a web W has n pages and r component subwebs W_1, . . . , W_r. Let n_i denote the number of pages in W_i. Index the pages in W_1 with indices 1 through n_1, the pages in W_2 with indices n_1 + 1 through n_1 + n_2, the pages in W_3 with n_1 + n_2 + 1 through n_1 + n_2 + n_3, etc. In general, let N_i = Σ_{j=1}^{i} n_j for i ≥ 1, with N_0 = 0, so W_i contains pages N_{i−1} + 1 through N_i. For example, in the web of Figure 2.2 we can take N_1 = 2 and N_2 = 5, so W_1 contains pages 1 and 2, W_2 contains pages 3, 4, and 5. The web in Figure 2.2 is a particular example of the general case, in which the matrix A assumes a block diagonal structure
\[
A = \begin{bmatrix}
A_1 & 0 & \cdots & 0 \\
0 & A_2 & & 0 \\
\vdots & & \ddots & \vdots \\
0 & 0 & \cdots & A_r
\end{bmatrix},
\]
where A_i denotes the link matrix for W_i. In fact, W_i can be considered as a web in its own right. Each n_i × n_i matrix A_i is column-stochastic, and hence possesses some eigenvector v^i ∈ ℝ^{n_i} with eigenvalue 1. For each i between 1 and r construct a vector w^i ∈ ℝ^n which has 0 components for all elements corresponding to blocks other than block i. For example,

\[
w^1 = \begin{bmatrix} v^1 \\ 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \quad
w^2 = \begin{bmatrix} 0 \\ v^2 \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \quad \ldots
\]

Then it is easy to see that the vectors w^i, 1 ≤ i ≤ r, are linearly independent eigenvectors for A with eigenvalue 1 because

\[
A w^i = A \begin{bmatrix} 0 \\ \vdots \\ 0 \\ v^i \\ 0 \\ \vdots \\ 0 \end{bmatrix} = w^i.
\]

Thus V_1(A) has dimension at least r.
2.2.2. Dangling Nodes. Another difficulty may arise when using the matrix A to generate rankings. A web with dangling nodes produces a matrix A which contains one or more columns of all zeros. In this case A is column-substochastic, that is, the column sums of A are all less than or equal to one. Such a matrix must have all eigenvalues less than or equal to 1 in magnitude, but 1 need not actually be an eigenvalue for A. Nevertheless, the pages in a web with dangling nodes can still be ranked using a similar technique. The corresponding substochastic matrix must have a positive eigenvalue λ ≤ 1 and a corresponding eigenvector x with non-negative entries (called the Perron eigenvector) that can be used to rank the web pages. See Exercise 4 below. We will not further consider the problem of dangling nodes here, however.
Exercise 1. Suppose the people who own page 3 in the web of Figure 2.1 are infuriated by the fact that its importance score, computed using formula (2.1), is lower than the score of page 1. In an attempt to boost page 3's score, they create a page 5 that links to page 3; page 3 also links to page 5. Does this boost page 3's score above that of page 1?

Exercise 2. Construct a web consisting of three or more subwebs and verify that dim(V_1(A)) equals (or exceeds) the number of the components in the web.

Exercise 3. Add a link from page 5 to page 1 in the web of Figure 2.2. The resulting web, considered as an undirected graph, is connected. What is the dimension of V_1(A)?

Exercise 4. In the web of Figure 2.1, remove the link from page 3 to page 1. In the resulting web page 3 is now a dangling node. Set up the corresponding substochastic matrix and find its largest positive (Perron) eigenvalue. Find a non-negative Perron eigenvector for this eigenvalue, and scale the vector so that the components sum to one. Does the resulting ranking seem reasonable?

Exercise 5. Prove that in any web the importance score of a page with no backlinks is zero.

Exercise 6. Implicit in our analysis up to this point is the assertion that the manner in which the pages of a web W are indexed has no effect on the importance score assigned to any given page. Prove this, as follows: Let W contain n pages, each page assigned an index 1 through n, and let A be the resulting link matrix. Suppose we then transpose the indices of pages i and j (so page i is now page j and vice-versa). Let Ã be the link matrix for the relabelled web.
• Argue that Ã = PAP, where P is the elementary matrix obtained by transposing rows i and j of the n × n identity matrix. Note that the operation A → PA has the effect of swapping rows i and j of A, while A → AP swaps columns i and j. Also, P² = I, the identity matrix.
• Suppose that x is an eigenvector for A, so Ax = λx for some λ. Show that y = Px is an eigenvector for Ã with eigenvalue λ.
• Explain why this shows that transposing the indices of any two pages leaves the importance scores unchanged, and use this result to argue that any permutation of the page indices leaves the importance scores unchanged.
3. A remedy for dim(V_1(A)) > 1. An enormous amount of computing resources are needed to determine an eigenvector for the link matrix corresponding to a web containing billions of pages. It is thus important to know that our algorithm will yield a unique set of sensible web rankings. The analysis above shows that our first attempt to rank web pages leads to difficulties if the web isn't connected. And the worldwide web, treated as an undirected graph, contains many disjoint components; see [9] for some interesting statistics concerning the structure of the web.

Below we present and analyze a modification of the above method that is guaranteed to overcome this shortcoming. The analysis that follows is basically a special case of the Perron-Frobenius theorem, and we only prove what we need for this application. For a full statement and proof of the Perron-Frobenius theorem, see chapter 8 in [10].
3.1. A modification to the link matrix A. For an n-page web with no dangling nodes we can generate unambiguous importance scores as follows, including cases of webs with multiple subwebs.

Let S denote an n × n matrix with all entries 1/n. The matrix S is column-stochastic, and it is easy to check that V_1(S) is one-dimensional. We will replace the matrix A with the matrix

\[
M = (1 - m)A + mS, \tag{3.1}
\]

where 0 ≤ m ≤ 1. M is a weighted average of A and S. The value of m originally used by Google is reportedly 0.15 [9, 11]. For any m ∈ [0, 1] the matrix M is column-stochastic and we show below that V_1(M) is always one-dimensional if m ∈ (0, 1]. Thus M can be used to compute unambiguous importance scores. In the case when m = 0 we have the original problem, for then M = A. At the other extreme is m = 1, yielding M = S. This is the ultimately egalitarian case: the only normalized eigenvector x with eigenvalue 1 has x_i = 1/n for all i and all web pages are rated equally important.

Using M in place of A gives a web page with no backlinks the importance score of m/n (Exercise 9), and for a web with dangling nodes the matrix M is substochastic for any m < 1 since the matrix A is substochastic. Therefore the modified formula yields nonzero importance scores for dangling links (if m > 0) but does not resolve the issue of dangling nodes. In the remainder of this article, we only consider webs with no dangling nodes.
The equation x = Mx can also be cast as

\[
x = (1 - m)Ax + ms, \tag{3.2}
\]

where s is a column vector with all entries 1/n. Note that Sx = s if Σ_i x_i = 1.
We will prove below that V_1(M) is always one-dimensional, but first let's look at a couple of examples.

Example 1: For the web of four pages in Figure 2.1 with matrix A given by (2.2), the new formula gives (with m = 0.15)

\[
M = \begin{bmatrix}
0.0375 & 0.0375 & 0.8875 & 0.4625 \\
0.3208\overline{3} & 0.0375 & 0.0375 & 0.0375 \\
0.3208\overline{3} & 0.4625 & 0.0375 & 0.4625 \\
0.3208\overline{3} & 0.4625 & 0.0375 & 0.0375
\end{bmatrix},
\]

and yields importance scores x_1 ≈ 0.368, x_2 ≈ 0.142, x_3 ≈ 0.288, and x_4 ≈ 0.202. This yields the same ranking of pages as the earlier computation, but the scores are slightly different.
Example 2 shows more explicitly the advantages of using M in place of A.

Example 2: As a second example, for the web of Figure 2.2 with m = 0.15 we obtain the matrix

\[
M = \begin{bmatrix}
0.03 & 0.88 & 0.03 & 0.03 & 0.03 \\
0.88 & 0.03 & 0.03 & 0.03 & 0.03 \\
0.03 & 0.03 & 0.03 & 0.88 & 0.455 \\
0.03 & 0.03 & 0.88 & 0.03 & 0.455 \\
0.03 & 0.03 & 0.03 & 0.03 & 0.03
\end{bmatrix}. \tag{3.3}
\]

The space V_1(M) is indeed one-dimensional, with normalized eigenvector components of x_1 = 0.2, x_2 = 0.2, x_3 = 0.285, x_4 = 0.285, and x_5 = 0.03. The modification, using M instead of A, allows us to compare pages in different subwebs.
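The following sketch (ours; the helper name google_matrix is invented) forms M via equation (3.1) and recovers the scores just quoted:

```python
import numpy as np

def google_matrix(A, m=0.15):
    """Form M = (1 - m) A + m S of equation (3.1); since S has all
    entries 1/n, adding the scalar m/n is the same as adding mS."""
    n = A.shape[0]
    return (1 - m) * A + m / n

A = np.array([[0, 1, 0, 0, 0  ],
              [1, 0, 0, 0, 0  ],
              [0, 0, 0, 1, 1/2],
              [0, 0, 1, 0, 1/2],
              [0, 0, 0, 0, 0  ]])
M = google_matrix(A)

vals, vecs = np.linalg.eig(M)
q = vecs[:, np.argmin(np.abs(vals - 1))].real  # eigenvector for eigenvalue 1
q /= q.sum()                                   # normalize to sum to one
print(np.round(q, 3))  # [0.2   0.2   0.285 0.285 0.03 ]
```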
Each entry M_ij of M defined by equation (3.1) is strictly positive, which motivates the following definition.

Definition 3.1. A matrix M is positive if M_ij > 0 for all i and j.

This is the key property that guarantees dim(V_1(M)) = 1, which we prove in the next section.
3.2. Analysis of the matrix M. Note that Proposition 1 shows that V_1(M) is nonempty since M is stochastic. The goal of this section is to show that V_1(M) is in fact one-dimensional. This is a consequence of the following two propositions.

Proposition 2. If M is positive and column-stochastic, then any eigenvector in V_1(M) has all positive or all negative components.

Proof. We use proof by contradiction. First note that in the standard triangle inequality |Σ_i y_i| ≤ Σ_i |y_i| (with all y_i real) the inequality is strict when the y_i are of mixed sign. Suppose x ∈ V_1(M) contains elements of mixed sign. From x = Mx we have x_i = Σ_{j=1}^n M_ij x_j and the summands M_ij x_j are of mixed sign (since M_ij > 0). As a result we have a strict inequality

\[
|x_i| = \Big|\sum_{j=1}^{n} M_{ij} x_j\Big| < \sum_{j=1}^{n} M_{ij} |x_j|. \tag{3.4}
\]

Sum both sides of inequality (3.4) from i = 1 to i = n, and swap the i and j summations. Then use the fact that M is column-stochastic (Σ_i M_ij = 1 for all j) to find

\[
\sum_{i=1}^{n} |x_i| < \sum_{i=1}^{n}\sum_{j=1}^{n} M_{ij} |x_j| = \sum_{j=1}^{n}\left(\sum_{i=1}^{n} M_{ij}\right)|x_j| = \sum_{j=1}^{n} |x_j|,
\]

a contradiction. Hence x cannot contain both positive and negative elements. If x_i ≥ 0 for all i (and not all x_i are zero) then x_i > 0 follows immediately from x_i = Σ_{j=1}^n M_ij x_j and M_ij > 0. Similarly x_i ≤ 0 for all i implies that each x_i < 0. □
The following proposition will also be useful for analyzing dim(V_1(M)):

Proposition 3. Let v and w be linearly independent vectors in ℝ^m, m ≥ 2. Then, for some values of s and t that are not both zero, the vector x = sv + tw has both positive and negative components.

Proof. Linear independence implies neither v nor w is zero. Let d = Σ_i v_i. If d = 0 then v must contain components of mixed sign, and taking s = 1 and t = 0 yields the conclusion. If d ≠ 0 set s = −(Σ_i w_i)/d, t = 1, and x = sv + tw. Since v and w are independent x ≠ 0. However, Σ_i x_i = 0. We conclude that x has both positive and negative components. □
We can now prove that using M in place of A yields an unambiguous ranking for any web with no dangling nodes.

Lemma 3.2. If M is positive and column-stochastic then V_1(M) has dimension 1.

Proof. We again use proof by contradiction. Suppose there are two linearly independent eigenvectors v and w in the subspace V_1(M). For any real numbers s and t that are not both zero, the nonzero vector x = sv + tw must be in V_1(M), and so have components that are all negative or all positive. But by Proposition 3, for some choice of s and t the vector x must contain components of mixed sign, a contradiction. We conclude that V_1(M) cannot contain two linearly independent vectors, and so has dimension one. □

Lemma 3.2 provides the "punchline" for our analysis of the ranking algorithm using the matrix M (for 0 < m < 1). The space V_1(M) is one-dimensional, and moreover, the relevant eigenvectors have entirely positive or negative components. We are thus guaranteed the existence of a unique eigenvector x ∈ V_1(M) with positive components such that Σ_i x_i = 1.
Exercise 7. Prove that if A is an n × n column-stochastic matrix and 0 ≤ m ≤ 1, then M = (1 − m)A + mS is also a column-stochastic matrix.

Exercise 8. Show that the product of two column-stochastic matrices is also column-stochastic.

Exercise 9. Show that a page with no backlinks is given importance score m/n by formula (3.2).
Exercise 10. Suppose that A is the link matrix for a strongly connected web of n pages (any page can be reached from any other page by following a finite number of links). Show that dim(V_1(A)) = 1 as follows. Let (A^k)_ij denote the (i, j)-entry of A^k.
• Note that page i can be reached from page j in one step if and only if A_ij > 0 (since A_ij > 0 means there's a link from j to i!). Show that (A²)_ij > 0 if and only if page i can be reached from page j in exactly two steps. Hint: (A²)_ij = Σ_k A_ik A_kj; all A_ij are non-negative, so (A²)_ij > 0 implies that for some k both A_ik and A_kj are positive.
• Show more generally that (A^p)_ij > 0 if and only if page i can be reached from page j in EXACTLY p steps.
• Argue that (I + A + A² + · · · + A^p)_ij > 0 if and only if page i can be reached from page j in p or fewer steps (note p = 0 is a legitimate choice: any page can be reached from itself in zero steps!)
• Explain why I + A + A² + · · · + A^{n−1} is a positive matrix if the web is strongly connected.
• Use the last part (and Exercise 8) to show that B = (1/n)(I + A + A² + · · · + A^{n−1}) is positive and column-stochastic (and hence by Lemma 3.2, dim(V_1(B)) = 1).
• Show that if x ∈ V_1(A) then x ∈ V_1(B). Why does this imply that dim(V_1(A)) = 1?
Exercise 11. Consider again the web in Figure 2.1, with the addition of a page 5 that links to page 3, where page 3 also links to page 5. Calculate the new ranking by finding the eigenvector of M (corresponding to λ = 1) that has positive components summing to one. Use m = 0.15.

Exercise 12. Add a sixth page that links to every page of the web in the previous exercise, but to which no other page links. Rank the pages using A, then using M with m = 0.15, and compare the results.

Exercise 13. Construct a web consisting of two or more subwebs and determine the ranking given by formula (3.1).
At present the web contains at least eight billion pages; how does one compute an eigenvector for an eight billion by eight billion matrix? One reasonable approach is an iterative procedure called the power method (along with modifications) that we will now examine for the special case at hand. It is worth noting that there is much additional analysis one can do, and many improved methods for the computation of PageRank. The reference [7] provides a typical example and additional references.
4. Computing the Importance Score Eigenvector. The rough idea behind the power method² for computing an eigenvector of a matrix M is this: One starts with a "typical" vector x_0, then generates the sequence x_k = Mx_{k−1} (so x_k = M^k x_0) and lets k approach infinity. The vector x_k is, to good approximation, an eigenvector for the dominant (largest magnitude) eigenvalue of M. However, depending on the magnitude of this eigenvalue, the vector x_k may also grow without bound or decay to the zero vector. One thus typically rescales at each iteration, say by computing x_k = Mx_{k−1}/‖Mx_{k−1}‖, where ‖ · ‖ can be any vector norm. The method generally requires that the corresponding eigenspace be one-dimensional, a condition that is satisfied in the case when M is defined by equation (3.1).

²See [15] for a general introduction to the power method and the use of spectral decomposition to find the rate of convergence of the vectors x_k = M^k x_0.
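As a concrete illustration, the rescaled iteration might be coded as follows (our own sketch; the article itself ships Maple and Mathematica files). Applied to the matrix M of Example 1 it reproduces the scores computed there.

```python
import numpy as np

def power_method(M, x0, tol=1e-10, max_iter=1000):
    """Iterate x <- Mx / ||Mx||_1 until successive iterates agree to tol."""
    x = x0 / np.abs(x0).sum()
    for _ in range(max_iter):
        x_new = M @ x
        x_new /= np.abs(x_new).sum()     # rescale in the 1-norm
        if np.abs(x_new - x).sum() < tol:
            break
        x = x_new
    return x_new

# M from Example 1: the four-page web of Figure 2.1 with m = 0.15.
A = np.array([[0,   0,   1, 1/2],
              [1/3, 0,   0, 0  ],
              [1/3, 1/2, 0, 1/2],
              [1/3, 1/2, 0, 0  ]])
M = 0.85 * A + 0.15 / 4                  # (1 - m)A + mS with all S_ij = 1/4
print(np.round(power_method(M, np.ones(4)), 3))  # [0.368 0.142 0.288 0.202]
```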
To use the power method on the matrices M that arise from the web ranking problem we would generally need to know that any other eigenvalues λ of M satisfy |λ| < 1. This assures that the power method will converge to the eigenvector we want. Actually, the following proposition provides what we need, with no reference to any other eigenvalues of M!

Definition 4.1. The 1-norm of a vector v is ‖v‖_1 = Σ_i |v_i|.
Proposition 4. Let M be a positive column-stochastic n × n matrix and let V denote the subspace of ℝ^n consisting of vectors v such that Σ_j v_j = 0. Then Mv ∈ V for any v ∈ V, and

\[
\|Mv\|_1 \le c\|v\|_1
\]

for any v ∈ V, where c = max_{1≤j≤n} |1 − 2 min_{1≤i≤n} M_ij| < 1.

Proof. To see that Mv ∈ V is straightforward: Let w = Mv, so that w_i = Σ_{j=1}^n M_ij v_j and

\[
\sum_{i=1}^{n} w_i = \sum_{i=1}^{n}\sum_{j=1}^{n} M_{ij}v_j = \sum_{j=1}^{n} v_j\left(\sum_{i=1}^{n} M_{ij}\right) = \sum_{j=1}^{n} v_j = 0.
\]

Hence w = Mv ∈ V. To prove the bound in the proposition note that

\[
\|w\|_1 = \sum_{i=1}^{n} e_i w_i = \sum_{i=1}^{n} e_i\left(\sum_{j=1}^{n} M_{ij}v_j\right),
\]

where e_i = sgn(w_i). Note that the e_i are not all of one sign, since Σ_i w_i = 0 (unless w ≡ 0, in which case the bound clearly holds). Reverse the double sum to obtain

\[
\|w\|_1 = \sum_{j=1}^{n} v_j\left(\sum_{i=1}^{n} e_i M_{ij}\right) = \sum_{j=1}^{n} a_j v_j, \tag{4.1}
\]

where a_j = Σ_{i=1}^n e_i M_ij. Since the e_i are of mixed sign and Σ_i M_ij = 1 with 0 < M_ij < 1, it is easy to see that

\[
-1 < -1 + 2\min_{1\le i\le n} M_{ij} \le a_j \le 1 - 2\min_{1\le i\le n} M_{ij} < 1.
\]

We can thus bound

\[
|a_j| \le \Big|1 - 2\min_{1\le i\le n} M_{ij}\Big| < 1.
\]

Let c = max_{1≤j≤n} |1 − 2 min_{1≤i≤n} M_ij|. Observe that c < 1 and |a_j| ≤ c for all j. From equation (4.1) we have

\[
\|w\|_1 = \Big|\sum_{j=1}^{n} a_j v_j\Big| \le \sum_{j=1}^{n} |a_j||v_j| \le c\sum_{j=1}^{n} |v_j| = c\|v\|_1,
\]

which proves the proposition. □
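The constant c of Proposition 4 is easy to evaluate for a specific matrix, and the contraction property can be spot-checked on a random vector with zero component sum; this verification sketch is ours:

```python
import numpy as np

# M of equation (3.3): the five-page web of Figure 2.2 with m = 0.15.
A = np.array([[0, 1, 0, 0, 0  ],
              [1, 0, 0, 0, 0  ],
              [0, 0, 0, 1, 1/2],
              [0, 0, 1, 0, 1/2],
              [0, 0, 0, 0, 0  ]])
M = 0.85 * A + 0.15 / 5

# c = max_j |1 - 2 min_i M_ij| from Proposition 4.
c = np.max(np.abs(1 - 2 * M.min(axis=0)))
print(round(c, 2))  # 0.94

# Spot-check ||Mv||_1 <= c ||v||_1 for a random v with sum zero.
rng = np.random.default_rng(1)
v = rng.standard_normal(5)
v -= v.mean()                      # now v lies in the subspace V
print(np.abs(M @ v).sum() <= c * np.abs(v).sum())  # True
```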
Proposition 4 sets the stage for the following proposition.

Proposition 5. Every positive column-stochastic matrix M has a unique vector q with positive components such that Mq = q with ‖q‖_1 = 1. The vector q can be computed as q = lim_{k→∞} M^k x_0 for any initial guess x_0 with positive components such that ‖x_0‖_1 = 1.

Proof. From Proposition 1 the matrix M has 1 as an eigenvalue and by Lemma 3.2 the subspace V_1(M) is one-dimensional. Also, all non-zero vectors in V_1(M) have entirely positive or negative components. It is clear that there is a unique vector q ∈ V_1(M) with positive components such that Σ_i q_i = 1.

Let x_0 be any vector in ℝ^n with positive components such that ‖x_0‖_1 = 1. We can write x_0 = q + v where v ∈ V (V as in Proposition 4). We find that M^k x_0 = M^k q + M^k v = q + M^k v. As a result

\[
M^k x_0 - q = M^k v. \tag{4.2}
\]

A straightforward induction and Proposition 4 shows that ‖M^k v‖_1 ≤ c^k ‖v‖_1 for 0 ≤ c < 1 (c as in Proposition 4) and so lim_{k→∞} ‖M^k v‖_1 = 0. From equation (4.2) we conclude that lim_{k→∞} M^k x_0 = q. □
Example: Let M be the matrix defined by equation (3.3) for the web of Figure 2.2. We take x_0 = [0.24, 0.31, 0.08, 0.18, 0.19]^T as an initial guess; recall that we had q = [0.2, 0.2, 0.285, 0.285, 0.03]^T. The table below shows the value of ‖M^k x_0 − q‖_1 for several values of k, as well as the ratio ‖M^k x_0 − q‖_1/‖M^{k−1} x_0 − q‖_1. Compare this ratio to c from Proposition 4, which in this case is 0.94.

  k     ‖M^k x_0 − q‖_1     ‖M^k x_0 − q‖_1 / ‖M^{k−1} x_0 − q‖_1
  0     0.62
  1     0.255               0.411
  5     0.133               0.85
 10     0.0591              0.85
 50     8.87 × 10⁻⁵         0.85
It is clear that the bound ‖M^k x_0 − q‖_1 ≤ c^k ‖x_0 − q‖_1 is rather pessimistic (note 0.85 is the value 1 − m, and 0.85 turns out to be the second largest eigenvalue for M). One can show that in general the power method will converge asymptotically according to ‖Mx_k − q‖_1 ≈ |λ_2| ‖x_k − q‖_1, where λ_2 is the second largest eigenvalue of M. Moreover, for M of the form M = (1 − m)A + mS with A column-stochastic and all S_ij = 1/n it can be shown that |λ_2| ≤ 1 − m (see, e.g., [1], Theorem 5.10). As a result, the power method will converge much more rapidly than indicated by c^k ‖x_0 − q‖_1. Nonetheless, the value of c in Proposition 4 provides a very simple bound on the convergence of the power method here. It is easy to see that since all entries of M are at least m/n, we will always have c ≤ 1 − 2m/n in Proposition 4.
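The statement that 0.85 is the second largest eigenvalue can be checked numerically for the matrix M of equation (3.3); the following snippet (ours) sorts the eigenvalue magnitudes:

```python
import numpy as np

# M of equation (3.3) again: the five-page web of Figure 2.2, m = 0.15.
A = np.array([[0, 1, 0, 0, 0  ],
              [1, 0, 0, 0, 0  ],
              [0, 0, 0, 1, 1/2],
              [0, 0, 1, 0, 1/2],
              [0, 0, 0, 0, 0  ]])
M = 0.85 * A + 0.15 / 5

mags = np.sort(np.abs(np.linalg.eigvals(M)))[::-1]
print(np.round(mags, 3))  # [1.   0.85 0.85 0.85 0.  ]: |lambda_2| = 1 - m
```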
As a practical matter, note that the n × n positive matrix M has no zero elements, so the multiplication Mv for v ∈ ℝ^n will typically take O(n²) multiplications and additions, a formidable computation if n = 8,000,000,000. But equation (3.2) shows that if x is positive with ‖x‖_1 = 1 then the multiplication Mx is equivalent to (1 − m)Ax + ms. This is a far more efficient computation, since A can be expected to contain mostly zeros (most web pages link to only a few other pages). We've now proved our main theorem:

Theorem 4.2. The matrix M defined by (3.1) for a web with no dangling nodes will always be a positive column-stochastic matrix and so have a unique q with positive components such that Mq = q and Σ_i q_i = 1. The vector q may be computed as the limit of iterations x_k = (1 − m)Ax_{k−1} + ms, where x_0 is any initial vector with positive components and ‖x_0‖_1 = 1.
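The iteration of Theorem 4.2 never needs the dense matrix M. Here is a sketch of the efficient form using a sparse representation of A; the function name, tolerance, and the use of scipy's sparse matrices are our choices, not prescriptions from the article:

```python
import numpy as np
from scipy import sparse

def pagerank(A, m=0.15, tol=1e-12, max_iter=1000):
    """Iterate x_k = (1 - m) A x_{k-1} + m s as in Theorem 4.2, where A is
    a (sparse) column-stochastic link matrix and s has all entries 1/n.
    The dense matrix M is never formed, so each step costs O(#links)."""
    n = A.shape[0]
    s = np.full(n, 1.0 / n)
    x = s.copy()                          # a positive x0 with ||x0||_1 = 1
    for _ in range(max_iter):
        x_new = (1 - m) * (A @ x) + m * s
        if np.abs(x_new - x).sum() < tol:
            break
        x = x_new
    return x_new

# The four-page web of Figure 2.1, stored sparsely.
A = sparse.csc_matrix(np.array([[0,   0,   1, 1/2],
                                [1/3, 0,   0, 0  ],
                                [1/3, 1/2, 0, 1/2],
                                [1/3, 1/2, 0, 0  ]]))
print(np.round(pagerank(A), 3))  # approx. [0.368 0.142 0.288 0.202]
```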
The eigenvector x defined by equation (3.2) also has a probabilistic interpretation. Consider a web-surfer on a web of n pages with no dangling nodes. The surfer begins at some web page (it doesn't matter where) and randomly moves from web page to web page according to the following procedure: If the surfer is currently at a page with r outgoing links, he either randomly chooses any one of these links with uniform probability (1 − m)/r, OR he jumps to any randomly selected page on the web, each with probability m/n (note that r(1 − m)/r + n(m/n) = 1, so this accounts for everything he can do). The surfer repeats this page-hopping procedure ad infinitum. The component x_j of the normalized vector x in equation (3.2) is the fraction of time that the surfer spends, in the long run, on page j of the web. More important pages tend to be linked to by many other pages and so the surfer hits those most often.
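One can check the surfer interpretation empirically with a short Monte Carlo simulation; this illustrative sketch, including the function name simulate_surfer, is ours:

```python
import random

def simulate_surfer(links, n, m=0.15, steps=1_000_000):
    """Simulate the random surfer: with probability 1 - m follow a
    uniformly chosen outgoing link, otherwise jump to a uniformly
    chosen page. Returns the fraction of time spent on each page."""
    visits = [0] * (n + 1)               # 1-based page indices
    page = 1                             # starting page doesn't matter
    for _ in range(steps):
        if random.random() < 1 - m:
            page = random.choice(links[page])
        else:
            page = random.randint(1, n)
        visits[page] += 1
    return [v / steps for v in visits[1:]]

# Four-page web of Figure 2.1; fractions approach (0.368, 0.142, 0.288, 0.202).
links = {1: [2, 3, 4], 2: [3, 4], 3: [1], 4: [1, 3]}
print([round(f, 3) for f in simulate_surfer(links, 4)])
```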
Exercise 14. For the web in Exercise 11, compute the values of ‖M^k x_0 − q‖_1 and ‖M^k x_0 − q‖_1/‖M^{k−1} x_0 − q‖_1 for k = 1, 5, 10, 50, using an initial guess x_0 not too close to the actual eigenvector q (so that you can watch the convergence). Determine c = max_{1≤j≤n} |1 − 2 min_{1≤i≤n} M_ij| and the absolute value of the second largest eigenvalue of M.

Exercise 15. To see why the second largest eigenvalue plays a role in bounding ‖M^k x_0 − q‖_1/‖M^{k−1} x_0 − q‖_1, consider an n × n positive column-stochastic matrix M that is diagonalizable. Let x_0 be any vector with non-negative components that sum to one. Since M is diagonalizable, we can create a basis of eigenvectors {q, v_1, . . . , v_{n−1}}, where q is the steady state vector, and then write x_0 = aq + Σ_{k=1}^{n−1} b_k v_k. Determine M^k x_0, and then show that a = 1 and the sum of the components of each v_k must equal 0. Next apply Proposition 4 to prove that, except for the non-repeated eigenvalue λ = 1, the other eigenvalues are all strictly less than one in absolute value. Use this to evaluate lim_{k→∞} ‖M^k x_0 − q‖_1/‖M^{k−1} x_0 − q‖_1.
Exercise 16. Consider the link matrix

\[
A = \begin{bmatrix}
0 & \frac{1}{2} & \frac{1}{2} \\
0 & 0 & \frac{1}{2} \\
1 & \frac{1}{2} & 0
\end{bmatrix}.
\]

Show that M = (1 − m)A + mS (all S_ij = 1/3) is not diagonalizable for 0 ≤ m < 1.

Exercise 17. How should the value of m be chosen? How does this choice affect the rankings and the computation time?
REFERENCES

[1] A. Berman and R. Plemmons, Nonnegative Matrices in the Mathematical Sciences, Academic Press, New York, 1979.
[2] M. W. Berry and M. Browne, Understanding Search Engines: Mathematical Modeling and Text Retrieval, Second Edition, SIAM, Philadelphia, 2005.
[3] M. Bianchini, M. Gori, and F. Scarselli, Inside PageRank, ACM Trans. Internet Tech., 5 (2005), pp. 92-128.
[4] S. Brin and L. Page, The anatomy of a large-scale hypertextual web search engine, http://www-db.stanford.edu/~backrub/google.html (accessed August 1, 2005).
[5] A. Farahat, T. Lofaro, J. C. Miller, G. Rae, and L. A. Ward, Authority Rankings from HITS, PageRank, and SALSA: Existence, Uniqueness, and Effect of Initialization, SIAM J. Sci. Comput., 27 (2006), pp. 1181-1201.
[6] A. Hill, Google Inside Out, Maximum PC, April 2004, pp. 44-48.
[7] S. Kamvar, T. Haveliwala, and G. Golub, Adaptive methods for the computation of PageRank, Linear Algebra Appl., 386 (2004), pp. 51-65.
[8] A. N. Langville and C. D. Meyer, A survey of eigenvector methods of web information retrieval, SIAM Review, 47 (2005), pp. 135-161.
[9] A. N. Langville and C. D. Meyer, Deeper inside PageRank, Internet Math., 1 (2005), pp. 335-380.
[10] C. D. Meyer, Matrix Analysis and Applied Linear Algebra, SIAM, Philadelphia, 2000.
[11] C. Moler, The world's largest matrix computation, http://www.mathworks.com/company/newsletters/news_notes/clevescorner/oct02_cleve.html (accessed August 1, 2005).
[12] J. Mostafa, Seeking better web searches, Sci. Amer., 292 (2005), pp. 66-73.
[13] S. Robinson, The ongoing search for efficient web search algorithms, SIAM News, 37 (Nov. 2004).
[14] I. Rogers, The Google PageRank algorithm and how it works, http://www.iprcom.com/papers/pagerank/ (accessed August 1, 2005).
[15] W. J. Stewart, An Introduction to the Numerical Solution of Markov Chains, Princeton University Press, Princeton, 1994.
