Google's PageRank Algorithm and the Rasch Measurement Model

"PageRank" is a widely-acclaimed algorithm used for determining the ranking or ordering of web pages by web search engines (Langville & Meyer, 2006; Chung, 2008). It was originally developed at Stanford University by Larry Page and Sergey Brin, who later started Google, Inc (Page & Brin, 1998). The algorithm has become a highly celebrated mathematical tool for analyzing networks of all kinds, including biological and social networks (Chung, 2008). The algorithm was the topic of a featured talk at the 2008 Joint Mathematics Meetings (joint meeting of the Mathematical Association of America and The American Mathematical Society) in San Diego.

The PageRank algorithm has much in common with one particular algorithm for estimating the item parameters of the Rasch model - the eigenvector (EVM) algorithm described by Garner and Engelhard (2002, etc.). Both methods depend on pairwise comparisons. Both methods also use the eigenvector of a matrix derived from pairwise comparisons.

An Illustrative Data Set

 Persons
Items12345678910
11110111111
20101111011
31001101110
41001011010

We can represent this data matrix graphically as a directed multi-graph, shown here, with nodes representing items. An arrow (a directed edge) from node j to node i indicates that someone got item i correct and item j wrong. So, the four arrows between items 1 and 2 indicate that 3 people succeeded on item 1, but failed on item 2, and one person succeeded on item 2 but failed on item 1.

The Eigenvector Method (EVM)

The EVM begins with creation of a paired comparison matrix similar to that now implemented in RUMM2020. A paired comparison matrix is constructed. Each entry, bij, representing the number of people who got item i right and item j wrong divided by the number of people who got item i wrong and item j right. This is an estimate of the difference in difficulties between items i and j on an odds-scale. bii are set to 1.

We can determine the Rasch item difficulties from this matrix by computing the eigenvector associated with the maximum eigenvalue, exponentiating the eigenvector, and subtracting out their mean, so that their sum is zero. The Rasch item difficulties are: -1.018, -.103, .268, .853.

The PageRank Algorithm

The PageRank algorithm is designed to assign weights to web pages so that the pages may be ranked in order of popularity. The web network we'll represent is the one depicted in Figure 1. The web pages A1, A2, A3, and A4 are represented by the four circles and an arrow would represent the fact that one web page links to the other web page. For example, web page A2 has four links to web page A1.

Each web site has a "PageRank" value associated with it. Let a1, a2, a3, a4 be those values. The PageRank value is conceptualized as the sum:

Nj is the total number of links out of Aj to any other web pages (6 for A2) and Mij is the number of links out of Aj to Ai (3 from A2 to A1) Thus each page in the system contributes a portion of their value to the page they reference. Then, for our illustrative dataset,

A convenient way to solve these simultaneous equations is by matrix algebra and eigenvalues, adding a constraint to make the equations identifiable and the unknowns non-zero. A solution is a1 = -.65, a2=-.49, a3=-.45, and a4=-.36. Thus, the easiest-to-link web page is A1, followed by A2, then A3, then A4.

Observations and Possibilities

There is much excitement over PageRank. Chung points out that PageRank is a "well defined operator on any given graph" (Chung, 2008), and describes the relationship between PageRank and random walks, spectral graph theory, spectral geometry, combinatorics, probability, and linear algebra. Langville and Meyer (2006) state that "models exploiting the Web's hyperlink structure are called link analysis models. The impact that these link analysis models have had is truly awesome."

The plot shows the relationship between the Rasch item difficulties and the PageRank weights for our illustrative dataset. This suggests that the relationship is ogival, but close to linear for practical purposes. Notice that the coefficients of the PageRank equations are in the range 0-1, representing empirical probabilities. Rasch theory suggests that expressing those coefficients as log-odds, loge(coefficient/(1-coefficent)), would be an immediate improvement to the PageRank equations. Here is a fruitful area of research for a graduate student looking for a dissertation topic.

Tools being developed for PageRank analysis and validation may well prove productive for Rasch measurement. These include techniques for accommodating missing and extreme comparisons, and also for identifying "cut vertices", web pages whose omission causes disconnected subsets of web pages in the analysis. These would correspond to elements (items, persons, raters) on which the linkage in judging plans, adaptive-testing or equating analysis depends.

Mary Garner, Kennesaw State University, Georgia

Chung, Fan. (2008). The Mathematics of PageRank. Presentation at the Joint Mathematics Meetings. San Diego, CA. www.math.ucsd.edu/~fan/

Garner, M., & Engelhard, G. (2002). An Eigenvector Method For Estimating Item Parameters Of The Dichotomous And Polytomous Rasch models. Journal of Applied Measurement, 3(2), 107-128.

Langville, A.N., & Meyer, C.D. (2006). Google's PageRank and Beyond: The Science of Search Engine Rankings. Princeton University Press.

Page, L., & Brin, S. (1998). The Anatomy of a Large Scale Hypertextual Search Engine. Proceedings of the Seventh International Web Conference. infolab.stanford.edu/pub/papers/google.pdf

Rasch, G. (1977). On specific objectivity: An attempt at formalizing the request for generality and validity of scientific statements. Danish Yearbook of Philosophy, 14, 58-94. www.rasch.org/memo18.htm



Gerner M. (2009) Google's PageRank Algorithm and the Rasch Measurement Model, Rasch Measurement Transactions, 2009, 23:2, 1201-2



Rasch Publications
Rasch Measurement Transactions (free, online) Rasch Measurement research papers (free, online) Probabilistic Models for Some Intelligence and Attainment Tests, Georg Rasch Applying the Rasch Model 3rd. Ed., Bond & Fox Best Test Design, Wright & Stone
Rating Scale Analysis, Wright & Masters Introduction to Rasch Measurement, E. Smith & R. Smith Introduction to Many-Facet Rasch Measurement, Thomas Eckes Invariant Measurement: Using Rasch Models in the Social, Behavioral, and Health Sciences, George Engelhard, Jr. Statistical Analyses for Language Testers, Rita Green
Rasch Models: Foundations, Recent Developments, and Applications, Fischer & Molenaar Journal of Applied Measurement Rasch models for measurement, David Andrich Constructing Measures, Mark Wilson Rasch Analysis in the Human Sciences, Boone, Stave, Yale
in Spanish: Análisis de Rasch para todos, Agustín Tristán Mediciones, Posicionamientos y Diagnósticos Competitivos, Juan Ramón Oreja Rodríguez

To be emailed about new material on www.rasch.org
please enter your email address here:

I want to Subscribe: & click below
I want to Unsubscribe: & click below

Please set your SPAM filter to accept emails from Rasch.org

www.rasch.org welcomes your comments:

Your email address (if you want us to reply):

 

ForumRasch Measurement Forum to discuss any Rasch-related topic

Go to Top of Page
Go to index of all Rasch Measurement Transactions
AERA members: Join the Rasch Measurement SIG and receive the printed version of RMT
Some back issues of RMT are available as bound volumes
Subscribe to Journal of Applied Measurement

Go to Institute for Objective Measurement Home Page. The Rasch Measurement SIG (AERA) thanks the Institute for Objective Measurement for inviting the publication of Rasch Measurement Transactions on the Institute's website, www.rasch.org.

Coming Rasch-related Events
May 17 - June 21, 2024, Fri.-Fri. On-line workshop: Rasch Measurement - Core Topics (E. Smith, Winsteps), www.statistics.com
June 12 - 14, 2024, Wed.-Fri. 1st Scandinavian Applied Measurement Conference, Kristianstad University, Kristianstad, Sweden http://www.hkr.se/samc2024
June 21 - July 19, 2024, Fri.-Fri. On-line workshop: Rasch Measurement - Further Topics (E. Smith, Winsteps), www.statistics.com
Aug. 5 - Aug. 6, 2024, Fri.-Fri. 2024 Inaugural Conference of the Society for the Study of Measurement (Berkeley, CA), Call for Proposals
Aug. 9 - Sept. 6, 2024, Fri.-Fri. On-line workshop: Many-Facet Rasch Measurement (E. Smith, Facets), www.statistics.com
Oct. 4 - Nov. 8, 2024, Fri.-Fri. On-line workshop: Rasch Measurement - Core Topics (E. Smith, Winsteps), www.statistics.com
Jan. 17 - Feb. 21, 2025, Fri.-Fri. On-line workshop: Rasch Measurement - Core Topics (E. Smith, Winsteps), www.statistics.com
May 16 - June 20, 2025, Fri.-Fri. On-line workshop: Rasch Measurement - Core Topics (E. Smith, Winsteps), www.statistics.com
June 20 - July 18, 2025, Fri.-Fri. On-line workshop: Rasch Measurement - Further Topics (E. Smith, Facets), www.statistics.com
Oct. 3 - Nov. 7, 2025, Fri.-Fri. On-line workshop: Rasch Measurement - Core Topics (E. Smith, Winsteps), www.statistics.com

 

The URL of this page is www.rasch.org/rmt/rmt232a.htm

Website: www.rasch.org/rmt/contents.htm