PageRank and the Birth of Google, 1996-1998


Invented in 1996, the PageRank citation algorithm was the basis of the search engine that launched Google’s founding in 1998. PageRank interpreted hyperlinks as referrals, posited that a high-quality page should have high-quality pages providing referrals, and recursively produced useful ranking scores for all indexed pages. This recursive quality evaluation technique became widely adopted by other search engines, as well as social networks, peer-to-peer systems, and numerous other services.

How it all started

Larry Page enrolled in Stanford's computer science PhD program in 1995, where he met fellow CS student Sergey Brin. They both worked on digital book projects and envisioned the day when Internet users would be able to search content in books. Page developed an early search engine called BackRub as a research project, and he stayed in touch with Brin as he began studying how pages on the World Wide Web link to one another. Together they developed the "PageRank" algorithm for BackRub, which estimates the importance of a web page from the links pointing to it.

Page and Brin founded Google together on September 4, 1998, in Menlo Park, CA. They renamed their search engine Google, a play on "googol," the term for the number 1 followed by 100 zeros. The name reflects their mission to "organize the world's information and make it universally accessible and useful."

The 'logo' used for the BackRub search tool at Stanford

The PageRank Algorithm

The PageRank algorithm greatly improved the state of the art in the ranking of web search results. Prior to PageRank, the main signal used for web search ranking was the lexical similarity between the words contained in a query and the words contained in a web page. Using lexical similarity as the only ranking signal is appropriate for traditional textual documents (e.g., books), but hypertext documents such as web pages also contain hyperlinks that connect them to one another. By linking to another web page, the author of the linking page implicitly signals that the linked page is notable.

The original logo when Google was launched in 1998

PageRank interprets a hyperlink from one web page to another as a positive signal. It assigns higher scores to pages that are linked to by many other pages, and it gives more weight to links from referrers that are themselves highly scored. PageRank can be viewed as a query-independent measure of web page quality. Combining query-independent PageRank scores with lexical similarity scores between queries and web pages results in a ranking score that is superior to purely lexical scores.
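
The recursive scoring described above can be illustrated with a minimal sketch in Python. It is not Google's implementation: the function name pagerank, the toy link graph, the iteration count, and the handling of pages without outgoing links are all assumptions made for illustration. The damping value of 0.85 is a commonly cited choice and is likewise an assumption here, since this article does not state one.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Illustrative PageRank by repeated score propagation.

    links: dict mapping each page to the list of pages it links to.
    damping and iterations are assumed values, not taken from the article.
    """
    # Collect every page that appears as a source or as a link target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    # Start with the same score for every page.
    scores = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Each page receives a small baseline score regardless of links.
        new_scores = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its current score evenly among its links,
                # so a link from a highly scored page is worth more.
                share = damping * scores[page] / len(targets)
                for target in targets:
                    new_scores[target] += share
            else:
                # Assumption: a page with no outgoing links spreads its
                # score evenly over all pages so that no score is lost.
                share = damping * scores[page] / n
                for target in pages:
                    new_scores[target] += share
        scores = new_scores
    return scores


# Hypothetical four-page web: C is linked to by A, B, and D.
toy_web = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
scores = pagerank(toy_web)
for page in sorted(scores, key=scores.get, reverse=True):
    print(page, round(scores[page], 4))
```

In this toy graph, page C ends up with the highest score because three pages link to it, and page A ranks second because its only referrer, C, is itself highly scored. Page D, which nothing links to, keeps only the baseline score.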

PageRank is widely credited as the feature that differentiated Google from pre-existing search engines. Today, PageRank is used in a host of applications, ranging from search engines to social networks to peer-to-peer systems.

Larry Page and Sergey Brin in their Menlo Park garage office