Network: Bowling a Google at the Net

A search engine to cut the cackle in cyberspace? At last, it may be with us.

Stephen Pritchard
Monday 06 December 1999 00:02 GMT

A "googol", sometimes written "google", is not baby talk but a mathematical term. It means 10 to the power of 100, a number so vast it is hard for the human brain to comprehend. As such, it is a handy metaphor for the sheer size of the Internet.

Google is also the name of the latest, and some say best, Internet search engine, which works on mathematical principles. Conventional search engines such as AltaVista and Excite, by contrast, rely on text-based searches using keywords.

Google is the product of research by two doctoral students at Stanford University, Sergey Brin and Larry Page. Brin was researching data mining and Page was working on the structure of the Internet and the importance of links. "We met early on and realised there was no better data to test our work on than the human knowledge that makes up the Web," Brin recalls.

Word about Google spread rapidly around the Stanford campus, and soon Brin and Page were persuaded to give up their PhDs and join the growing band of Silicon Valley start-ups. A year later the company is a fully fledged search site with financial backing of more than $25m and more than 60 people on the payroll. Brin was in the UK last week to launch Google's first tailor-made search service outside the US; it is now the default search engine on Virgin Net.

The first thing that strikes visitors to Google's native home page is its simplicity; all it offers is a field to enter a search term, and a couple of buttons. Google provides no news, weather, stock quotes, horoscopes or free e-mail.

This, according to Brin, is deliberate. "Google's focus is to help people find information and to navigate it," he says, pointing out that search is the second most used function of the Internet, after e-mail. "Users should quickly be able to find information... It should be a site they are in the habit of using."

Google is about to make one concession to clutter: the company will soon start accepting advertising banners on its main site. The commercial challenge for Google, and indeed for any other portal or search site, is to turn visits into dollars. Google makes some of its money by teaming up with other portal companies, such as Netscape and Virgin Net; it is also developing specialist search engines for clients, including Red Hat, the Linux distributor.

One form of commerce Google will not contemplate, Brin maintains, is accepting payment to put a particular site at the top of its search results. "There is a lot of value in providing a search site that is objective," he feels, and objectivity comes from the methods Google uses to match searches to sites. Early Internet directories such as Yahoo! relied on human editors to produce lists of interesting websites. In the early days of the Web, Brin recalls using the What's New function on the Mosaic browser to keep abreast of new pages. That task is now simply beyond the human brain.

Instead, Google relies on a network of 2,000 computers to index the Internet. The index occupies 80 terabytes of hard disk space. "Every web page influences the search," says Brin. "As the size of the index grows, Google provides better results."

Google's technique involves query-independent and query-dependent factors. The query-independent side ranks each website according to its significance - the main determinant being the number of other Web pages that link to it.
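
To make the query-independent idea concrete, here is a minimal sketch in Python that ranks sites purely by how many other pages link to them. It is an illustration of plain link counting only, not Google's actual algorithm, which the company describes only in outline; the function name, the example site names and the rule that duplicate links and self-links are ignored are all assumptions made for this example.

```python
from collections import Counter


def rank_by_inbound_links(links):
    """Rank pages by the number of distinct other pages linking to them.

    `links` is an iterable of (source_page, target_page) pairs.
    Hypothetical helper for illustration only.
    """
    inbound = Counter()
    # set() drops repeated links, so each source counts once per target.
    for source, target in set(links):
        if source != target:  # assumption: a page cannot vote for itself
            inbound[target] += 1
    # Highest link count first; ties are broken alphabetically by page name.
    return sorted(inbound.items(), key=lambda item: (-item[1], item[0]))


# Made-up link data: two sites link to c.example, one links to a.example.
links = [
    ("a.example", "c.example"),
    ("b.example", "c.example"),
    ("c.example", "a.example"),
    ("a.example", "c.example"),  # duplicate link, counted once
]
ranking = rank_by_inbound_links(links)
```

In this toy data, c.example comes out on top because two different sites point to it. Nothing about the user's query enters the calculation, which is what makes this side of the ranking query-independent.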

The query-dependent part of the process reaches into each Web page and looks at its hypertext contents, including headings and pictures. Google believes this approach also protects the engine against website developers who build pages in ways designed to inflate their sites' rankings in searches.
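
The query-dependent side can be sketched in the same spirit. The Python example below scores a single page against a search by counting matching words, giving a match in a heading more weight than a match in ordinary text. The weights, the function names and the simple word matching are assumptions chosen for illustration; the article does not say how Google weighs headings, pictures or other page elements.

```python
import re

# Assumed weights: a heading match counts three times a body-text match.
HEADING_WEIGHT = 3.0
BODY_WEIGHT = 1.0


def tokenize(text):
    """Split text into lowercase words made of letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())


def query_dependent_score(query, headings, body):
    """Score one page for one query (hypothetical scoring rule)."""
    terms = set(tokenize(query))
    heading_hits = sum(1 for word in tokenize(" ".join(headings)) if word in terms)
    body_hits = sum(1 for word in tokenize(body) if word in terms)
    return HEADING_WEIGHT * heading_hits + BODY_WEIGHT * body_hits


# Made-up page: one query word in the heading, two in the body text.
score = query_dependent_score(
    "linux distribution",
    headings=["Linux news"],
    body="A linux distribution for servers",
)
```

A real engine would presumably combine a score of this kind with the query-independent measure of a site's significance; how the two are combined is not described here.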

The danger is that, by concentrating on significant sites, the search engine will neglect smaller but no less interesting pages. But Brin maintains that Google's methods improve rather than hinder the chances of smaller sites featuring.

"If you are specific in your search, they should pop up," he says. "Throwing up random Web pages is very different from finding small sites."

Brin and Page are still in their late twenties, but Brin describes the move from graduate work to managing a million-dollar company as liberating. "I don't think the nature of research and starting an Internet company are that different. In both cases it is about trying to bring something new to the world." The biggest difference, Brin says, is that universities' resources are limited. As a commercial company, Google has access to resources that researchers can only dream of, and he is already thinking about taking Google to the next stage.

Brin foresees a day when Internet users can ask questions in natural language and receive replies in natural language. There are already sites that attempt to understand questions in natural language, but they still return their results as lists of web pages. "Search is an artificial intelligence problem," he says. "It requires artificial intelligence to do a good job. But between there and where we are now, there are a lot of good technologies that we can build."

Google: www.google.com
