Google Press Day 2006 Live Blogging, Part Three

By Nathan Weinberg

Now up is Alan Eustace. Jason says his name is Alan “Next Slide” Eustace, since he apparently forgot his slide clicker.

Says Google has to be very fast at indexing the web, because at a slow speed, it would take 253 years to index 8 billion pages. Googlebot needs to be fast, but Google has to be aware that at full speed, it can take down whole websites.
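He didn't say what the "slow speed" was, but a quick back-of-the-envelope check suggests the 253-year figure works out to roughly one page per second. The sketch below is just that arithmetic; the function names are made up for illustration, and the one-page-per-second rate is inferred from the numbers, not something stated in the talk.

```python
# Back-of-the-envelope check of the "253 years for 8 billion pages" figure.
# Assumption: a year is 365.25 days; the crawl rate is inferred, not quoted.

SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60


def implied_pages_per_second(pages, years):
    """Crawl rate needed to index `pages` pages in `years` years."""
    return pages / (years * SECONDS_PER_YEAR)


def years_to_index(pages, pages_per_second):
    """Time in years to index `pages` pages at a steady crawl rate."""
    return pages / (pages_per_second * SECONDS_PER_YEAR)


rate = implied_pages_per_second(8_000_000_000, 253)
print(f"Implied crawl rate: {rate:.3f} pages per second")
print(f"Years at 1 page/sec: {years_to_index(8_000_000_000, 1):.1f}")
```

Going the other way, a crawler that fetched a thousand pages per second would get through the same 8 billion pages in about three months, which gives a sense of why speed matters so much here.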

Google works like a book’s index. It lists the top pages for each single-word query, just like an index, then, for a two-word query, looks where the two lists intersect, and comes up with that as the best result.

Example:

If heart is listed in an index as being on pages 4, 9, 12 and 15, and attack is on 5, 9, and 14, then Google knows 9 is the best page.
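The book-index analogy can be sketched in a few lines of Python. This is only a toy version of the idea as described in the talk, not Google's actual implementation; the `book_index` dictionary and the `intersect_postings` function are hypothetical names, and real ranking obviously involves far more than a set intersection.

```python
# Toy version of the book-index analogy: each word maps to the pages it
# appears on, and a multi-word query returns the pages common to every word.

def intersect_postings(index, query):
    """Return the sorted pages on which every word of the query appears."""
    words = query.lower().split()
    if not words:
        return []
    # Words missing from the index contribute an empty set, so the
    # intersection is empty for any query containing an unknown word.
    page_sets = [set(index.get(word, [])) for word in words]
    return sorted(set.intersection(*page_sets))


book_index = {
    "heart": [4, 9, 12, 15],
    "attack": [5, 9, 14],
}

print(intersect_postings(book_index, "heart attack"))  # [9]
```

A single-word query simply returns that word's own list, which matches the "just like an index" part of the description.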

Manually fixing results pages is impossible at their scale, so they never try it.

At any given time, 20-25% of Google’s queries are brand new.

Says that expectations of search results are what’s important, and that algorithms need to be based on expectations, not mathematical models.

Shows that some of the older spamming techniques don’t work on Google, but might work on some of their competitors. Link trading is slightly better.

Search is easy at first, but the details are difficult, and they are the big deal. Most changes Google makes help a higher percentage of queries than they hurt, but still decrease the value of a significant percentage of queries.

Opens the floor to questions. Asked about video search, says problem is less available text, requires user-submitted metadata and popularity for rankings. So why doesn’t Google Video have tagging and comments!?

Asked what Google is trying to do to fix products that buckle and crash under their own popularity, he says all products share infrastructure and file systems. Google does not guarantee quality of service for products other than search, and there is an expectation that they will crash. Jeez, great attitude!

Does Google plan on using demographic data to improve searches? He answers that, for now, it is better to start every query with a clean slate, but over time, it might prove useful to improve searches that way. In his example, knowing whether you are a fisherman or a guitarist will make a huge difference if you search for bass. Of course, clustering would make a difference, but he doesn’t mention that.

The wifi has been broken since the event started. They may have just fixed it.

Onto a new post…

Posted: May 10, 2006 by Nathan Weinberg
