Taher Haveliwala - Bio

Early Years

I've been interested in the field of computers from an early age. My first computer was the Apple ][+. I think it's still lying around in my garage somewhere, kept as a memento of my childhood. I remember spending many hours as a child using the computer for both entertainment and education. I recall occasionally being so engrossed in some endeavor (playing a game or writing a computer program) that it would be difficult for my mother to pull me away to eat my dinner.


I took computer classes in my elementary school and high school; although those classes were interesting, it was not until I started my Bachelor's degree at U.C. Berkeley that I fully appreciated the scope and depth of the field of Computer Science. Upon graduating, I decided to continue my studies by obtaining my Master's degree and Ph.D. from Stanford University.


My research at Stanford mainly involved developing algorithms for Web search. The basic technology behind search engines has existed for many years (since the 1960s). The advent of the World Wide Web greatly changed the scale of the data, requiring much more sophisticated technology, but many of the key building blocks existed previously. A search engine allows a user to type in a query, which is a set of terms describing what the user is looking for. The search engine must then return a list of relevant web pages.

People often ask, "How is it possible for a search engine to search the entire web for my query in a fraction of a second?" The heart of a search engine is the index; that is what allows a search engine to answer your queries so quickly. There is a very simple analogy that everyone is familiar with. If I give you a book about India and ask you to find the passage describing Mumbai, would you flip through the book, page by page, looking for a paragraph discussing Mumbai? No, you would immediately turn to the end of the book, where there is an index listing exactly the pages on which the word Mumbai appears. You then only need to concern yourself with those pages when looking for information about Mumbai. A search engine index works in much the same way; before you have ever issued a query, the search engine crawls the Web and builds an index that lists, for each word, all of the web pages that contain that word. Of course, a book has only a few hundred pages, whereas the Web has billions of pages, making the problem much more complex in practice. But at a very high level, the analogy with finding information in a book holds.

One of the critical challenges search engines face is that the user wants to see only a few (say ten) results; figuring out which ten results to display for the query Mumbai, out of the millions of pages that discuss Mumbai, is a very difficult problem, and is the target of substantial research and development.
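For readers who like to see things concretely, the book-index analogy can be sketched in a few lines of Python. This is only a toy illustration, not how any real search engine is built: the three sample pages, their URLs, and the function names build_index and search are all invented for the example, and it ignores both the scale of the Web and the problem of ranking results.

```python
from collections import defaultdict

# Hypothetical miniature "web": URL -> page text (invented for illustration).
PAGES = {
    "example.com/india": "India is a large country and Mumbai is its largest city",
    "example.com/mumbai": "Mumbai is a port city on the west coast of India",
    "example.com/paris": "Paris is the capital of France",
}


def build_index(pages):
    """Map each lowercase word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def search(index, query):
    """Return the sorted URLs of pages that contain every term in the query."""
    terms = query.lower().split()
    if not terms:
        return []
    # Look up each term's page set and keep only pages common to all terms.
    matches = set.intersection(*(index.get(term, set()) for term in terms))
    return sorted(matches)


# The index is built once, before any query is issued.
index = build_index(PAGES)

print(search(index, "Mumbai"))        # pages mentioning Mumbai
print(search(index, "Mumbai coast"))  # pages mentioning both terms
```

Answering a query here touches only the index entries for the query's terms, never the full text of every page, which is the same shortcut you take when you turn to the back of the book. Deciding which of the matching pages to show first is a separate problem that this sketch does not attempt.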


Taher Haveliwala