Udi Manber: The First 10 Years on the Web

Udi Manber gave the first talk of this year’s Jon Postel Distinguished Lecture Series today at UCLA.

It seems fitting that I should have a link to Udi’s book on Amazon.com at the beginning of my review of his talk; he started working for Amazon just about a month ago.

While a handful of professors and grad students scrambled around trying to get the laptop to work correctly with the LCD projector, Udi spoke a bit about his personal history as the Web developed. He mentioned his contributions to the field, including suffix arrays (1989), agrep (1991), glimpse (1992), and even the web’s first screen scraper (1996).

What makes the web so fundamentally new and exciting

When Udi returned from a sabbatical in 1993, he was very excited about how the web was going to change everything. His colleagues cautioned him, “But there’s nothing new in the Web. We’ve done it all before. The web is just databases, networks and information retrieval all over again.” He acknowledged that his peers were correct in some respects, but scale is what makes the web fundamentally new: the sheer number of users, and the amount of content. He also compared the ubiquity of the web to the advent of television:

TV didn’t invent storytelling
TV didn’t invent motion pictures
TV didn’t invent actors
It wasn’t even in color
But it’s in everyone’s home!

Because everything on the web is traceable, Udi feels that the data available to websites lets companies create a fundamentally different experience:

More data == better experience. For example, an Amazon.com product detail page shows not only the price of a product, but also related items based on what customers bought, editorial and user-generated reviews, and sometimes even scanned excerpts from a book.
Instant data == instant QA. Companies get instant feedback from users, both in the form of emails and in what customers do and don’t click on or buy. Problems with the software are noticed quickly and solved faster, so the company loses less money.
Flexible data == better business decisions. By running controlled experiments (Amazon calls them A/B tests) the company can decide whether a new feature should be placed on the left or right side of the page, or whether the color should be blue or green. Almost every new feature is first tested by showing it to a random sampling of the user base to see how they react to it. It’s really easy to see after some small amount of testing if something is going to make more money or improve the user experience.
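To make the idea of a controlled experiment concrete, here is a minimal sketch of how a site might split users at random and compare the two groups. Udi didn’t describe how Amazon actually does this; the function names, the hashing scheme, and the 10,000-bucket split are all my own assumptions for illustration.

```python
import hashlib

def assign_variant(user_id, experiment, treatment_share=0.5):
    """Deterministically place a user in 'control' or 'treatment'.

    Hypothetical helper: hashing the experiment name together with the
    user id gives each user a stable, pseudo-random bucket.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).digest()
    # Reduce the first 8 bytes of the hash to a bucket in 0..9999.
    bucket = int.from_bytes(digest[:8], "big") % 10_000
    return "treatment" if bucket < treatment_share * 10_000 else "control"

def conversion_rates(events):
    """Turn (variant, purchased) pairs into a {variant: purchase rate} dict."""
    shown = {}
    bought = {}
    for variant, purchased in events:
        shown[variant] = shown.get(variant, 0) + 1
        bought[variant] = bought.get(variant, 0) + (1 if purchased else 0)
    return {variant: bought[variant] / shown[variant] for variant in shown}
```

The same user always lands in the same group for a given experiment, so they see a consistent page across visits. A real test would also need enough traffic and a significance check before declaring a winner.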

Udi gave an example of a feature that the company tested. When you’re about to purchase a product, they look through all of your past purchases to see if you’ve already bought that item from Amazon.com. If you have, they pop up a big red warning telling you that you might be buying a duplicate item. There are some legit reasons why you’d want a duplicate; maybe you lost the item, or maybe it’s going to be a gift. But many times it turns out that people put a CD in their shopping cart simply because they forgot that they already own it. So Amazon built the feature and tested it.

Sure enough, it decreased sales, because much of the time the consumer didn’t need a duplicate. But Amazon decided to adopt the feature anyway! Even though it meant less revenue in the short term, the better user experience of not having to return an item (hopefully) translates to increased customer loyalty and therefore more long-term revenue.
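The duplicate check itself is simple to sketch. This is only my guess at the logic, assuming the cart and the purchase history are both plain lists of item ids; the function name and data shapes are hypothetical, not anything Udi showed.

```python
def duplicate_items(cart, past_purchases):
    """Return the cart items the customer has already bought, in cart order."""
    # A set makes each membership test constant time on average,
    # which matters when the purchase history is long.
    already_owned = set(past_purchases)
    return [item for item in cart if item in already_owned]
```

Any item returned here would trigger the warning. The customer can still go ahead with the purchase, which covers the gift and lost-item cases.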

Search

Udi spent a bit of time talking about the importance of Search. He described what he sees as 4 generations of web search:

1992-1993: index data from selected sites (Harvest, archie)
1995: collect data from the entire web (Lycos, AltaVista, InfoSeek, Inktomi)
1998: it’s all about relevancy, stupid! (Google)
2001: it’s all about monetization, stupid! (Overture)
and the next generation of Web Search is yet to come

What is missing from Search today? Udi pointed out a bunch of problems waiting to be solved:

Understanding the query (these days we’re still treating search queries as strings of characters)
Understanding the users
Personalization (instead of today’s “democratic” search engines which show everyone the same results for a particular query, should we customize the search results based on what we know about the user?)
Helping the user with query refinement
Better visualization of search results (something better than pages and pages of text, but also something easy enough for people to understand)
Anti-SPAM (there are hundreds of companies in the Search Engine Optimization business who are essentially spamming Google to improve rankings of particular sites.)

E-Commerce

Udi prefaced his comments about e-commerce by pointing out that “business” and merchants are hated in almost all cultures, yet somehow commerce/trading started as early as 4000 BC. Why? Because the alternative for acquiring goods is war and that doesn’t scale too well.

He spoke a bit about the beginnings of Amazon.com (Jeff Bezos’ garage) and showed the audience a screen shot of what Amazon’s home page looked like in 1995 complete with LOTS OF TEXT IN SMALL CAPS. We’ve come a long way, baby.

Udi then moved on to discuss in broad terms some of the problems involved in order fulfillment. Deciding what products to ship from what distribution centers and what to order from publishers or distributors involves all sorts of combinatorics and traveling salesman problems. He gave a particularly hairy example of a Stochastic Linear Program used to optimize shipping of an order of just 2 books. Most of these problems are exponential in complexity, and the site has only 500 milliseconds to make an intelligent decision so it can tell the user how much shipping is going to cost for their order.
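To get a feel for why this blows up, here is a toy brute-force version of the problem. It is not the stochastic linear program Udi showed; it is a deliberately naive stand-in that tries every way of assigning items to distribution centers. The cost model (a fixed cost per box plus a per-item cost, in cents) and all the names are assumptions of mine, and it assumes the items in an order are distinct.

```python
from itertools import product

def cheapest_plan(order, stock, per_box_cost, per_item_cost):
    """Try every assignment of items to centers.

    Returns (cost, {item: center}) for the cheapest plan, or None if
    some item is not stocked anywhere.
    """
    # For each item, list the centers that have it in stock.
    options = [[c for c in stock if item in stock[c]] for item in order]
    best = None
    for choice in product(*options):
        # One box per distinct center used, plus a handling cost per item.
        cost = sum(per_box_cost[c] for c in set(choice))
        cost += sum(per_item_cost[c] for c in choice)
        if best is None or cost < best[0]:
            best = (cost, dict(zip(order, choice)))
    return best
```

The loop visits one candidate plan for every combination of choices, so with k possible centers per item and n items it examines up to k^n plans. That is fine for 2 books and hopeless for a large order inside a 500 millisecond budget, which is why smarter optimization is needed.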

Udi was hoping to talk about Security, too, but he ran short on time. Instead he took some Q&A from the audience. Many questions had to do with specifics about Amazon’s business and development culture, which Udi couldn’t really answer because he’s only been there a month. When asked about what he would change about academia given his experience in both worlds, he said he wanted to see more of a focus on solving real problems. Too many toy problems are given to students just for the sake of learning. As a result, academics don’t often understand the problems of real users. To help remedy this, he would be interested in providing academic institutions some of Amazon’s real data to use for teaching algorithms and modeling.

Lastly, Udi announced that he would be available on Friday morning at UCLA to speak to students about jobs at Amazon.com. I’m guessing that he’s building up a kewl R&D team and wants a crop of freshly minted PhDs.