To Search, Ask

Library science will improve online search.

Jun 23, 2009

If you ask a librarian for a book about Mexico, the librarian will undoubtedly ask you to specify: Are you looking for a history book, a travel guide, or something else entirely? Today’s search engines could benefit from the same approach.

With most existing online systems, a user makes an information request in a couple of words, and the search engine returns a list of documents ranked by relevance. Search technologists are busily working on relevance-ranking algorithms and question-answering systems so that they can read as much as possible into a query without asking any more of the user. But information-retrieval researchers suggest that these approaches have reached a point of diminishing returns. A search engine cannot reliably surmise the user’s intent from a single query.

What we need is human-computer information retrieval (HCIR), a term coined by University of North Carolina professor Gary Marchionini. The HCIR approach advocates for tools that bring human intelligence and attention actively into the search process. Rather than guessing what users need, these tools provide users with opportunities to clarify and elaborate their intent. If the engine isn’t sure what users want, it just asks them. (For another approach to information retrieval, see “Search Me”.)

The HCIR approach evokes what librarians call reference interviews. Indeed, HCIR leans heavily on techniques from library science, such as faceted information retrieval. Adapted for Internet use over the past decade, faceted search extends keyword search by giving users different ways to refine their queries. A search for “Mexico” might offer refinements by topic (history, demographics), language (Spanish), date published, and so on. Not surprisingly, this approach is popular for online libraries, but it has also become a staple of online shopping; Home Depot’s website is an example. HCIR transforms search engines from black-box matching engines to conversational librarians. The core technical challenge is no longer ranking the results but, rather, summarizing and organizing them so that users can interact with them. HCIR offers users the transparency, control, and guidance to establish, elaborate, and resolve their information needs.
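To make the idea of faceted refinement concrete, here is a minimal sketch in Python. It is an illustration only, not a description of how any particular search engine or retailer implements facets. The catalog entries, the facet names (`topic`, `language`), and the helper functions `fold`, `search`, and `facet_counts` are all hypothetical and invented for this example. The sketch runs a keyword match, then summarizes the results by facet so that the user can be asked what they meant.

```python
import unicodedata
from collections import Counter

# Hypothetical toy catalog; every title and facet value is invented for illustration.
CATALOG = [
    {"title": "A Concise History of Mexico", "topic": "history", "language": "English"},
    {"title": "Mexico City Travel Guide", "topic": "travel", "language": "English"},
    {"title": "Historia de México", "topic": "history", "language": "Spanish"},
    {"title": "Demographics of Modern Mexico", "topic": "demographics", "language": "English"},
    {"title": "Cooking of Peru", "topic": "cooking", "language": "English"},
]

# Facets the interface would offer as refinements (an assumption of this sketch).
FACETS = ("topic", "language")


def fold(text):
    """Lowercase and strip accents so that 'México' matches 'mexico'."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c)).lower()


def search(keyword, filters=None):
    """Return documents whose title contains the keyword and that
    match every facet value the user has selected so far."""
    filters = filters or {}
    needle = fold(keyword)
    return [
        doc for doc in CATALOG
        if needle in fold(doc["title"])
        and all(doc[facet] == value for facet, value in filters.items())
    ]


def facet_counts(results):
    """Summarize a result set: for each facet, count how many results
    carry each value. These counts are the refinements shown to the user."""
    return {facet: Counter(doc[facet] for doc in results) for facet in FACETS}


# Step 1: a bare keyword query, as in a conventional search box.
results = search("Mexico")
refinements = facet_counts(results)

# Step 2: the user clarifies intent by picking refinements, one at a time.
history_only = search("Mexico", {"topic": "history"})
spanish_history = search("Mexico", {"topic": "history", "language": "Spanish"})
```

In this sketch the ranking step is trivial on purpose. The work that matters is `facet_counts`, which organizes the result set into choices the user can act on, and each selection simply narrows the filter passed to the next `search` call.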

It’s fun to work on algorithms that guess users’ intentions, and the temptation to push the limits of purely technical solutions can be irresistible. But sometimes the best approach is the most obvious one. We may do well to follow a bit of advice passed on by Nobel laureate Richard Feynman in his book Surely You’re Joking, Mr. Feynman! When discussing with a bar mate how best to pick up women, Feynman recounts, that sage soul averred, “You just ask them.”

Daniel Tunkelang is cofounder of Endeca, an information and search company.