Looking for a book, CD, or movie recommendation? Type in the name of an author that you like at Gnooks.com and up pops a screen of other writers. But what makes the site different is that the authors don’t appear as a scrollable list. Instead, the name you provide sits in the middle of the browser window while the suggested names are sprinkled about, quivering and dancing as though trying to elbow each other out of the way to reach the center.
This is search visualization in action. The closer another writer is to your choice, the more likely the system thinks that you will also enjoy that author’s work. Gnooks and other systems are applying data visualization and relationship analysis techniques to help people get a different view of what’s on the Web. Rather than deliver search results as a long roster of links, graphical searches show how different categories and types of information relate to each other. The hard part is finding a way of presenting the information without requiring the user to get a degree in how to use the interface.
Visualization techniques have existed for some time on the Web. Sites have long allowed users to do such things as click a map region to get all the sales representatives or company locations in that area. Where the newer approaches differ is in complexity, because they can show intricate relationships among items. Gnooks creator Marek Gibney of Hamburg, Germany, started his system as a personal project, using artificial intelligence to produce music recommendations. Now Gnooks as well as Gnoovies (for movies) and Gnoosic (for music) all connect to a central site, Gnod.net (for global network of dreams), and use a similar approach. “If 90 percent of the readers of Douglas Hofstadter also like [Stephen] Hawking, the distance between these two writers in the Hofstadter-Hawking dimension is 0.1,” Gibney says.
All the relative preference information he has comes from users of the site describing their likes and dislikes. Someone names three favorite authors or musical artists or movies; Gnod then shows a series of choices and asks for each if the user likes it or not. As more people express their preferences, Gnod accumulates the information to further refine its suggestions. There is nothing especially new about getting recommendations based on likes and dislikes. What distinguishes Gnod is its use of visual representation to reveal the strength of the recommendation. The distance metaphor shows how closely tied in popularity any two authors are, based on the preference information provided by all users. Graphically, it is representing a set of multidimensional relationships in two dimensions.
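Gibney's Hofstadter and Hawking example can be turned into a small calculation. The following is a minimal sketch, not Gnod's actual code: it assumes the distance is simply the share of one author's fans who did not also name the other, and it places the suggested names on a circle around the chosen author at a radius equal to that distance. The function names, the reader data, and the radial layout are all assumptions made for illustration.

```python
import math

def distance(likes, a, b):
    """Share of a's fans who did NOT also name b (0 means full overlap)."""
    fans_of_a = [liked for liked in likes.values() if a in liked]
    if not fans_of_a:
        return None  # nobody has named a yet, so there is nothing to measure
    missing_b = sum(1 for liked in fans_of_a if b not in liked)
    return missing_b / len(fans_of_a)

def radial_layout(likes, center, others):
    """Put center at (0, 0) and each other name at radius = distance.

    Assumes at least one user has named the center author.
    """
    positions = {center: (0.0, 0.0)}
    for i, name in enumerate(others):
        angle = 2 * math.pi * i / len(others)  # spread names evenly around
        r = distance(likes, center, name)
        positions[name] = (r * math.cos(angle), r * math.sin(angle))
    return positions

# Hypothetical data: ten Hofstadter readers, nine of whom also like Hawking.
likes = {f"reader{i}": {"Hofstadter", "Hawking"} for i in range(9)}
likes["reader9"] = {"Hofstadter", "Tolkien"}

print(distance(likes, "Hofstadter", "Hawking"))  # 0.1
print(radial_layout(likes, "Hofstadter", ["Hawking", "Tolkien"]))
```

Note that this measure is one-directional: the distance from Hofstadter to Hawking is computed from Hofstadter's readers only, so the reverse distance can differ. A real system would also need some way to reconcile many such pairwise distances on a flat screen, which this sketch does not attempt.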
Graphics add a powerful capability to searching because of the way people perceive, says Phil H. Goddard, a director at Human Factors International, a Fairfield, Iowa, consulting firm. “Human beings are spatial processors,” he says. For example, most people find it easier to comprehend data in tabular form than in an unformatted list. Graphical front ends to search engines can organize and present information in ways that let users absorb and process it more efficiently. Such tools, Goddard says, are “capitalizing on the effect that we see patterns and learn patterns and parse patterns faster than we process text.”
But finding the proper form to display information in a manner that users can quickly grasp is not easy. Change the audience and the types of questions they might ask, and the visual form must also shift. Endeca, a Cambridge, MA, company focusing on guided navigation, has a demo of its technology that shows how visual search could help an average person choose a wine. Any given bottle of wine will have a multidimensional set of characteristics, such as origin, flavors, vintage, and price. Someone entering “zinfandel” as a search term would see horizontally formatted lists of text links grouped by characteristic type. Choosing a U.S. zinfandel would bring up national regions as well as price categories, years, and ratings from wine tasters. There are no icons, no pictures. The reason, says Endeca CEO Steve Papa, is that the more complex the visualization, the more savvy the user needs to be. “Some of these graphical interfaces require more sophistication than most people have,” he says.
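The grouping-by-characteristic idea in the wine demo can be sketched in a few lines. This is an illustration only, not Endeca's implementation: the catalog, the field names, and the two helper functions are invented. It narrows a list of wines by the choices made so far, then counts the remaining values under each characteristic so they can be shown as grouped text links.

```python
from collections import Counter

# Hypothetical catalog; every field name and value here is invented.
WINES = [
    {"varietal": "zinfandel", "country": "U.S.", "region": "Sonoma", "price": "under $15"},
    {"varietal": "zinfandel", "country": "U.S.", "region": "Napa", "price": "$15 to $30"},
    {"varietal": "zinfandel", "country": "U.S.", "region": "Sonoma", "price": "$15 to $30"},
    {"varietal": "zinfandel", "country": "Australia", "region": "Barossa", "price": "under $15"},
    {"varietal": "merlot", "country": "Australia", "region": "Barossa", "price": "under $15"},
]

def refine(items, **selected):
    """Keep only the items that match every characteristic chosen so far."""
    return [w for w in items if all(w.get(k) == v for k, v in selected.items())]

def facet_counts(items, skip=()):
    """Count the remaining values under each characteristic, except skipped ones."""
    counts = {}
    for item in items:
        for key, value in item.items():
            if key not in skip:
                counts.setdefault(key, Counter())[value] += 1
    return counts

# A shopper searches for zinfandel, then picks the U.S.
us_zins = refine(WINES, varietal="zinfandel", country="U.S.")
for facet, values in facet_counts(us_zins, skip=("varietal", "country")).items():
    links = ", ".join(f"{value} ({n})" for value, n in values.items())
    print(f"{facet}: {links}")
```

Each click adds one more keyword argument to `refine`, so the shopper narrows the list step by step without ever composing a query by hand.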
To get to the point of considering the right graphical representation, a system must know how data connects. There are various algorithms and approaches; even text-based Google offers a measure of the relevancy that a link has to a search term, and Yahoo! groups links under headings. But what really helps cement relationships is metadata, that is, information about the nature and structure of data.
“The biggest challenge with visualization is data overload,” says Greg Coyle, general manager of Anacubis, a Cambridge, UK-based developer of search visualization tools. “When the data sets get large, it’s a challenge to usefully visually represent that and not scare the hell out of the user.” Effective presentation requires understanding how to categorize the data and relate one piece of information to another. So developers need descriptive information about the underlying data that people want to search.
Anacubis gets this metadata from business partners such as Dun & Bradstreet. Each type of data is represented by an icon, and related data icons are connected by lines. Information arrives at the Anacubis software in a proprietary format using Extensible Markup Language, or XML. The Anacubis application can show a company background as one icon, connected people icons for corporate officers, another image for recent financial results, and so on. The relationships make explicit the links that one might find by reading multiple reports and correlating the results.
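The icons-and-lines picture amounts to a graph in which every node carries a type. Here is a minimal sketch of such a structure, with a hypothetical company record; the class, its methods, and the sample data are assumptions for illustration, and nothing here reproduces the proprietary XML format or the Anacubis software itself.

```python
from dataclasses import dataclass, field

@dataclass
class Graph:
    nodes: dict = field(default_factory=dict)  # node id -> (kind, label)
    edges: set = field(default_factory=set)    # each edge is a frozenset of two ids

    def add_node(self, node_id, kind, label):
        self.nodes[node_id] = (kind, label)

    def connect(self, a, b):
        if a == b:
            raise ValueError("a node cannot be connected to itself")
        self.edges.add(frozenset((a, b)))

    def neighbors(self, node_id, kind=None):
        """Ids linked to node_id, optionally limited to one kind of icon."""
        found = []
        for edge in self.edges:
            if node_id in edge:
                (other,) = edge - {node_id}
                if kind is None or self.nodes[other][0] == kind:
                    found.append(other)
        return sorted(found)

# Hypothetical record: one company, two officers, one financial report.
g = Graph()
g.add_node("co", "company", "Example Corp.")
g.add_node("p1", "person", "Chief Executive")
g.add_node("p2", "person", "Chief Financial Officer")
g.add_node("f1", "financials", "Latest quarterly results")
for other in ("p1", "p2", "f1"):
    g.connect("co", other)

print(g.neighbors("co", kind="person"))  # ['p1', 'p2']
```

The `kind` stored with each node is what would let a display choose a different icon for a company, a person, or a financial report.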
A demonstration version of Anacubis’s capabilities allows someone to search for performers, writers, books, CDs, or movies by passing the request to Amazon.com. Results appear as icons. Users can then right-click any icon and request a related search on Google. Google, however, provides only text information, without metadata. To narrow and sharpen the results, Anacubis looks at the metadata it does have from the Amazon search; a movie, for instance, will have associated director, cast, and writer. So if someone looks for additional information about a DVD of Laurence Olivier’s version of Hamlet, Anacubis could search Google for the terms “Hamlet,” “DVD,” and “movie.”
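The Hamlet example boils down to turning known metadata into extra search terms. A rough sketch under stated assumptions follows: the field names and the `build_query` helper are hypothetical, and the code only assembles a query string rather than calling any real search service.

```python
def build_query(title, metadata, fields=("format", "type")):
    """Combine a title with selected metadata values as quoted search terms."""
    terms = [title] + [metadata[name] for name in fields if name in metadata]
    return " ".join(f'"{term}"' for term in terms)

# Hypothetical metadata returned with a catalog result.
item = {"format": "DVD", "type": "movie", "director": "Laurence Olivier"}

print(build_query("Hamlet", item))  # "Hamlet" "DVD" "movie"
```

Passing a different `fields` tuple, such as `("director",)`, would steer the follow-up search toward the filmmaker rather than the product.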
But visualization front ends are not magic solutions for those who want to find something; a combination of text and Boolean commands can quickly resolve a complex search. Consider, for instance, a wine shopper searching for an Australian merlot with a hint of oak for $7.99. Using visualization would likely take multiple steps to move through the screens of information. And finding the best combination of representation and data organization can be tough.
“Making things findable and understandable is really hard,” says Sue Aldrich, senior vice president at Patricia Seybold Group, a consulting firm for customer-centered business processes. “It’s amazing that we find anything.” So even if users gain better understanding of the data before them, developers should not expect their work to be as easy as pie or pie charts.