Information Outlook, Vol. 6, No. 12, December 2002
Visualization of Information Resources for Professionals
By Tim Bray
Tim Bray has 20 years of experience in the software industry, and is widely recognized as an expert in the problems of searching and retrieving information from large textual databases. In 1999, he founded Antarctica Systems, a pioneer developer of data visualization technology, where he is currently chief technical officer. In 2001, he was nominated by Tim Berners-Lee, inventor of the World Wide Web and director of the Web Consortium, to the Consortium's Technical Advisory Group, which serves an architectural oversight function for the whole Web.
Mapping the Future
Decades ago, when I was in college, our library got a wonderful new addition to its computer system: an online catalog! You could type in an author's name or the title of the book you were looking for and get all the details about that book. You could even find out if it was in the library that day.
Of course, the system wasn't perfect. If the author was J. Green, you got screenful after screenful of matches. If the title was Organic Chemistry, the same thing would happen. If you didn't know the author and title, but you wanted a good basic book on forensic accounting, there was a "keyword" capability, but the usual result was more screenfuls and not much joy.
The whole thing wasn't much to look at, either. The screens were busy and monochromatic, with bits and pieces of information all over the display and not much attention paid to typography.
Then there were those call numbers, with subject codes and so on. QA76.73W39 or 822.23 might mean something to the professional at the desk in the corner, but they were no help when I was looking for that introduction to forensic accounting.
When I go into a research library or the local info resource center today, things are a lot better. There's that electronic catalog, just like when I was a student, but there are also databases about businesses and investments, online research in science and medicine, patent inquiry systems, and, of course, the huge, noisy arena of the Web at large.
But the system still isn't perfect. If the author is J. Green or the title is Organic Chemistry or I just want to get the basics on forensic accounting, I'm still going to be looking at way too many screens full of text. And those screens still aren't much to look at; they're often busy and monochromatic, poorly laid out, and full of complicated subject codes that don't mean anything to me.
But things do get better, and the evidence is right in front of you every day if you work with a computer. When I was an undergraduate, if you were one of the lucky few with computer access and you wanted to see what was on your hard drive, you had to type something like "DIR /W" or "ls -l" and deal with a long ugly list of all the files, accompanied by unhelpful metadata such as "rwxr-xr-x" and "57/119."
Now we have graphical user interfaces (GUIs), including icons, windows, wastebaskets, file folders, point-and-click, colorful displays, and drag-and-drop. The GUI is the world's most successful application of information visualization, even though that's not how most people think of it. We've gotten so used to it that we don't realize that those hard drives don't really contain any buff-colored folders or cute little wastebaskets or hands that show things are being shared; it's all a clever visual metaphor to help us use our own data.
When Steve Jobs and Bill Gates were running around in the late eighties telling us we needed graphical user interfaces, they didn't admit that they were selling information visualization. They just said that more people would get better use out of computers if they had visual interfaces. It turns out they were right.
So why, when I sit down in front of a database or an OPAC, a search engine, or a patent catalog, am I still typing in queries and pressing "Enter" and reading screen after screen full of boring, ugly listings? Why can't I have a visual interface for online data just like I have for my computer's hard drive? I'm pretty sure that we can have a visual interface, once we have the bugs worked out. But it's a hard problem, and we've only really started to work on it.
The first question is what kind of a visual interface we need for shared online data. One obvious answer would be to try to replicate the successful "desktop metaphor" that's served us well for over a decade for our personal data. This has been tried, but it has not worked out well. When a database has a couple million objects and a taxonomy eight levels deep with 300,000 nodes, it's hard to believe all that will fit on a desktop. In a sizable database, even simple searches can produce thousands of results, more than you can really handle with little folders.
There's another point too, subtle but important. When you've got files on your desktop, the computer knows relatively little about them: their name, size, date, and (maybe) what kind of thing they are. In any kind of online database, we know a lot more: subject headings, retrieval frequencies, revision dates, author, title, publisher, access rights, and other things. That's one of the reasons those screens are so busy. While the desktop GUI uses little pictures of pieces of paper and abacuses to tell you what kind of data are in each file, there's no obvious way to use a little picture to tell you what the subject is or how popular it is.
Maybe a visual interface to online information is inevitable, but the desktop metaphor isn't up to handling the increased size and richness of online data. So you might decide that some really new thinking is required to map out the visual interface of the future.
Science fiction movies have presented all sorts of gripping and wonderful ideas about information interfaces of the future. Mostly, they're three-dimensional, animated, colorful, and look like nothing you've ever seen in the real world. This is surprising, because of the desktop metaphor's success in imitating something well known and firmly rooted in the days before computers. On the other hand, this aggressive three-dimensional interface idea is not entirely nutty. Here's a quiz: the GUI that every information worker in the world looks at all day is the second most popular human-computer interface. What's the most popular? The video game, and the first generation of people who find it natural to use a computer to navigate an imaginary three-dimensional world has already been in the workforce for a while now. In the future, we may find ourselves shooting through 8,000-foot-high purple virtual towers representing Securities and Exchange Commission filings, or burrowing through eerily glowing, twisty virtual dungeon passages representing the results of recent medical research.
Using Maps
When we launched Antarctica, we had two interfaces. One was cyberspace-like and three-dimensional (now retired); the second was based on much older thinking, namely cartography, the art and science of mapmaking. Maps are something most people can understand at a glance without thinking very much. But if you look deeper, there's nothing shallow or simple about mapmaking. Mapmakers are masters of typography, color, shape, detail, and making what's important catch the eye.
Maps can pack incredible amounts of information into each square inch of display. This is an important point and worth a bit of a side trip. In the field of graphical communication, probably the world's leading thinker is Edward Tufte (see www.edwardtufte.com), a popular lecturer and the author of three superb books: The Visual Display of Quantitative Information, Envisioning Information, and Visual Explanations. Tufte's work ranges too broadly and probes too deeply to be summarized here, but one of his key points is that the more information you give people, the better. This is a bit surprising in an age in which everyone complains about information overload, but Tufte has the research numbers to back his point. Assuming you can avoid an overly busy, cluttered display, people will always prefer to be told more rather than less. If you measure how much information you can get onto a page or a screen, maps win; their artful mix of text, graphics, color, and layout can move more data off the page and into the brain in less time than anything else.
How about maps as the basis for the next generation of visual interfaces? This would have been a silly idea up until maybe 1995. To draw a reasonable-looking map, you'd need a graphics workstation that ordinary people couldn't afford. These days, any modern computer can display thousands (often millions) of colors and comes with a free piece of software that contains a powerful graphics engine. That software is the Web browser, usually Microsoft Internet Explorer. But with the advent of graphical PDAs and cell phones and AOL's use of Gecko, we're almost certainly headed back to a multi-browser future.
The idea is that server-side software chews away on your big online repository and figures out how to map it onto a two-dimensional virtual space. Then some human decides what metadata is going to be used to decorate the map and what makes things important enough to be brought to the top. The software combines the human-generated rules with the database output and sends the results to the Web browser as a combination of old-fashioned HTML and graphics files.
This turns out to be quite difficult to do efficiently. Since no one has ever drawn maps of pure information before, we've probably got some more work in front of us to get everything just right. But the graphics are beautiful and the idea is compelling: Instead of guessing what's in the controlled vocabulary and how the taxonomy lays out, you point and click, drilling your way through a half dozen maps to get to the forensic accounting materials. Instead of doing a search and seeing "1 through 12 of 315,073 results," you get a map of your search results, so you can point and click straight into the 45 matches in the organic chemistry section of the database. Instead of sitting down in front of a complex, uninviting query screen for a database you've not used before, you start with a colorful visual overview of more or less what's in it and where.
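To make the pipeline described above a little more concrete, here is a minimal sketch in Python. It is not Antarctica's method. It swaps in the simplest layout I can think of (vertical strips whose width is proportional to the number of records in each subject heading) and a single human-chosen rule (highlight any region that contains search hits). The names Category, Region, layout, and render are all hypothetical, as are the colors and pixel sizes.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class Category:
    """A hypothetical top-level subject heading in the repository."""
    name: str
    items: int      # number of records filed under this heading
    hits: int = 0   # records matching the current search, if any


@dataclass
class Region:
    """A rectangle on the two-dimensional map, measured in pixels."""
    category: Category
    x: float
    y: float
    width: float
    height: float


def layout(categories, width, height):
    """Assign each category a vertical strip; strip width is proportional
    to its share of all records, largest category first."""
    total = sum(c.items for c in categories)
    regions = []
    x = 0.0
    for c in sorted(categories, key=lambda c: c.items, reverse=True):
        w = width * c.items / total
        regions.append(Region(c, x, 0.0, w, height))
        x += w
    return regions


def render(regions, highlight=lambda c: c.hits > 0):
    """Combine the layout with a human-chosen highlight rule and emit
    plain HTML: one absolutely positioned box per region."""
    parts = ['<div style="position:relative">']
    for r in regions:
        colour = "#fc6" if highlight(r.category) else "#ddd"
        label = escape(r.category.name)
        if r.category.hits:
            label += f" ({r.category.hits} matches)"
        parts.append(
            f'<div style="position:absolute;left:{r.x:.0f}px;top:{r.y:.0f}px;'
            f'width:{r.width:.0f}px;height:{r.height:.0f}px;'
            f'background:{colour}">{label}</div>'
        )
    parts.append("</div>")
    return "\n".join(parts)
```

Calling render(layout(categories, 1000, 400)) on a handful of Category objects yields a block of HTML that any browser can draw. A real system would need a recursive layout for a taxonomy many levels deep, real graphics rather than colored boxes, and far more care about efficiency, which is where the hard work lies.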
Now, we may not have all this 100 percent right, and there are certain to be some changes on the road ahead in online information interfaces. But looking back, can anyone seriously believe that in another 10 years we'll still be typing in queries and dealing with busy, cluttered, ugly, monochromatic, subject-code-laden screen after screen full of query results? DIR /W, anyone?


