Lines and Bubbles and Bars, Oh My! New Ways to Sift Data
PEOPLE share their videos on YouTube and their photos at Flickr. Now they can share more technical types of displays: graphs, charts and other visuals they create to help them analyze data buried in spreadsheets, tables or text.
At an experimental Web site, Many Eyes (www.many-eyes.com), users can upload the data they want to visualize, then try sophisticated tools to generate interactive displays. These might range from maps of relationships in the New Testament to a display of the comparative frequency of words used in speeches by Senators Hillary Rodham Clinton and Barack Obama.
The site was created by scientists at the Watson Research Center of I.B.M. in Cambridge, Mass., to help people publish and discuss graphics in a group. Those who register at the site can comment on one another’s work, perhaps visualizing the same information with different tools and discovering unexpected patterns in the data.
Collaboration like this can be an effective way to spur insight, said Pat Hanrahan, a professor of computer science at Stanford whose research includes scientific visualization. “When analyzing information, no single person knows it all,” he said. “When you have a group look at data, you protect against bias. You get more perspectives, and this can lead to more reliable decisions.”
The site is the brainchild of Martin Wattenberg and Fernanda B. Viégas, two I.B.M. researchers at the Cambridge lab. Dr. Wattenberg, a computer scientist and mathematician, says sophisticated visualization tools have historically been the province of professionals in academia, business and government. “We want to bring visualization to a whole new audience,” he said — to people who have had relatively few ways to create and discuss such use of data.
“The conversation about the data is as important as the flow of data from the database,” he said.
The Many Eyes site, begun in January 2007, offers 16 ways to present data, from stack graphs and bar charts to diagrams that let people map relationships. TreeMaps, showing information in colorful rectangles, are among the popular tools.
Initially, the site offered only analytical tools like graphs for visualizing numerical data. “The interesting thing we noticed was that users kept trying to upload blog posts, and entire books,” Dr. Viégas said, so the site added techniques for unstructured text. One tool, called an interleaved tag cloud, lets users compare side by side the relative frequencies of the words in two passages — for instance, President Bush’s State of the Union addresses in 2002 and 2003.
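The comparison behind a side-by-side tag cloud can be sketched in a few lines of Python. This is only an illustration of the idea of relative word frequency, not the code Many Eyes uses; the function names and the two short passages are invented for the example.

```python
import re
from collections import Counter


def relative_frequencies(text):
    """Return each word's share of all the words in the text (0.0 to 1.0)."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def compare_passages(text_a, text_b):
    """Pair the relative frequencies of every word found in either passage."""
    freq_a = relative_frequencies(text_a)
    freq_b = relative_frequencies(text_b)
    vocabulary = sorted(set(freq_a) | set(freq_b))
    return {
        word: (freq_a.get(word, 0.0), freq_b.get(word, 0.0))
        for word in vocabulary
    }


# Placeholder passages, made up for illustration (not real speeches).
passage_a = "data charts data graphs"
passage_b = "graphs maps maps maps"

for word, (share_a, share_b) in compare_passages(passage_a, passage_b).items():
    print(f"{word:8s} {share_a:.2f}  {share_b:.2f}")
```

Using each word's share of the passage, rather than its raw count, keeps a long text from swamping a short one. A display could then size each word according to the two shares.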
Almost all the tools are interactive, allowing users to change parameters, zoom in or out or show more information when the mouse moves over an image, Dr. Wattenberg said.
Users can embed images and links to their visualizations in their Web sites or blogs, just as they can embed YouTube videos. “It’s great that people can paste in a YouTube video of cats” on their blogs, Dr. Viégas said. “So why not a visual that gives you some insight into the sea of data that surrounds us? I might find one thing; someone else, something completely different, and that’s where the conversation starts.”
Rich Hoeg, a technology manager who lives in New Hope, Minn., and has a blog at econtent.typepad.com, was so taken with the possibilities for group collaboration that he wrote a tutorial on using Many Eyes as part of his series called “NorthStar Nerd Tutorials.”
“Many Eyes is unusual, because it takes advantage of the collective intelligence of a group to get more out of a data set,” he said. For the tutorial, Mr. Hoeg exported enrollment data for graduate engineering students to the site, then used one of the tools there to display the information in various ways.
“I wanted people to understand that you can take the same data and have it tell lots of different stories,” he said.
Dr. Wattenberg noted an example from the site. In charting a particular topic, deaths resulting from human violence in the 20th century, one user originally presented a bubble graph in which the size of the circles represented the number of casualties tied to an event (for instance, World War I or World War II). After discussion on the site about the substantial growth in population during the 20th century, the originator offered two new time-based visualizations of the data, a line graph and a stack graph, each plotting the number of casualties against this growing population.
“You could see a new downward trend emerge,” Dr. Wattenberg said. “Violent deaths declined in the latter decades of the century. It’s a slightly more optimistic view.”
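The arithmetic behind that shift in perspective can be sketched as follows. The figures are invented round numbers chosen only to show the effect, not the data posted on the site, and expressing deaths per 100,000 people is one assumed way of setting casualties against population.

```python
def deaths_per_100k(deaths, population):
    """Express a death count as a rate per 100,000 people."""
    return deaths * 100_000 / population


# Hypothetical figures, invented for illustration only.
periods = [
    ("earlier period", 2_000_000, 1_000_000_000),
    ("later period", 3_000_000, 3_000_000_000),
]

for label, deaths, population in periods:
    rate = deaths_per_100k(deaths, population)
    print(f"{label}: {deaths:,} deaths, {rate:.0f} per 100,000 people")
```

In this made-up case the raw count rises from two million to three million, yet the rate falls by half because the population tripled. That is the kind of reversal a population-adjusted chart can bring out.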
Ben Shneiderman, a professor in the computer science department at the University of Maryland, College Park, and a pioneer in information visualization, says sites like Many Eyes are helping to democratize the tools of visualization. “The gift of the Internet is that everyone can participate, and the tools can be brought to a much wider audience,” he said.
Presenting results in a static spreadsheet or table may do the job. “But sometimes it’s like driving with your eyes closed,” he said. “With visualization, it might be possible to open your eyes and see something that will help you” — for instance, patterns, clusters, gaps or outliers in the data.
“The great fun of information visualization,” he said, “is that it gives you answers to questions you didn’t know you had.”