Special Section: InfoVis 2004

Information Visualization (2005) 4, 96–113. doi:10.1057/palgrave.ivs.9500091

A rank-by-feature framework for interactive exploration of multidimensional data

Jinwook Seo (1, 2) and Ben Shneiderman (1, 2, 3)

  1. Department of Computer Science, University of Maryland, College Park, MD 20742, U.S.A.
  2. Human–Computer Interaction Lab, Institute for Advanced Computer Studies, University of Maryland, College Park, MD 20742, U.S.A.
  3. Institute for Systems Research, University of Maryland, College Park, MD 20742, U.S.A.

Correspondence: Ben Shneiderman, Department of Computer Science, A.V. Williams Building, College Park, MD 20742, U.S.A. Tel: +1 301-405-2680; Fax: +1 301-405-6707; E-mail: ben@cs.umd.edu

Received 30 October 2004; Revised 15 January 2005; Accepted 1 February 2005; Published online 19 May 2005.

Abstract

Interactive exploration of multidimensional data sets is challenging because: (1) it is difficult to comprehend patterns in more than three dimensions, and (2) current systems are often a patchwork of graphical and statistical methods, leaving many researchers uncertain about how to explore their data in an orderly manner. We offer a set of principles and a novel rank-by-feature framework that could enable users to better understand distributions in one (1D) or two dimensions (2D), and then discover relationships, clusters, gaps, outliers, and other features. Users of our framework can view graphical presentations (histograms, boxplots, and scatterplots), and then choose a feature detection criterion to rank 1D or 2D axis-parallel projections. By combining information visualization techniques (overview, coordination, and dynamic query) with summaries and statistical methods, users can systematically examine the most important 1D and 2D axis-parallel projections. We summarize our Graphics, Ranking, and Interaction for Discovery (GRID) principles as: (1) study 1D, study 2D, then find features; and (2) ranking guides insight, statistics confirm. We implemented the rank-by-feature framework in the Hierarchical Clustering Explorer, but the same data exploration principles could enable users to organize their discovery process so as to produce more thorough analyses and extract deeper insights in any multidimensional data application, such as spreadsheets, statistical packages, or information visualization tools.
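
To make the idea of ranking 2D axis-parallel projections concrete, the following is a minimal sketch and not the Hierarchical Clustering Explorer implementation. It assumes the chosen feature detection criterion is the absolute Pearson correlation coefficient; the abstract does not name a specific criterion, so this choice, the function names `pearson` and `rank_2d_projections`, and the toy data set are all illustrative assumptions.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation is undefined for a constant column; treat as 0
    return cov / sqrt(var_x * var_y)


def rank_2d_projections(columns, score=pearson):
    """Score every pair of dimensions and sort pairs by |score|, highest first.

    `columns` maps a dimension name to its list of values. `score` is the
    ranking criterion (an assumption here: Pearson correlation).
    """
    scored = [
        (abs(score(columns[a], columns[b])), a, b)
        for a, b in combinations(sorted(columns), 2)
    ]
    return sorted(scored, reverse=True)


# Hypothetical toy data set: three dimensions, five items each.
data = {
    "height": [1.0, 2.0, 3.0, 4.0, 5.0],
    "weight": [2.1, 3.9, 6.2, 8.0, 9.8],
    "noise": [5.0, 1.0, 4.0, 2.0, 3.0],
}

for value, a, b in rank_2d_projections(data):
    print(f"{a} vs {b}: |r| = {value:.3f}")
```

Passing a different `score` function (for example, one that measures gaps or outliers) would rank the same projections by another criterion. The ranked list would then guide which scatterplots to inspect first, in the spirit of "ranking guides insight, statistics confirm".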

Keywords:

Rank-by-feature framework, information visualization, exploratory data analysis, dynamic query, feature detection/selection, graphical displays
