Effective visualization of large multidimensional datasets

Christopher G. Healey
A new method for assisting with the visualization of large multidimensional datasets is proposed. We classify datasets with more than one million elements as large. Multidimensional data elements are elements with two or more dimensions, each of which is at least binary. Multidimensional data visualization involves representation of multidimensional data elements in a low dimensional environment, such as a computer screen or printed media. Traditional visualization techniques are not well suited to solving this problem. Our data visualization techniques are based in large part on a field of cognitive psychology called preattentive processing. Preattentive processing is the study of visual features that are detected rapidly and with little effort by the human visual system. Examples include hue, orientation, form, intensity, and motion. We studied ways of extending and applying research results from preattentive processing to address our visualization requirements. We used our investigations to build visualization tools that allow a user to very rapidly and accurately perform exploratory analysis tasks. These tasks include searching for target elements, identifying boundaries between groups of common elements, and estimating the number of elements that have a specific visual feature. Our experimental results were positive, suggesting that dynamic sequences of frames can be used to explore large amounts of data in a relatively short period of time. Recent work in both scientific visualization and database systems has started to address the problems inherent in managing large scientific datasets. One promising technique is knowledge discovery, "the nontrivial extraction of implicit, previously unknown, and potentially useful information from data". We hypothesise that knowledge discovery can be used as a filter to reduce the amount of data sent to the visualization tool. Data elements that do not belong to a user-chosen group of interest can be discarded, the dimensionality of individual data elements can be c [...]
doi:10.14288/1.0051277 fatcat:ia43zygplfggrjflklxgspqglm