High-dimensional Visualization in Scientific Data

Description

In experiments, scientists record many different aspects of the phenomena they observe in order to analyze and explain them. As a result, large amounts of high-dimensional data are produced. Visualization provides an efficient way to explore the many variables involved, which accelerates knowledge discovery. In our lab, we help domain scientists process and analyze their experimental data, and high-dimensional visualization techniques play an important role in that work.


Multivariate Analysis of Scalar Fields

3D scalar fields are a common kind of data in scientific research. A scalar field here refers to a data volume in which each point in space is measured with multiple attributes. Investigating those attributes and their relationships is challenging in real-world applications. We facilitate the analysis with a visualization that combines parallel coordinates with dimension-reduced projections [2,3]. The parallel coordinates present data values across multiple attributes, while the projections are intuitive for displaying relationships in the data. The data volume is usually shown with a direct rendering, whose color mapping is determined by a so-called transfer function. With the parallel coordinates, users can set up different color mappings by brushing on the attribute values. This lets them quickly link the spatial view with attribute information, which can reveal patterns in the scientific data. Beyond the application, we have also developed a scalable, parallel system [1] for distributed many-core environments to accelerate multivariate volume rendering. We evaluated our system with the help of domain experts and received very positive feedback.
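
The brushing idea can be illustrated with a minimal sketch. It is not the system described in [1-3]; it only assumes that a brush is a set of value ranges on some parallel-coordinate axes, and that every sample point falling inside all ranges of a brush receives that brush's color and opacity. The names `Voxel`, `Brush`, `in_brush`, and `classify`, as well as the attribute names and values, are hypothetical.

```python
from typing import Dict, List, Tuple

Voxel = Dict[str, float]                 # attribute name -> value at one sample point
Brush = Dict[str, Tuple[float, float]]   # attribute name -> (low, high) brushed range
RGBA = Tuple[float, float, float, float]


def in_brush(voxel: Voxel, brush: Brush) -> bool:
    """Return True if the voxel lies inside every brushed range (AND across axes)."""
    return all(low <= voxel[name] <= high for name, (low, high) in brush.items())


def classify(voxels: List[Voxel],
             brushes: List[Tuple[Brush, RGBA]],
             background: RGBA = (0.0, 0.0, 0.0, 0.0)) -> List[RGBA]:
    """Give each voxel the color of the first brush that contains it."""
    colors = []
    for voxel in voxels:
        color = background  # fully transparent unless some brush selects the voxel
        for brush, rgba in brushes:
            if in_brush(voxel, brush):
                color = rgba
                break
        colors.append(color)
    return colors


# Made-up sample points with three attributes each.
voxels = [
    {"temperature": 300.0, "pressure": 1.0, "humidity": 0.2},
    {"temperature": 350.0, "pressure": 0.8, "humidity": 0.9},
    {"temperature": 280.0, "pressure": 1.2, "humidity": 0.5},
]

# Two brushes: hot and humid samples in red, near-standard pressure in faint blue.
brushes = [
    ({"temperature": (340.0, 400.0), "humidity": (0.8, 1.0)}, (1.0, 0.0, 0.0, 0.8)),
    ({"pressure": (0.9, 1.1)}, (0.0, 0.0, 1.0, 0.3)),
]

colors = classify(voxels, brushes)
```

In this sketch the brushes act as a simple multi-dimensional transfer function: the renderer would look up `colors` per sample point, so changing a brushed range immediately changes what is visible in the spatial view.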


Multivariate Analysis of Flow Fields

Flow fields are another important kind of scientific data; they describe flow patterns such as those in the ocean and the atmosphere. They help to interpret complicated phenomena like the emission of carbon dioxide and the spread of pollution. To complement the field itself, we often extract pathlines, that is, the simulated paths followed by particles in the flow. Many attributes come along with the pathlines, but flow visualization practice lacks efficient means for handling them. To address this, we proposed Lagrangian-based Attribute Space Projection (LASP) [4], which applies high-dimensional visualization techniques to the analysis of pathline data.

In LASP, we regard each pathline as a point in the feature space. The distance between two pathlines is measured by accumulating their multivariate distances across multiple time stamps. With these distances, we use a projection to show similarities among the pathlines ((a) in the above image). We also show the detailed attribute information with a scatterplot matrix ((b) in the above image). Users can link the features with the pathlines via brushing and linking interactions.
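
The distance accumulation can be sketched as follows. This is only an illustration under stated assumptions, not the implementation from [4]: it assumes all pathlines are sampled at the same time stamps, that attributes have already been normalized to comparable scales, and that the per-time-stamp multivariate distance is Euclidean. The names `accumulated_distance` and `distance_matrix` are hypothetical.

```python
import math
from typing import List, Sequence

# pathline[t][k] is the value of attribute k at time stamp t.
Pathline = Sequence[Sequence[float]]


def accumulated_distance(a: Pathline, b: Pathline) -> float:
    """Sum the per-time-stamp attribute distances between two pathlines."""
    if len(a) != len(b):
        raise ValueError("pathlines must share the same time stamps")
    total = 0.0
    for attrs_a, attrs_b in zip(a, b):
        # Euclidean distance in attribute space at one time stamp (an assumption).
        total += math.sqrt(sum((u - v) ** 2 for u, v in zip(attrs_a, attrs_b)))
    return total


def distance_matrix(pathlines: Sequence[Pathline]) -> List[List[float]]:
    """Symmetric matrix of accumulated distances between all pathline pairs."""
    n = len(pathlines)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = accumulated_distance(pathlines[i], pathlines[j])
            matrix[i][j] = d
            matrix[j][i] = d
    return matrix


# Three made-up pathlines, two time stamps, two attributes each.
pathlines = [
    [[0.0, 0.0], [0.0, 0.0]],
    [[3.0, 4.0], [0.0, 0.0]],
    [[3.0, 4.0], [6.0, 8.0]],
]
dist = distance_matrix(pathlines)
```

A matrix like `dist` is the kind of input that a distance-based projection (for example, multidimensional scaling) takes to place similar pathlines close together in 2D.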

However, exploring data features through user interaction alone is inefficient. Automatic feature detection algorithms can be adopted to facilitate the analysis. We proposed a novel approach called FLDA [5] for studying unsteady flow fields.

FLDA is based on the Latent Dirichlet Allocation (LDA) model, which is often used in text analysis for topic modeling. In our approach, pathlines and attributes are regarded as documents and words, respectively. Topics are generated with the LDA model, each containing a group of similar pathlines. In the above image, the left part shows the projection of all pathlines, where each small snapshot displays the pathlines belonging to one topic. Dimensional details are also provided for each topic, as shown in the right part. Unlike other methods, our approach clusters pathlines with probabilistic assignments and, at the same time, aggregates the attributes into meaningful patterns (i.e., topics). We evaluated our approach on real-world datasets and obtained positive results that demonstrate its effectiveness.
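
To make the document and word analogy concrete, here is a minimal sketch of turning one pathline into a bag of words. How FLDA [5] actually defines its words is not stated here, so the sketch makes its own assumption: each attribute sample is quantized into one of a few equal-width bins, and the pair of attribute name and bin index is the word. The functions `to_word` and `pathline_to_document`, the attribute names, and the value ranges are all hypothetical, and the LDA fitting step itself is not shown.

```python
from collections import Counter
from typing import Dict, Sequence, Tuple


def to_word(attr: str, value: float, low: float, high: float, bins: int) -> str:
    """Quantize one attribute sample into a discrete word such as 'temp#3'."""
    if high <= low:
        raise ValueError("attribute range must have high > low")
    index = int((value - low) / (high - low) * bins)
    index = min(max(index, 0), bins - 1)  # clamp values on or outside the range
    return f"{attr}#{index}"


def pathline_to_document(samples: Dict[str, Sequence[float]],
                         ranges: Dict[str, Tuple[float, float]],
                         bins: int = 4) -> Counter:
    """Count how often each word occurs along one pathline (its bag of words)."""
    document: Counter = Counter()
    for attr, values in samples.items():
        low, high = ranges[attr]
        for value in values:
            document[to_word(attr, value, low, high, bins)] += 1
    return document


# Made-up value ranges and one made-up pathline sampled at three time stamps.
ranges = {"temp": (0.0, 100.0), "salinity": (30.0, 38.0)}
pathline = {"temp": [10.0, 20.0, 90.0], "salinity": [31.0, 37.9, 38.0]}

document = pathline_to_document(pathline, ranges, bins=4)
```

Stacking such counters for all pathlines gives a document-word count matrix, which is the standard input to an LDA implementation; each fitted topic is then a distribution over attribute words, and each pathline gets a probability for every topic.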


Citations

  1. Hanqi Guo and Xiaoru Yuan. Design and Application of PKU Scientific Visualization System. In Proceedings of National Annual Conference on High Performance Computing (HPC China 2013), pages 551-558, Guilin, China, Oct. 27-31, 2013. (in Chinese).
    | Paper: pdf (1.3 MB) |
  2. Hanqi Guo, He Xiao, and Xiaoru Yuan. Scalable Multivariate Volume Visualization and Analysis based on Dimension Projection and Parallel Coordinates. IEEE Transactions on Visualization and Computer Graphics, 18(9):1397-1410, 2012.
    | Paper: pdf (860 KB) |
  3. Hanqi Guo, He Xiao, and Xiaoru Yuan. Multi-Dimensional Transfer Function Design based on Flexible Dimension Projection Embedded in Parallel Coordinates. In Proceedings of IEEE Pacific Visualization Symposium (PacificVis 2011), pages 19-26, Hong Kong, Mar. 1-4, 2011.
    | Paper: pdf (1.13 MB) | Video: mp4 (4.93 MB) |
  4. Hanqi Guo, Fan Hong, Qingya Shu, Jiang Zhang, Jian Huang, and Xiaoru Yuan. Scalable Lagrangian-based Attribute Space Projection for Multivariate Unsteady Flow Data. In Proceedings of IEEE Pacific Visualization Symposium (PacificVis 2014), pages 33-40, Yokohama, Japan, Mar. 4-7, 2014.
    | Paper: pdf (1.6 MB) | Video: mp4 (5.1 MB) |
  5. Fan Hong, Chufan Lai, Hanqi Guo, Enya Shen, Xiaoru Yuan, and Sikun Li. FLDA: Latent Dirichlet Allocation Based Unsteady Flow Analysis. IEEE Transactions on Visualization and Computer Graphics (SciVis'14), 20(12):2545-2554, 2014.
    | Paper: pdf (4.5 MB) |

© PKU Visualization and Visual Analytics Group 2008-2016