Introduction to GoTree

This tutorial will guide you through the process of writing a visualization specification in GoTree. We first walk through GoTree's decomposition approach, then explain its components one by one using several example specifications. After creating the example visualization, we also show how to embed the final visualization in a web page.

We suggest that you follow along by building a visualization in the online editor, extending your specification as you read. If something does not work as expected, compare your specification with the ones in this tutorial.

This tutorial is divided into two parts. The first part covers the decomposition of tree visualizations into three levels: TreeUnit, subtree group, and axis. The second part covers the declarative grammar that describes one TreeUnit in detail: visual elements, coordinate system, and layout.

The Data

Hierarchical data are often stored in the JSON format, because it is sufficient to express the nested structure. The following shows the standard input format of GoTree. Other hierarchical data formats, such as the Newick format, can be transformed into this JSON format.

    {
      "name": "A",
      "children": [
        {
          "name": "B",
          "children": [
            {"name": "C", "value": 2},
            {"name": "D", "value": 2}
          ]
        },
        {
          "name": "E",
          "children": [
            {"name": "F", "value": 2},
            {"name": "G", "value": 1}
          ]
        }
      ]
    }

The hierarchical data above have seven nodes, each with a unique name attribute and a numerical value attribute. In this example, only leaf nodes carry an explicit value; the value of every other node is the sum of the values of its leaf descendants.
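The aggregation rule described above can be sketched in a few lines of Python. This is only an illustration of the rule, not part of GoTree; the helper names `aggregate_value` and `count_nodes` are hypothetical.

```python
import json

# The example hierarchy from this section, in GoTree's JSON format.
TREE_JSON = """
{
  "name": "A",
  "children": [
    {"name": "B", "children": [
      {"name": "C", "value": 2},
      {"name": "D", "value": 2}
    ]},
    {"name": "E", "children": [
      {"name": "F", "value": 2},
      {"name": "G", "value": 1}
    ]}
  ]
}
"""


def aggregate_value(node):
    """Fill in the value of internal nodes as the sum of their children (hypothetical helper)."""
    children = node.get("children", [])
    if children:
        node["value"] = sum(aggregate_value(child) for child in children)
    return node["value"]


def count_nodes(node):
    """Count the node itself plus all of its descendants."""
    return 1 + sum(count_nodes(child) for child in node.get("children", []))


tree = json.loads(TREE_JSON)
total = aggregate_value(tree)
print(count_nodes(tree), total)  # prints "7 7"
```

After the call, B holds 4, E holds 3, and the root A holds 7.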

Tree Visualization Decomposition

Tree visualizations encode parent-child relationships through one of a few different visual representations, such as implicit relative positions of nodes (e.g., adjacency or containment) or explicit visual elements (e.g., straight lines or arcs for edges).

A characteristic of hierarchical data is that each child of a node is itself the root of an independent hierarchy with the same kind of topological structure. Tree visualizations share this characteristic: a subtree of a tree visualization can still be regarded as an independent tree.

For most tree visualizations shown in the figure above, such as the node-link diagram, icicle plot, and Treemap, the visual representation of a subtree within the whole visualization is the same as the result of visualizing it as a standalone tree. Tree visualizations with this characteristic can be divided recursively into several basic components, and assembling these basic components reproduces the original tree visualization.


TreeUnit

Based on this characteristic of hierarchical data, we define the basic component of tree visualizations as the TreeUnit. Defining a tree visualization as a group of TreeUnits effectively separates its internal parent-child relationships from one another.

By decomposing the whole tree visualization into small components, our method allows describing tree visualizations in a fine-grained manner. It also lets users explore possible tree layouts flexibly, because the encodings of parent-child relationships in different TreeUnits are independent. The figure below shows one TreeUnit with two subtrees.

A TreeUnit contains a root node and its subtrees. In our definition, the root refers to a specific visual element of the tree visualization, whereas each subtree only represents the space it occupies, because we do not care about the hierarchical structure within the subtrees of a TreeUnit. To determine a TreeUnit, we need to compute the x, y, width, and height of the root and of each subtree.

Specifically, the positional attributes are measured from the upper left corner of the whole TreeUnit, and all of these attributes are relative values. In particular, we do not need to determine the x and y of the TreeUnit itself, which are determined by the computation at the level above it.

A TreeUnit only captures the relationships between parent and child nodes and among sibling nodes. However, some tree visualization layouts (e.g., flexTree, force-directed tree, and Bubble Treemap) also consider other interrelationships among the nodes.
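To make the idea of relative attributes concrete, here is a minimal Python sketch. It assumes that "relative" means fractions of the enclosing TreeUnit's width and height, which is one possible reading of the description above; the `Box` class and the `resolve` function are hypothetical names, not GoTree API.

```python
from dataclasses import dataclass


@dataclass
class Box:
    x: float
    y: float
    width: float
    height: float


def resolve(relative: Box, unit: Box) -> Box:
    """Turn a box given relative to a TreeUnit into absolute coordinates.

    Assumption: relative values are fractions of the TreeUnit's size,
    measured from the TreeUnit's upper left corner.
    """
    return Box(
        x=unit.x + relative.x * unit.width,
        y=unit.y + relative.y * unit.height,
        width=relative.width * unit.width,
        height=relative.height * unit.height,
    )


# The absolute box of the TreeUnit is supplied by the level above it.
unit = Box(x=100, y=50, width=200, height=80)

# One root on top and two subtrees side by side below it (relative values).
root = Box(0.0, 0.0, 1.0, 0.25)
subtrees = [Box(0.0, 0.25, 0.5, 0.75), Box(0.5, 0.25, 0.5, 0.75)]

absolute_root = resolve(root, unit)
absolute_subtrees = [resolve(s, unit) for s in subtrees]
print(absolute_root)
print(absolute_subtrees)
```

Because each subtree box is itself the enclosing box of another TreeUnit, applying `resolve` recursively would place the whole tree.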


Subtree Group

As discussed above, the TreeUnit can be regarded as the basic component of tree visualizations, and assembling TreeUnits recursively reproduces the original tree visualization. As a result, we only discuss the layout decomposition for one TreeUnit.

A tree layout determines the attribute values of the nodes in a tree visualization. Specifically, these visual attributes are the position and length of each node, which need to be determined along each axis of the coordinate system. We divide the layout strategies of one TreeUnit into two categories. In the first category, the visual attributes on different axes are interdependent; in the second category, the layout method computes the encoded attributes along each axis independently.

Tree visualizations in the first category include circle packing, Squarified Treemap, and Voronoi Treemap, among others. The figure below shows these three examples.

Circle packing needs to compute and optimize the Euclidean distances between circles, which depend on $(x_1 - x_2)^2 + (y_1 - y_2)^2$. Squarified Treemap and Voronoi Treemap need to optimize the aspect ratio of each node (width / height). In both cases, the layout must compute the attributes on the two axes together. The algorithms of such axis-interdependent layout methods vary greatly and are developed according to different criteria, so we do not decompose these tree visualization methods further.

A TreeUnit is composed of a root and its subtrees. The relative positions among these elements encode both the relationship between parent and children and the relationships among the subtrees. To separate these two kinds of relationships, we define the subtree group as the collection of the subtrees under the root node. As shown in the figure below, the parent-child relationships become the relationship between the root and the subtree group, and the relationships among siblings become the relationships among the subtrees within the subtree group.
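The two quantities mentioned above can be written down directly, which makes the coupling between the axes visible: neither can be evaluated from the x attributes or the y attributes alone. The following Python sketch is illustrative only; `circles_overlap` and `aspect_ratio` are hypothetical helper names, not functions of any layout library.

```python
import math


def circles_overlap(c1, c2):
    """Each circle is (x, y, radius). The test needs both coordinates at once."""
    (x1, y1, r1), (x2, y2, r2) = c1, c2
    distance = math.hypot(x1 - x2, y1 - y2)  # sqrt((x1 - x2)^2 + (y1 - y2)^2)
    return distance < r1 + r2


def aspect_ratio(width, height):
    """Width / height of a rectangle; values near 1 are close to a square."""
    return width / height


print(circles_overlap((0, 0, 1), (3, 4, 3)))  # centers are 5 apart, radii sum to 4
print(circles_overlap((0, 0, 3), (3, 4, 3)))  # centers are 5 apart, radii sum to 6
print(aspect_ratio(4, 2))
```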

Axis

Our method further decomposes the tree layout methods in the second category along each axis independently. Specifically, both the relationship between the root node and the subtree group and the relationships among siblings can be decomposed along each axis.

Parent-child relationship

The parent-child relationships along each axis can be divided into three categories, as shown in the figure below. In the first, the root node includes the subtree group; in the second, the root node is juxtaposed with the subtree group; in the third, the root node is placed within the subtree group. Note that each relationship is specified along a single axis rather than for the visual elements as a whole.

To further determine the positions of the nodes under each relationship, we add parameters borrowed from the CSS box model. When the root includes the subtree group, we add a padding parameter. When the root is juxtaposed with the subtree group, we add a position parameter (top, bottom, left, or right) to determine the relative position of the root and the subtree group, and a margin parameter to determine their exact positions. Specifically, padding creates extra space within an element, while margin creates extra space around an element. Setting the margin to a positive value, zero, or a negative value yields a separation, adjacency, or overlapping relationship, respectively. When the root is placed within the subtree group, we add an alignment parameter, which can be set to left, right, or center along the horizontal axis, and to top, bottom, or center along the vertical axis.
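The three relationships can be sketched as one-dimensional interval computations. The Python below is a minimal illustration under stated assumptions, not GoTree's implementation: the function names are hypothetical, intervals are `(start, length)` pairs along one axis, and padding and margin are treated as fractions of the root's length.

```python
def include(root_start, root_length, padding=0.0):
    """Root includes the subtree group: the group is inset by the padding on both sides."""
    inset = padding * root_length
    return (root_start + inset, root_length - 2 * inset)


def juxtapose(root_start, root_length, group_length, position="bottom", margin=0.0):
    """Root is juxtaposed with the subtree group: the group sits before or after the root.

    A positive margin separates them, zero makes them adjacent,
    and a negative margin makes them overlap.
    """
    gap = margin * root_length
    if position in ("bottom", "right"):
        return (root_start + root_length + gap, group_length)
    if position in ("top", "left"):
        return (root_start - gap - group_length, group_length)
    raise ValueError(f"unknown position: {position}")


def within(group_start, group_length, root_length, alignment="center"):
    """Root is placed within the subtree group: returns the root's interval."""
    if alignment in ("left", "top"):
        start = group_start
    elif alignment == "center":
        start = group_start + (group_length - root_length) / 2
    elif alignment in ("right", "bottom"):
        start = group_start + group_length - root_length
    else:
        raise ValueError(f"unknown alignment: {alignment}")
    return (start, root_length)


print(include(0.0, 100.0, padding=0.25))                # group inset inside the root
print(juxtapose(0.0, 20.0, 60.0, "bottom", margin=0.0))  # group adjacent, after the root
print(juxtapose(0.0, 20.0, 60.0, "bottom", margin=-0.5)) # group overlaps the root
print(within(0.0, 100.0, 20.0, "center"))                # root centered in the group
```

Each function handles a single axis, so a full layout calls one of them for the horizontal axis and one for the vertical axis.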

Sibling relationship

The sibling relationships along each axis can be divided into two categories. The first flattens the siblings, placing them one after another, while the second aligns the siblings according to the specified parameters. For the flatten layout, we add sorting parameters to specify the order of the siblings and margin parameters to specify their relative positions. For the align layout, we add alignment parameters to specify the alignment criteria. The detailed parameters are shown in the figure below; they are similar to those defined for the relationship between the root and the subtree group.
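The two sibling relationships can likewise be sketched along a single axis. This Python sketch is an illustration, not GoTree's implementation: the function names are hypothetical, intervals are `(start, length)` pairs, and for simplicity the margin is an absolute gap rather than a value relative to the subtree group.

```python
def flatten(lengths, group_start=0.0, margin=0.0):
    """Place siblings one after another, in the given order, separated by the margin."""
    intervals = []
    cursor = group_start
    for length in lengths:
        intervals.append((cursor, length))
        cursor += length + margin
    return intervals


def align(lengths, group_start, group_length, alignment="top"):
    """Give every sibling the same reference line inside the subtree group."""
    intervals = []
    for length in lengths:
        if alignment in ("top", "left"):
            start = group_start
        elif alignment == "center":
            start = group_start + (group_length - length) / 2
        elif alignment in ("bottom", "right"):
            start = group_start + group_length - length
        else:
            raise ValueError(f"unknown alignment: {alignment}")
        intervals.append((start, length))
    return intervals


widths = [4.0, 3.0]   # sibling lengths along the horizontal axis
heights = [2.0, 1.0]  # sibling lengths along the vertical axis

print(flatten(widths))                       # siblings touch each other
print(flatten(sorted(widths), margin=1.0))   # sorted ascending, with a gap
print(align(heights, 0.0, 2.0, "top"))       # all siblings start at the same line
print(align(heights, 0.0, 2.0, "center"))    # all siblings share a center line
```

Sorting is applied to the input before flattening, which keeps the ordering criterion separate from the placement itself.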

GoTree Grammar Design

The goal of our grammar’s design is to allow users to specify the visual appearance of tree visualizations without considering the implementation details. In our declarative grammar, we provide users with flexible parameters to ensure its expressiveness and allow users to customize the tree visualizations. To reproduce the original tree visualizations, the specification of our declarative grammar in this section is consistent with the tree visualization decomposition approach above.

The various parameters in our declarative grammar enhance its expressiveness. However, they also increase the complexity of the language and result in a steep learning curve for users. To improve usability and simplicity, we provide default values for some parameters, underlined in our declarative language, to reduce the users' programming burden.

With the default values in place, users only need to specify the Relation under the root and the subtrees. For the length parameters of the grammar, namely SubtreeWidth, SubtreeHeight, RootWidth, and RootHeight, users need to specify the values they encode. The margin and padding in the grammar are relative values. The margin between the root and the subtree group is relative to the whole TreeUnit or to the root node. The margin between the subtrees is relative to the subtree group. The padding is relative to the whole length of the outside container.

Visual Element

The visual element contains several parameters: Link, Node, Color, RootWidth, and RootHeight. Link can be set to hidden, straight, orthogonal, arc (curve), bezier (curve), etc.

Node can be set to hidden, rect, or circle. Color can be set to null, which means that color does not encode any attribute, or it can encode value, depth, height, or width. Like Color, RootWidth and RootHeight can also encode value, depth, height, or width. When RootWidth and RootHeight do not encode an attribute, they are set to adaptive.

Coordinate System

The parameters of the coordinate system are Category, PolarAxis, and PolarInnerRadius. Category can be cartesian or polar. If the category is polar, PolarAxis can be x-axis or y-axis and specifies which axis is mapped to the angular direction of the polar coordinate system.

Layout

To determine the layout of a tree visualization, users need to specify the parameters of each axis independently. Each axis contains Root, which denotes the relationship between the root and the subtree group, and Sibling, which denotes the relationship among the subtrees within the subtree group. The parameters Alignment (center, top, bottom, left, right), Margin, Padding, and Position (top, bottom, left, right) determine the exact positions of the nodes.

Example

Next, we take the icicle plot as an example to explain the grammar design of GoTree. The icicle plot is a traditional implicit tree visualization that is frequently used in real-world applications. We describe it below according to the decomposition approach above.

Icicle plot

The visual elements of the icicle plot contain only nodes and no links, because the parent-child relationships are encoded in the relative positions of the nodes. This improves space utilization. The color, width, and height of the nodes can encode different attributes of the hierarchical data, such as value, depth, and height. Based on the decomposition, the visual element part of the icicle plot is shown below.

    {
      "Element": {
        "Node": "rect",
        "Link": "hidden",
        "Color": "depth",
        "RootWidth": "adaptive",
        "RootHeight": "adaptive"
      }
    }

The coordinate system of the icicle plot is the cartesian system. For the cartesian system, no attribute other than the category needs to be set.

    {
      "CoordinateSystem": {
        "Category": "cartesian"
      }
    }

The layout of the icicle plot can be decomposed into TreeUnits, because the visualization of a subtree is the same as the result of visualizing it as a standalone tree. Each TreeUnit is then decomposed along each axis. The icicle plot lies in a 2D visual space, so we decompose it along two axes below (the horizontal axis and the vertical axis).

The figure below shows the decomposition of one TreeUnit in the icicle plot. <1> The relationship between the root and the subtree group. Along the horizontal direction, the root node includes the subtree group, and the padding parameter is 0. Along the vertical direction, the subtree group is juxtaposed at the bottom of the root node, and the margin parameter is also 0. <2> The relationship among the subtrees within the subtree group. Along the horizontal direction, the layout flattens the subtrees, and the margin parameter is 0. Along the vertical direction, the subtrees are aligned at the top.

Based on the above analysis, we can get the following layout configurations.

    {
      "Layout": {
        "X": {
          "Root": {
            "Relation": "include",
            "Padding": "0"
          },
          "Sibling": {
            "Relation": "flatten",
            "Margin": "0"
          }
        },
        "Y": {
          "Root": {
            "Relation": "juxtapose",
            "Margin": "0"
          },
          "Sibling": {
            "Relation": "align",
            "Alignment": "top"
          }
        },
        "Mode": "bottom-up"
      }
    }
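To see what this layout configuration amounts to, the following Python sketch applies the four rules by hand to the example data from "The Data". It is not GoTree's layout engine: the `icicle` and `leaf_sum` functions are hypothetical, and the sketch assumes that subtree widths are proportional to the aggregated value and that every node occupies one row of fixed height.

```python
DATA = {
    "name": "A",
    "children": [
        {"name": "B", "children": [
            {"name": "C", "value": 2},
            {"name": "D", "value": 2},
        ]},
        {"name": "E", "children": [
            {"name": "F", "value": 2},
            {"name": "G", "value": 1},
        ]},
    ],
}


def leaf_sum(node):
    """Value of a node: its own value for leaves, the sum over children otherwise."""
    children = node.get("children", [])
    return sum(leaf_sum(c) for c in children) if children else node["value"]


def icicle(node, x=0.0, y=0.0, width=7.0, row_height=1.0, out=None):
    """Return {name: (x, y, width, height)} for every node.

    X / Root:    include, padding 0  -> the subtree group spans the root's full width.
    X / Sibling: flatten, margin 0   -> subtrees are placed side by side without gaps.
    Y / Root:    juxtapose, margin 0 -> the subtree group starts right below the root.
    Y / Sibling: align, top          -> all subtrees start at the same y.
    """
    if out is None:
        out = {}
    out[node["name"]] = (x, y, width, row_height)
    total = leaf_sum(node)
    cursor = x
    for child in node.get("children", []):
        child_width = width * leaf_sum(child) / total
        icicle(child, cursor, y + row_height, child_width, row_height, out)
        cursor += child_width
    return out


rects = icicle(DATA)
for name, rect in rects.items():
    print(name, rect)
```

With a total width of 7, B and E split the root's width 4 to 3, and the leaves C, D, F, and G sit in the third row with widths 2, 2, 2, and 1.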

Sunburst

Although the sunburst looks quite different from the icicle plot, transforming the icicle plot into a sunburst only requires changing the attribute values of the coordinate system.

    {
      "CoordinateSystem": {
        "Category": "polar",
        "PolarInnerRadius": "0",
        "PolarAxis": "x-axis"
      }
    }
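The effect of switching to the polar category can be illustrated with a small coordinate transform. The Python sketch below assumes that with PolarAxis set to x-axis the normalized x coordinate is mapped to the angle and the normalized y coordinate to the radius, starting from the inner radius; the `to_polar` function and its parameters are hypothetical and only illustrate the idea.

```python
import math


def to_polar(x, y, inner_radius=0.0, outer_radius=1.0):
    """Map a point of the unit square to the plane of a polar layout.

    Assumption: x in [0, 1] becomes the angle (one full turn),
    y in [0, 1] becomes the radius between the inner and outer radius.
    """
    angle = 2 * math.pi * x
    radius = inner_radius + y * (outer_radius - inner_radius)
    return (radius * math.cos(angle), radius * math.sin(angle))


# The left edge of the icicle plot, halfway down, lands on the positive x axis.
print(to_polar(0.0, 0.5))

# A quarter of the way across, at the bottom edge, lands a quarter turn around.
print(to_polar(0.25, 1.0))
```

Under this mapping, each rectangle of the icicle plot becomes an annular sector, and the root row at the top of the icicle plot becomes the innermost ring.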