Visualization of Beijing Air Quality


Play with it, grab the source, and fork the GitHub repo!



  • Hourly samples over 14 days
  • Pollutant concentration by location (27 monitor stations)
    1. SO2
    2. NO2
    3. PM10
    4. PM2.5

Extracted from map:

  • Outline of Beijing map
  • Coordinates of each monitor station

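As a rough illustration of the dataset described above, the hourly samples could be held as flat records and reshaped on demand. This is only a sketch, not the structure used in the actual visualization: the `Sample` fields, the station identifiers, and the `to_grid` helper are all hypothetical names, and the measurement units are not stated in the source.

```python
from dataclasses import dataclass

POLLUTANTS = ("SO2", "NO2", "PM10", "PM2.5")
DAYS, HOURS = 14, 24  # hourly samples over 14 days


@dataclass(frozen=True)
class Sample:
    station: str    # hypothetical station identifier
    pollutant: str  # one of POLLUTANTS
    day: int        # 0..13
    hour: int       # 0..23
    value: float    # concentration (units not given in the source)


def to_grid(samples, station, pollutant):
    """Return a DAYS x HOURS grid of values for one station and pollutant.

    Cells without a sample stay None so that gaps remain visible.
    """
    grid = [[None] * HOURS for _ in range(DAYS)]
    for s in samples:
        if s.station == station and s.pollutant == pollutant:
            grid[s.day][s.hour] = s.value
    return grid
```

Keeping the records flat and deriving each view from them means the same data can feed a map, a per-hour summary, or a day-by-hour table without being restructured.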

I aim to clarify the following aspects of the data with this visualization:

  • Which parts of the city are more heavily polluted.
  • Which parts of the city suffer from the highest intensity of a given type of pollutant.
  • At what time of day the air in Beijing is most polluted.
  • At what time of day the air is most polluted, for a given place and / or a given type of pollutant.

My design consists of three interrelated parts:

  • A map: a map is the most natural way to encode locations. Users can easily identify from the map which parts of the city suffer from serious pollution.
  • A stacked radial area chart: it encodes time (24 hours) around a circle and stacks the intensity of each pollutant by hour. From this chart, users can see how pollution intensity changes over a day (both per pollutant and for all pollutants combined), as well as a rough level of overall pollution at a given monitor station.
  • A “tiles” chart: a 14 × 24 table in which each cell displays the pollution intensity for a given hour on a given day. This is the view closest to the raw data.
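The stacked radial area chart can be thought of as two steps: stack the pollutants hour by hour, then map each hour to an angle on the circle. The sketch below shows one way to do that. It is not the code behind the visualization; the function names, the stacking order, and the choice to place hour 0 at the top and run clockwise are all assumptions made for illustration.

```python
import math

HOURS = 24


def stack_by_hour(series):
    """Stack hourly values per pollutant.

    series: dict mapping pollutant name -> list of HOURS values.
    Returns a dict mapping pollutant name -> list of (inner, outer)
    radii per hour, stacked in the dict's insertion order.
    """
    baseline = [0.0] * HOURS
    bands = {}
    for name, values in series.items():
        outer = [b + v for b, v in zip(baseline, values)]
        bands[name] = list(zip(baseline, outer))
        baseline = outer  # the next pollutant sits on top of this one
    return bands


def to_xy(hour, radius):
    """Convert (hour, radius) to screen coordinates around the origin.

    Hour 0 points up and hours advance clockwise; y grows downward,
    as in most screen coordinate systems.
    """
    angle = 2 * math.pi * hour / HOURS
    return (radius * math.sin(angle), -radius * math.cos(angle))
```

The outer radius of the last band at each hour gives the combined intensity of all pollutants, which is what makes the overall daily trend readable at a glance.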


There are two dimensions to select: station and pollutant.

Users can select a station by clicking on the corresponding circle on the map, or select “all stations” by clicking anywhere else on the map.

Users can select a type of pollutant by clicking on the pollutant name at the bottom, or by clicking on the corresponding ring in the stacked radial area chart.
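The two selection dimensions can be modelled as a small piece of shared state that every chart reads from. The following is a minimal sketch under stated assumptions: `Selection`, `select_station`, `select_pollutant`, and `matches` are hypothetical names, and treating "no pollutant selected" as "all pollutants" is an assumption that the description above does not confirm.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Selection:
    station: Optional[str] = None    # None stands for "all stations"
    pollutant: Optional[str] = None  # None stands for all pollutants (assumption)


def select_station(sel, station):
    """Click on a station circle, or pass None for a click elsewhere on the map."""
    return replace(sel, station=station)


def select_pollutant(sel, pollutant):
    """Click on a pollutant name or on its ring in the radial chart."""
    return replace(sel, pollutant=pollutant)


def matches(sel, station, pollutant):
    """Return True if a sample with this station and pollutant is in the selection."""
    station_ok = sel.station is None or sel.station == station
    pollutant_ok = sel.pollutant is None or sel.pollutant == pollutant
    return station_ok and pollutant_ok
```

Because the two dimensions are independent, changing one leaves the other untouched, so a user can keep a pollutant selected while moving between stations.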

What My Visualization Does Not Have

Beyond what I implemented, I also considered the following designs, which I abandoned in order to simplify the visualization and avoid distractions.

  • Split each tile into four smaller tiles, one per type of pollutant. This is distracting because the values of different pollutants are not directly comparable. For example, the SO2 value always seems to be lower than those of the other pollutants at any given time and station.
  • For each monitor station on the map, encode its time-series values in some way, for example as a small stacked radial area chart.

Lessons Learned

  • Design the data structure with extra care before anything else, and everything else will be a breeze. Modifying the data structure later on is painful.
  • The most complicated visualization, or a chart with the largest amount of information, is not necessarily the best choice. Too much information can be confusing and distracting. Instead, it is usually a good idea to break down the data into logical groups, present one group at a time, and let the user explore them through interaction.
public_course/visclass_f12/assignment/a02/chengye/start.txt · Last modified: 2012/12/03 23:57 by f12_chengye