At its core, every chart, graph and visualization that we make is about one thing: communication. We communicate to our audience, and even to ourselves. If the audience cannot understand a chart or graph with relative ease, then we have not communicated it clearly. Granted, a chart is only one medium of communication. There is also the context in which the chart is presented, how it relates to the overall message of the presentation, and how the presenter verbally explains the chart and hence its message. However, that topic is best saved for another time.
As the topic indicates, we are going to touch on communicating a clear message in charts. A chart is a kind of picture and, as the saying goes, a picture is worth a thousand words; the same holds for a chart. Unlike a painting, however, a chart should not take a person more than a few moments to read and understand correctly. A chart needs to be simple enough to communicate its message almost instantaneously. Ideally, the audience can understand a chart without the presenter having to explain it further; that is the ultimate success in communicating a clear message.
This principle is important when a person needs to communicate a message through a limited medium, such as a brochure. A typical brochure has very limited space. It is also normally meant for mass distribution, which means the originator of the message cannot easily communicate it face to face. In a corporate setting like the one I am used to, this is less critical. More often than not, the presenter has ample opportunity to explain his or her chart. Yet this does not mean that one should care any less about the way one presents the information.
William Playfair’s Chart
Today’s information age allows us to make charts in an instant, using a variety of software and tools. Not so long ago, chart makers had to think carefully about the visual medium they would use to communicate their message. Back then, making just one chart was difficult, as a chart maker had to draw and plot it by hand. Take, for example, this visual from the year 1786 by William Playfair, the founder of statistical graphics.
The chart clearly communicates the state of England’s exports to and imports from Denmark and Norway, a balance which greatly favours England from about 1754 onwards. Playfair shaded the areas between the two lines in two distinct colours to indicate when the trade balance was in favour of England and when it was against. That was the message of his chart. An audience reading this chart would get the message instantly and could then begin to ask follow-up questions, such as what factors contributed to this change of events, or what types of trade were used to generate this information. If those areas were not shaded, a reader might still arrive at the same conclusion, that trade is in favour of England past 1754, but would need more time to get there. Worse still, an inexperienced communicator may not only fail to highlight the message, but also choose an entirely wrong type of chart.
Pie Chart and Bar Chart: An Ongoing Debate
I will now touch upon some basics of charting through the classic pie chart versus bar chart debate. Although I may visit this topic on its own in the future, it is important here due to its application in the real-life example that I will show at the end of this post.
Refer to the pie chart above. If the data labels were not there, would you be able to tell which item is larger, Item 1 or Item 2? Those who are familiar with pie charts may note that items in a pie chart are usually ordered by size, and as such, Item 1 must be larger than Item 2. However, a pie chart may be constructed in any order, and a typical audience would not know which convention was used. Item 1 is indeed bigger than Item 2, but only slightly so. Their sizes are so close that they are almost indistinguishable. One can argue that the data labels are included in the chart and that one can read the individual values from them. That is true. However, you do not want your audience to think about the mechanics of the chart. You want to engage the audience with the message of the chart. You want them to ask engaging questions about that message. Furthermore, even with data labels, we humans grasp visual cues (such as shape and colour) far more clearly and quickly than the written word.
In contrast, let us look at the same data presented as a bar chart, as shown above, alongside the original pie chart for comparison. In comparing the two, your eyes are immediately drawn to the colours. For instance, you instinctively know that the blue Item 1 in the pie chart corresponds to the blue Item 1 in the bar chart, without needing to refer to either the data label or the category label. This is an example of how visual cues are more powerful than the written word. Look at the bar chart again. Is it not easier now to distinguish between Item 1 and Item 2? Even though the difference in their sizes is small, your eyes can easily and immediately determine that Item 1 is larger. That is not so easily achieved using the pie chart.
Now, why are bar charts easier to read than pie charts? It all comes down to how the brain perceives lengths and areas. In a bar chart, the brain only needs to compare one-dimensional data, namely the length of each bar. The width of each bar is always the same and only serves to separate one bar from another. In a pie chart, on the other hand, the brain has to process two-dimensional data, namely the area covered by each slice. The human brain is not adept at comparing exact magnitudes when two-dimensional data, such as areas, are involved. It is arguable, though, as you will see in the next few paragraphs.
So, does that mean pie charts are bad? Not necessarily. Again, to emphasize the need for a clear message in charts, the type of chart that you use must align with the message of your chart. If your message involves a distinct comparison of multiple items, then a bar chart would probably serve you best. Look at the chart below and note how easy it is to compare the individual bars within each bar chart and between the two bar charts. Again, that task is more challenging using the pie charts.
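To make the length-versus-slice argument a little more concrete, here is a minimal sketch in Python. The shares are hypothetical (they are not the values behind the charts in this post), and `pie_angles` and `text_bars` are helper names made up for this illustration. The sketch converts the same data into pie slice angles and into text bars that share a common baseline.

```python
# Hypothetical shares in percent; NOT the data behind the charts in this post.
shares = {"Item 1": 26.0, "Item 2": 25.0, "Item 3": 20.0,
          "Item 4": 17.0, "Item 5": 12.0}


def pie_angles(values):
    """Return the central angle (in degrees) of each pie slice."""
    total = sum(values.values())
    return {name: 360.0 * v / total for name, v in values.items()}


def text_bars(values, width=50):
    """Render one text bar per item; bar length is proportional to the value."""
    largest = max(values.values())
    lines = []
    for name, v in values.items():
        bar = "#" * round(width * v / largest)
        lines.append(f"{name:<8}{bar} {v:g}")
    return lines


angles = pie_angles(shares)
# The two largest slices differ by only a few degrees of rotation.
print(f"Angle gap, Item 1 vs Item 2: {angles['Item 1'] - angles['Item 2']:.1f} degrees")

# In the bar rendering the same gap is a length difference on a shared baseline.
for line in text_bars(shares):
    print(line)
```

In the pie, the two leading slices start at different rotations, so the small gap has to be judged from angle and area. In the bar rendering, both bars start at the same left edge and only their ends need to be compared.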
When is it good to use pie charts, then? In my experience, pie charts are great for comparing a slice against the whole when the slices have values close to ¼, ½, and ¾ of the total. Refer to the following chart and note how easy it is to observe that Item 1 in the pie chart constitutes about 50% of the whole, or that Item 2 and Item 3 each contribute about a quarter. One can also easily combine Items 2 and 3 and say that together they form about half of the data. These observations are considerably more difficult to make from the bar chart.
The usefulness of pie charts is an ongoing debate, in my opinion. It is a topic best reserved for another day in this blog. If you are interested in reading further, I suggest you look at research articles relevant to the matter. Here is one by Tufts University which used fNIRS brain scans to measure brain responses when subjects were presented with different types of charts. The research observed that there is ‘very little difference in response time and error’ between bar charts and pie charts. Here is another article, a research paper from the Applied Cognitive Psychology journal, in which the researchers concluded that pie charts are better than bar charts ‘when tasks other than direct magnitude estimation are required’. Interesting, is it not? It is a topic which we may visit again someday.
The topics discussed above cover two very important aspects of communicating a clear message in charts, and those are:
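The rule of thumb above can be written down as a small check. This is only a sketch: the `near_benchmark` helper, the 3-percentage-point tolerance, and the sample shares are my own assumptions for illustration, not a published threshold.

```python
# Fractions of the whole that are easy to recognise in a pie chart.
BENCHMARKS = (0.25, 0.5, 0.75)


def near_benchmark(values, tolerance=0.03):
    """Map each item to the benchmark fraction its share is close to.

    An item maps to None when its share of the total is further than
    `tolerance` (an assumed cut-off) from every benchmark.
    """
    total = sum(values.values())
    result = {}
    for name, v in values.items():
        share = v / total
        closest = min(BENCHMARKS, key=lambda b: abs(share - b))
        result[name] = closest if abs(share - closest) <= tolerance else None
    return result


# Hypothetical data resembling the half / quarter / quarter case described above.
shares = {"Item 1": 49.0, "Item 2": 26.0, "Item 3": 25.0}
print(near_benchmark(shares))
```

If most slices map to a benchmark, a pie chart may be worth considering. If most map to `None`, the part-to-whole reading that pies are good at is probably not available.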
- Highlight the message visually within the chart itself, and
- Use the right type of chart that aligns with the message
I used William Playfair’s chart as a good example of highlighting the message visually within the chart itself. I have also shown how the right type of chart (in this topic, bar vs. pie) can visually help to convey the message better. I believe it is only fair that I conclude with a real-life example that applies the lessons we have established in this topic.
Designing Charts Using a Clear Message as a Guide
Take a look at the visual above. It is a real-life example that I encountered in my career. The data has been modified to avoid any possible identification of company and personnel based on the source data. I believe that this modified version still illustrates the importance of a clear message in charts. The visual shows the causes of packet loss (a type of error in digital communication in which signals fail to reach their destination) faced by Company A in 2007 and 2008. The two 3D pie charts show the breakdown of the causes by percentage, with the categories listed on the right-hand side of each chart. The two boxes below duplicate the information shown in the charts. Two causes from each year are highlighted in red and white by the chart maker.
From the visual, one can observe that the chart maker did have a message that he intended to communicate to his audience. He wanted to show how the top causes from 2007 and 2008 (highlighted in red and white, respectively) compare across the two years. Indeed, thanks to the colour coding (red and white), one can easily observe that the top cause for 2007, ‘port speed settings’, which accounted for 61.4% of the total, dropped to 14% (fourth largest) in 2008. On the other hand, ‘transmission capacity’, which only accounted for 7.9% (fourth largest) of the total in 2007, rose to 24% in 2008, making it the top cause of packet loss for that year. The 2007 pie chart tells the audience that ‘port speed settings’ was the major cause of packet loss, with more than half of the causes attributed to this one factor alone. The 2008 pie chart tells us that although ‘transmission capacity’ was the top cause, several other categories were not far behind it, indicating that the team had indeed controlled the major errors contributed by ‘port speed settings’ in 2007. Although the message is clear in this visual, was the use of pie charts, and in this example 3D pie charts, the right choice?
In my opinion, 3D pie charts are notoriously difficult to compare because the added third dimension, depth, may hamper our ability to compare data effectively. Take a look at the example below, which shows the same data as a 2D pie chart and as a 3D pie chart. The data labels are intentionally left out so we can compare the slices without distraction.
The blue and red slices are both equal in size (30%). They look about the same in the 2D chart. In the 3D chart however, the red slice looks significantly larger due to the third dimension perspective. It is closer to us, therefore it looks bigger. Let us apply this comparison to the real-life example, without the data labels for comparison purposes.
It would appear that, at least visually, the differences between the 2D and 3D pie charts in this real-life example are negligible. This is due to the fact that there are significant differences in values between the slices closest to the reader and the slices furthest.
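One rough way to reason about the 2D versus 3D comparison above is to estimate how much screen area each slice occupies once the pie is tilted and given a side wall. The sketch below is a simplified geometric model of my own, not how any particular charting tool renders a 3D pie: it assumes the top face is a vertically squashed circle (`tilt`), that a side wall of fixed on-screen height (`depth`) is visible only under the front half of the rim, and it ignores true perspective. The function name and default values are assumptions for illustration.

```python
import math


def visible_area(start, end, radius=1.0, tilt=0.5, depth=0.2):
    """Approximate on-screen area of one slice of a tilted, extruded pie.

    start, end: slice boundaries in radians, 0 <= start < end <= 2*pi,
                measured counter-clockwise; angles between pi and 2*pi
                lie on the front half, facing the viewer.
    tilt:       vertical squash factor of the top face (1.0 = flat 2D pie).
    depth:      on-screen height of the side wall.
    """
    # Top face: squashing a circle scales every sector's area by the same
    # factor, so equal slices keep equal top-face areas.
    top = 0.5 * radius * radius * tilt * (end - start)

    # Side wall: visible only below the front rim (pi .. 2*pi). Its area is
    # the wall height times the horizontal width of the slice's front arc.
    front_start = max(start, math.pi)
    front_end = min(end, 2 * math.pi)
    wall = 0.0
    if front_end > front_start:
        wall = depth * radius * (math.cos(front_end) - math.cos(front_start))
    return top + wall


slice_angle = 0.30 * 2 * math.pi  # a 30% slice, as in the blue/red example

back = visible_area(math.pi / 2 - slice_angle / 2, math.pi / 2 + slice_angle / 2)
front = visible_area(3 * math.pi / 2 - slice_angle / 2, 3 * math.pi / 2 + slice_angle / 2)

print(f"back slice:  {back:.3f}")
print(f"front slice: {front:.3f}")
print(f"front / back: {front / back:.2f}")
```

Under these assumptions, the two equal slices have identical top-face areas, and the front slice gains its extra on-screen area entirely from the side wall. The model also suggests why the effect mattered little in the real-life example: when the slice values already differ greatly, the extra wall area does not change which slice looks larger.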
Yet, somehow, despite the comparison that can be done with the pie charts, they seem redundant. The text boxes below the charts provide a better method of communicating the message to the audience. If we remove the text boxes, would that help?
No, it would not appear so. The message that the chart maker wanted to convey would be lost without the text boxes. Let us try to enhance the pie charts by using several methods as below.
Method 1 does little to distinguish the relevant categories beyond making the reader aware of them in some manner. Methods 2 and 3 both highlight the relevant categories better, but they also have certain weaknesses. Method 2 ignores the non-relevant categories by lumping them together in the same colour, so identifying these categories may prove difficult. Method 3 distinguishes each category but in so doing diminishes the visibility of the relevant ones. In all three methods, although the categories that we want to highlight can now be distinguished more clearly, a reader would still have difficulty comparing the values in the charts. A reader would need to look up the slice colour of a category in the legend and then seek its value in the chart. The eye has to travel back and forth between the chart and the legend for each category. It can be tiring. Furthermore, you will note that the categories listed in the text boxes and in the charts are not exactly the same. In 2007, for example, ‘transmission capacity’ in the text box is labelled only as ‘capacity’ in the chart. There are several other labelling errors of this kind in the visual which you may have noticed yourself. Even though Methods 2 and 3 meet the visual requirement better, they still do not convey the chart maker’s message as clearly as possible.
The key to an ideal chart here is to combine the clear message highlighted in the text boxes along with the ability to compare values graphically using a chart. I thought of using two small multiple bar charts for this purpose, but I quickly ran into a problem. The two charts do not use the exact same categories. There are only three categories that appear in both 2007 and 2008, namely ‘port speed settings’, ‘microwave quality’, and ‘transmission capacity’. Category ‘hardware faulty’ in 2007 was split into three categories in 2008, namely ‘mw equipment’, ‘cable’, and ‘node B’. All remaining categories are unique for their own year. Combining the categories from both years as uniquely as I could, below is the result.
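The combining step described above can be sketched as a union of the two category sets. In the data below, only the values for ‘port speed settings’ and ‘transmission capacity’ come from this post; every other number is a placeholder I made up, and the lists are deliberately partial. `align_categories` is a hypothetical helper name.

```python
# Percent of packet-loss causes per year (partial lists).
# Only "port speed settings" and "transmission capacity" use values quoted
# in the post; all other numbers are made-up placeholders.
causes_2007 = {
    "port speed settings": 61.4,
    "hardware faulty": 12.0,       # placeholder
    "microwave quality": 9.5,      # placeholder
    "transmission capacity": 7.9,
}
causes_2008 = {
    "transmission capacity": 24.0,
    "mw equipment": 20.0,          # placeholder
    "microwave quality": 16.0,     # placeholder
    "port speed settings": 14.0,
    "cable": 10.0,                 # placeholder
    "node B": 9.0,                 # placeholder
}


def align_categories(first, second):
    """Return rows of (category, first_value, second_value).

    Rows cover the union of both category sets in first-seen order.
    None marks a category that does not exist in that year, which is
    not the same thing as a value of zero.
    """
    categories = list(first) + [c for c in second if c not in first]
    return [(c, first.get(c), second.get(c)) for c in categories]


for category, v2007, v2008 in align_categories(causes_2007, causes_2008):
    left = "n/a" if v2007 is None else f"{v2007:.1f}%"
    right = "n/a" if v2008 is None else f"{v2008:.1f}%"
    print(f"{category:<22} {left:>7} {right:>7}")
```

Keeping `None` for absent categories, rather than filling in zero, avoids suggesting that a cause produced no packet loss in a year in which it simply was not tracked under that name.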
The intended message appears to be jumbled up in this chart. In 2007, a reader can easily identify the causes in their ascending order, but that is not easily done for 2008 data. The intended message is somehow lost in this chart. A reader would notice that ‘port speed settings’ was the top cause in 2007, but identifying it as the fourth cause in 2008 would prove difficult. An interactive element can be added to assist the chart further, such as demonstrated in the visual below.
The visual provides sorting options by year so that readers can easily sort the causes in ascending order for the respective year. This seems to be an acceptable presentation if the reader is able to interact with the chart. It highlights the top causes from each year and allows for easy sorting with the click of a button. However, I am still not entirely satisfied. The chart maker’s intention was to show that the top cause of each year became (or, in the case of ‘transmission capacity’, used to be) the fourth-largest cause in the other year. Although this can be derived from the interactive visual, a reader would still need to sort the data and figure it out by himself. Worse, an interactive chart may not be the best medium of presentation. Charts often need to be printed, which negates the advantages of interactivity. Charts may also need to be shown to an audience in a setting where the interactive element cannot be used. For these reasons, it is better to have a chart which portrays the message statically, without interaction. The result that I am satisfied with is shown in the visual below.
The most distinctive part of the chart is the display of two inverse bar charts. Some proponents of data visualization may disagree with my choice to display the data in this manner, yet I believe that this chart best conveys the message that the chart maker intended. The ordering of causes from top to bottom in both years is easily identifiable at a glance. The inverse bar chart for 2008 is needed in order to display the two different category sets and still compare both years easily. This chart shows the category data more clearly than the interactive chart, which leaves a number of unnecessary zero values. A reader may mistake a zero value to mean that a category caused no packet loss, when in fact that was not the case. Furthermore, the categories other than ‘port speed settings’ and ‘transmission capacity’ are not critical to the message, so they do not require as much attention as they were given in the interactive chart. Lastly, the two coloured arrows from 2007 to 2008 drive home the intended message: ‘port speed settings’, which was by far the top cause of packet loss in 2007, had dropped to number four in 2008, while ‘transmission capacity’, which was number four in 2007, had become the top cause in 2008, albeit at a still manageable 24% of the overall causes.
I hope this example has shown you the importance of choosing the correct medium of visual display when communicating a chart’s message. Rather than showing you my final choice alone, I have walked you through my thought process as I determined which visual display best communicates the original message of the chart. I did so to demonstrate that a lot of thought can go into chart-making, and that making something which looks simple on the surface, such as a clear visual, can actually be quite complex. Simplicity does not mean simple-minded.
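The rank changes and the back-to-back layout described above can be approximated in plain text. This is a sketch under assumptions: I am reading ‘inverse bar charts’ as two bar charts mirrored around a central axis, each sorted by its own year. Only the values for ‘port speed settings’ and ‘transmission capacity’ are taken from this post, the remaining numbers are made-up placeholders, and `ranks` and `mirrored_rows` are hypothetical helper names.

```python
# Partial, illustrative data: only "port speed settings" and
# "transmission capacity" use values quoted in the post.
causes_2007 = {
    "port speed settings": 61.4,
    "hardware faulty": 12.0,       # placeholder
    "microwave quality": 9.5,      # placeholder
    "transmission capacity": 7.9,
}
causes_2008 = {
    "transmission capacity": 24.0,
    "mw equipment": 20.0,          # placeholder
    "microwave quality": 16.0,     # placeholder
    "port speed settings": 14.0,
    "cable": 10.0,                 # placeholder
    "node B": 9.0,                 # placeholder
}


def ranks(values):
    """Map each category to its rank, with 1 being the largest value."""
    ordered = sorted(values, key=values.get, reverse=True)
    return {name: position for position, name in enumerate(ordered, start=1)}


def mirrored_rows(left, right, width=30):
    """Pair the two years rank by rank around a central axis.

    Left bars grow leftwards from the centre, right bars grow rightwards,
    and each side is sorted by its own year's values.
    """
    scale = width / max(max(left.values()), max(right.values()))
    left_sorted = sorted(left.items(), key=lambda kv: kv[1], reverse=True)
    right_sorted = sorted(right.items(), key=lambda kv: kv[1], reverse=True)
    rows = []
    for i in range(max(len(left_sorted), len(right_sorted))):
        # A side with fewer categories gets blank rows, not zero-valued labels.
        l_name, l_val = left_sorted[i] if i < len(left_sorted) else ("", 0.0)
        r_name, r_val = right_sorted[i] if i < len(right_sorted) else ("", 0.0)
        l_bar = ("#" * round(l_val * scale)).rjust(width)
        r_bar = ("#" * round(r_val * scale)).ljust(width)
        rows.append(f"{l_name:>22} {l_bar}|{r_bar} {r_name}")
    return rows


rank_2007 = ranks(causes_2007)
rank_2008 = ranks(causes_2008)
for cause in ("port speed settings", "transmission capacity"):
    print(f"{cause}: rank {rank_2007[cause]} in 2007 -> rank {rank_2008[cause]} in 2008")

for row in mirrored_rows(causes_2007, causes_2008):
    print(row)
```

Because each side is sorted independently, a category that exists in only one year never appears as a zero-length bar in the other, and the two rank changes are the only cross-year links a reader has to follow.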