CNTD and Running Totals: A word of caution

I have a business operation in which people call my call centre. The same person can call the call centre more than once, but we can identify the caller from their details using a field called Caller ID.

I need to know how many unique people called the call centre each month (based on Call Date And Time), but I wanted the cumulative number at the end. I.e. What is the total number of people that have called my call centre.

I tried to do this using probably the most obvious way:

  • Drag Call Date And Time to the columns, and select Month. It appears as a blue pill
  • Right-click and drag Caller ID to the rows and select CNTD (count distinct). It appears as a green pill
  • Apply a quick table calculation to the Caller ID pill, making this a Running Sum.

The line goes up nicely, and if I use the same running sum as a label, I can see that the total number of callers to the call centre at the end of the last month is 34,567.

Except that it’s not. It’s somewhat lower than that.

You see, the 34,567 that appears against the last month is the total number of unique callers in month 1, plus the total number of unique callers in month 2, plus … plus the total number of unique callers in the final month.

If Jane Bloggs called up in month 1 and again in month 2, she contributes twice to the 34,567 total. In other words, Tableau sums the distinct counts for each month rather than providing a total unique count to date.

To get around this, we need to create an LOD calculation to create a new field called First Call:

IF {FIXED [Caller Id] : MIN([Call Date And Time])} = [Call Date And Time] THEN 1

This basically marks the first call for each Caller ID with a 1.

Now, instead of using the CNTD(Caller ID) in our rows, we can use SUM(First Call). Again we can apply a table calculation to give ourselves the Running Sum.

This time, the total will be lower than the 34,567, and will reflect the true number of unique callers. And each month’s total will only count the *new* folks that called during that month.

Shipping containers: How and why

A few hours ago, Andy Kriebel posted his latest #MakeoverMonday dataset, information about the largest shipping companies in the world. I expect this was off the back of the bankruptcy of Hanjin, a situation that leaves over 500,000 containers stranded at sea.

I was in need of cheering up. Big time. And so I reached to Tableau a couple of hours ago to see what could be done. My mood was one seeking fun over integrity, and the result clearly showed this.

My first thought was to create some form of bar chart where the bars were containers. After a brief foray into this visualisation, it was clear that this would look rubbish.

I clicked on the dropdown in the Marks card to see what else might be of use. I saw the “Square” option, and got to work.

You see, shipping containers are rectangular in shape. (Technically, the Square mark should be called Rectangle. But that’s another post.) I imagined my visualisation instantly, and got on with creating it.

You can find the visualisation at the bottom of this post. Or else, click here.

There are three images in the background, which in hindsight could have been combined into a single image. (My image manipulation skills are woefully lacking, as evidenced by the use of PowerPoint as my image-editing-tool of choice.)

There is a PNG of the boat. There’s another PNG of the background (sea and sky). And at the top is a single-colour bar to go behind the parameters. This ensures that the shading of the parameters doesn’t look odd atop a vari-shade background.

Each of these is floating, which is quite apt given the subject matter. The boat is literally floating atop the water, in both senses.

And the visualisation itself is as simple as they get. A “Squares” chart, the size of each square driven by the quantity being displayed. A parameter allows the user to select what to chart; a calculated field uses this parameter to drive its own value:

CASE [Chart by]
WHEN “Cargo capacity (total TEU)” then [Total TEU]
WHEN “Cargo capacity (chartered TEU)” then [Chartered TEU]
WHEN “Cargo capacity (owned TEU)” then [Owned TEU]
WHEN “Ships (total)” then [Total Ships]
WHEN “Ships (chartered)” then [Chartered Ships]
WHEN “Ships (owned)” then [Owned Ships]

A second parameter allows the user to select how many companies to display, and this drives a “Top X” filter on the Company field.

Like the images, the worksheet floats, and it sits in front of all the rest to ensure that it hides the background. Placing it just above the ship gives the impression of shipping containers. Possibly.

I had a lot of fun with this one. The ability to select the Top X companies makes the visualisation statistically woeful. (The area chart should be a percentage of the total, not of some arbitrary subset.) I knew this, but this was kindly reinforced by Andy Cotgreave 😉

The Tableau community cheered me up no end this evening, something I so dearly needed. Nineteen likes on Twitter gave me a much needed boost. Thank you all! Hope you like it.

Adding the polish

At the weekend, I suffered a serious bout of viz envy. It came at the hands of Jeremy Poole in the delightful viz below, an early response to this week’s Makeover Monday. I’ve embedded it at the end of this post.

The visualisation itself is delightful. I would like to think I’d have come up with a similar concept in terms of its visualisation. (I’d already started putting together a graph on similar lines when I saw his viz.) But beyond this, what strikes me most is the way in which the entire view sits together so beautifully as a single body of work.

The fonts, the callouts, the hover actions, the backdrop, they are all beautiful.

As I develop my Tableau knowledge, I think I’m plugging the gaps in terms of my ability to create evermore involved dashboards that add richness to information. I look back on old dashboards with a slight sense of shame, which I think is a good thing. If I didn’t, I wouldn’t be learning and progressing.

However my prowess when it comes to this level of polishing needs work. I think my brother inherited the vast majority of the family’s creative genes. I am, I feel, more methodical, more analytical and certainly less creative.

With visualisations, I think I do well in getting the right objects on the page, and for them to interact well with one another. However I lack in making them look beautiful, making them sing. I would love to learn how to become better in this.

If anyone knows of any resources that might be able to help in this, please let me know. I can see what’s good and what’s not, so it’s not about learning how to spot goodness. It’s about learning the principles that will allow me to create things that are good in the first place.

St. Totteringham’s Day

Yesterday, Arsenal’s ten men came back to draw against Spurs in arguable the most important North London derby in living memory. Before the match my good friend, who is a Leicester fan, suggested an Arsenal victory would have been an ideal result, as it would have left Leicester three points clear of both North London teams, with a game in hand. I argued that a draw was ideal, as it left fewer serious competitors for the title race.

In looking into the numbers, I discovered the concept of St. Totteringham’s Day. This is the day upon which Tottenham mathematically are unable to beat Arsenal in the Premier League. Given that Arsenal have beaten Spurs for the last 20 seasons, there has been such a day in each of those years.

In these years, the earliest it took place was in 2007/08, when Spurs were unable to surpass their rivals on 8 March 2008. (Spurs finished 11th in the table, to Arsenal’s third.) The latest was on 19 May 2013, the last day of the 2012/13 season, when Arsenal secured third spot, one point ahead of fourth-place Spurs.

Here’s my visualisation of the history. I wonder whether there’ll be such a day in this 2015/16 season.

Creating aliases without breaking your blends

Aliases are lovely. They allow you to see what I’ve always termed “friendly names” for field values as opposed to codes and the like. You may have legal entities that are referenced as numbers in the underlying data but you want to display as the names of those entities to the users. You may have state abbreviations that you want to display in full, or a handful of currency codes that you want to show as symbols or full currency names.

I learnt the hard way that creating aliases can be dangerous. Originally, I had no idea that when blending, Tableau uses the alias values as opposed to the raw values coming in on the data feeds. And I learnt that the hard way.

Let’s say my primary data source contains currency codes (EUR, GBP, USD etc.) and this is blended to a secondary data source that contains FX rates that also use those codes. If I add aliases ($, £, €) to the currency codes in the primary data source, then my blend breaks. (I could fix it by applying the same aliases to the secondary data source. But this is not always efficient or wanted.)

To me, this is undesired behaviour. I see aliases as being purely presentational. If I am using a field in two data sources to blend, I want Tableau to respect the raw value rather than some name I’ve chosen to to display to the user.

There is, however, a simple way round the problem: Duplicate your field.

Use the duplicate field to blend on, and apply your aliases to the original field. That way, the integrity of the original field’s name remains intact, and its aliases can be used in presenting the information to the user.

Allow the user to select which chart to see

I’ve often been asked to produce dashboards whose views change fundamentally based on the choices of the viewer.

For example, a user may want to select whether to show a line chart or a bar chart; or they may want to choose which secondary data source to use as the basis for reporting.

Often, such user requirements can be accommodated by being creative with your worksheet designs. And some of the posts here may help with some of that artistry. But sometimes, you may need to resort to presenting different worksheets depending on the user’s selection

This can be done with a few steps.

First, create two worksheets, one for each of the visualisations that the audience may wish to see: Chart 1 and Chart 2, if you will. Once you’re happy with these views, create a string parameter that allows the user to select which chart to show. And give it a list of values that it can take:


And now, create two new calculated fields in the primary data source being used in the two worksheets: Chart 1 selected and Chart 2 selected. The fields will be boolean, giving a TRUE/FALSE value based on whether the appropriate parameter value has been selected.

Chart equation

Now you can use this field as a filter in Chart 1, with a value of TRUE. This will only show the chart if the value of the boolean variable is TRUE, or else if the parameter has been set to Chart 1. (You cannot use a parameter directly as a filter, hence this interim step.)

If you play with the value of the parameter, you’ll see the content of the worksheet appear and disappear. Repeat the above steps for the second chart, and the two worksheets will work in tandem with one another.

Next up, we need to get those charts into a dashboard.

First, they need to sit together in their own container, so you can select either a vertical or horizontal one, either is fine.


And now you can drag each of the worksheets into that container. Click on the dropdown at the top right corner of either chart to show the parameter control. By playing with the parameter value, you’ll see the respective charts appearing and disappearing.

Now unfortunately, the title of a worksheet appears regardless of whether the content of that worksheet is present, so you’ll need to hide the title on both sheets to ensure that the hidden worksheet fully disappears. If you want a title that’s consistent for the two views, then you can put this in as a Text component (see the above selector) above the container you’ve just created. If you want one that varies depending on the chart chosen, you can use a similar technique as this one to create automatically changing titles that sit in their own container.

And there you have it. The user can select a parameter value and the chart that they’re presented with will change in front of their eyes.

Allowing the user to select what to report

Of all of the things I enjoy doing in Tableau, perhaps my favourite is allowing the user to choose what information they want to see.

Do they want to see a report showing the number of sales? Or do they want to see the total value of those sales? Do they want to see the data listed by salesperson or by customer or by region?

This is where parameters are your friend. Perhaps even your BFF.

The solution involves two steps. First, you need to create a parameter that allows the user to select their field of choice.

So first, the parameter. Create a string parameter that allows the user to select which parameter they want to see in the view.


Here, I’ve create a parameter that allows the values “Number of invoices” or “Value of invoices”.

Next, I need to create a calculated field within my data set that uses that parameter to determine its value.



So this is simply using a CASE statement against our new parameter to figure out how to populate the new field “Reporting field”.

Now, you just need to use the new field in your views and make the parameter selectable by the user. When the user changes the parameter, your views will update automatically.

An identical approach can be used for alphanumeric fields. Instead of populating with the reporting field with the number of invoices or the invoice value, you could choose to make it a string field and populate it with either region or salesperson. And you’d change the permitted values of your parameter accordingly.

One limitation of this approach is that the display formatting of the two fields cannot be different. For example, I can’t have the number of invoices display as a number (1,234) and the value of the invoices display as a high currency amount (e.g. £1,234m).

You can do this in labels and tooltips via an additional calculated field:

IF [What to report] = “Value of invoices” THEN “£” ELSE “” END

Add this new field into your label or tooltip immediately after the “Reporting field” value, and it’ll slap an “m” after the number whenever the invoice value is chosen by the user.

To get the axes to be formatted differently based on the user’s selection, you instead need two completely separate worksheets, identical apart from the value that they report. A Display toggle field will then allow you to choose which one to show in your dashboard.

More on that another approach time.

Sorting by a field in a secondary data set

One of my biggest frustrations with Tableau is the limited facility that comes when linking to a secondary data set. One of these is the inability to sort fields from the secondary data source.

Maybe I have my invoiced amounts in my primary data source by relationship reference, and a secondary data source that tells me the salesperson for each relationship reference.

And I want to report the total amount invoiced by salesperson.

I can do that quite simply. I drag the invoice amount into the body of the view from the primary data source. I then flip to the secondary data source, click the chain link to make the data sets linked, and drag the salesperson into my rows.

Now if salesperson had been in the primary data source, the sort buttons in the toolbar would be clickable. Or if you clicked the dropdown arrow on the Salesperson pill in the rows, a Sort option would have been available. We could have sorted the view by amount, showing the salesperson with the highest total invoice amount at the top.

But because the field has been brought from the secondary data set, we’re presented with this.

Grey sort

And this.


And so we can’t sort.

Although perhaps we can, with a little hack.

If you want to sort in descending order, then create a calculated field in your primary data source that which is simply the negative of the field being reported


Now drag that field into the rows next to your Salesperson. As it is a measure, it’ll be treated as being continuous and will be green, and will only be allowed to the right of Salesperson. And because it’s coming from a different data source, it’ll appear as an aggregate field.

Now click the new pill’s dropdown and change it to discrete. It’ll turn blue, and you can now reorder the two row fields so that the aggregate variable appears first.


And now, because you have a field from the primary data source in your rows, you can click the dropdown on the AGG([Inv Amt (neg)]) field and you’ll see a Sort menu option.

Click this and you’re presented with only two options. You can either sort by the data source or you can manually configure the sort order based on the values you see. Most of the time you’ll want to sort by data source. (It only allows low to high sorting here, not high to low. Hence our need to create a negative version of our variable.)

So by sorting in ascending order by a negative field, it’s the equivalent of sorting a positive field in descending order.

Your first column headings will be the negative versions of the values in the table. Now you’ll probably want to hide this first column, so you can simply make the text white and narrow the column to its narrowest possible width.

And there you have it.

Note: if there’s any likelihood that two values in the table will be the same, you’ll need to make them subtly different. I often do this by adding a very small but variable to the calculated field that you just created. I often do this by finding a couple of numbers that are generally quite large and fluid in the data sets and dividing 1 by their product:

1/([Sales volume] * [Outstanding amount]

If you add this to the calculated field that you calculated earlier, then the two figures will be different by a fraction of a penny, and they’ll appear as separate ungrouped items in the table.

The GCloud Dashboard

Far and away my most popular dashboard is my GCloud dashboard, with over 4,000 clicks to date. It’s gone through various iterations over the last 18 months or so, but its primary offering has pretty much settled down into something that I think is flexible and summarises quite well the underlying data. It can be found here.

The GCloud framework was set up by government in 2012 as a way of disaggregating its IT provision. Its intent was to open up provision to a wider set of small and medium-sized enterprises, and become less reliant on the traditional big IT suppliers.

And as its name suggests, its intent was to bring about a much more cloud-based approach to IT provision.

The underlying data is published monthly (usually on around the 20-something of the month for the prior month) and can be found here.

The dashboard itself uses the feature that allows the user to choose their time unit for reporting. Otherwise, it’s a relatively basic presentation of the data. Perhaps it could do with another overhaul. But even so, I think it shows how GCloud has evolved, and gives the user the flexibility to drill down on any aspect.

My friend Alan posted earlier today his interpretation of the data here, which is well worth a read. It’s lovely to see someone presenting commentary based on cuts of data drawn from your own dashboard.

How this monkey discovered Tableau

I started working part-time as a contractor for a small public service company associated with the telecoms industry in January 2013. My role was to lead on all aspects of reporting – internal reporting to understand the overall health of our operations; and external reporting to our stakeholders and shareholders against formally measured KPIs and SLAs.

At first, I cobbled together some stuff in the world of Excel that I know and love. But it was clear that something more strategic would be needed to allow more dynamic, automated and strategic reporting. So we conducted an evaluation of three packages that at the time were all unknown to me: Tableau, Qlikview and MicroStrategy.

Tableau came out on top against our requirements, and so we bought it in May 2013.

Perhaps the most grating aspect of the product was its use of the terms rows and columns. In the spreadsheet world whence I’d come, these terms had a very different meaning to their meaning in the new Tableau world. (In some respects, “across” and “up” might be more suitable terms. Maybe that’s for another post.)

We had two days of dedicated training from the inimitable Emiliano. This was useful to a degree, but in some respects things hadn’t yet gelled with the product. As a newbie, the package was rather daunting, and perhaps some of the training was too advanced for us at the time, coming so soon after we had started using the package.

That said, if nothing else it gave us a huge insight into what the tool was capable of. We went live on Tableau 8.0 and went through a couple of minor upgrades before upgrading to version 9.0 in summer 2015. (Note to self: Must get round to upgrading to 9.2.)

I was doing a couple of days a week with this client, the other three days of my week being spent doing business analysis for a large bank: specifying data feeds into a billing system, and managing various migrations onto the new system.

But about a year ago, I heard that the bank was using Tableau for reporting, and so I threw my hat into the ring and developed a very rudimentary dashboard to show what might be possible to report on their business area. (My having now used the product for 18 months, I think they liked what they saw.)

And the rest is history. Nowadays, much of the time I spend with the bank is focused on developing and enhancing dashboards, introducing new elements, pushing the boundaries (at least my boundaries) of the software. I find it thoroughly rewarding. And I continue to use Tableau daily on the Telecoms contract, enhancing the existing offering andintroducing new stuff. Having recently implemented Alteryx there, I’m keen to use this to enhance the way in which data is presented to Tableau to make things more efficient and to take the offering to the next level.

For the last 27 years (yes, literally), whenever I’ve received some data, my automatic instinct has been to open my spreadsheet package of choice and play: Lotus 1-2-3 for five years, then a short period of Quattro Pro, and 21 years of Excel. Nowadays, when I receive some structured data, I invariably throw it into Tableau and try to make it sing.

My view is that working in Tableau with two clients – operating in completely different fields – makes me better. I encounter problems producing something for the bank that I learn how to solve, and these then benefit my telecoms client; and vice versa.

As for Tableau, I adore the software. I adore the company and its ethos. And I adore its community. And long may I continue to use it.