Battle of the Tipsters: Post-match analysis

I recently filled out a form putting myself forward to participate in the Battle of the Tipsters at April’s #LondonTUG. I wasn’t too worried. After all, I didn’t expect to be selected.

Imagine my surprise a couple of weeks later when David Pires asked me whether I’d like to take up the challenge. I’d filled out a form. How could I now say no?

That’s when the fear kicked in. I’d be up against Lorna Eden. She of Data School fame, herself a champion, not once but twice!

Now I love faffing around in Tableau as much as the next person, assuming I’m sat next to someone who loves faffing around in Tableau. But now I had to come up with five tips. Five tips that stood a chance of beating Lorna’s. But also five tips that would pass muster when presented to a friendly crowd of Tableau Rockstars, future and present.

Were I to lose, I couldn’t go out with a whimper.

So I got my thinking cap on. And I came up with a list. It wasn’t a shortlist, as that would suggest I had drawers full of tips ready to bring out at the drop of a hat. No, it was simply a list. A list of things that most people would know how to do, but that hopefully few people would have thought of doing.

I put some life to my tips by giving them musical names:

  • S’Express: Making your tooltips grammatically correct with a “pluraliser” field
  • Goodie Two Choose: Limiting the selections available within a field
  • Let’s Do The Time Warp Again: Letting the user select the time unit used for reporting
  • Break On Through (To The Other Side): Sorting a field but forcing the “Other” category to the end
  • Total Eclipse of the Chart: Putting summary totals onto a stacked bar chart.

(Videos of each tip can be accessed at the bottom of this post.)

I even created my own UK version of the Superstore data, featuring Asda, Lidl and Waitrose, to make the tips a little more accessible to my local audience.

Then the day itself came. And I was a little nervous, but I quelled the nerves with a beer downstairs beforehand with Ravi Mistry and Mark Edwards.

And then the fun began. And I have to say, the experience was somewhat of a blur. And not because of the beer. For me, presenting the tips required quite deep levels of concentration. At home you can do them in your sleep, and a slip-up is unimportant. Presenting to the London TUG, my main worry was that a slip-up would unravel the tip, making me hit the 1m30s time limit with nothing to show but groans and regrets. My step-by-step instructions sheet was my lifeline.

Lorna’s tips were impressive, but were similarly blurry, as I composed myself for the next round. After each pair of tips, there was audience participation: a cheer and a whoop for each of us to determine that round’s victor. As I was positioned off to one side of the room, it was difficult to tell who won each round, so I had no idea how we stood come the end. As I understand it after the event, it was pretty much a draw and it came down to a tweet-off, which Lorna narrowly won. That said, I think I remember speaking to Liam Butcher from Bose over a drink after the event. Perhaps he could commission some more calibrated and accurate whoop-analysis?

All in all, it was great fun and I pushed my comfort zone. There’s even talk of me figuratively wearing Lorna’s crown to the next London TUG as she may be retiring undefeated. To do that, I’d need five more tips. Yikes!

 

 

 

 

Allowing the user to select what to report

Of all of the things I enjoy doing in Tableau, perhaps my favourite is allowing the user to choose what information they want to see.

Do they want to see a report showing the number of sales? Or do they want to see the total value of those sales? Do they want to see the data listed by salesperson or by customer or by region?

This is where parameters are your friend. Perhaps even your BFF.

The solution involves two steps. First, you need to create a parameter that allows the user to select their field of choice.

So first, the parameter. Create a string parameter that allows the user to select which parameter they want to see in the view.

Parameter

Here, I’ve create a parameter that allows the values “Number of invoices” or “Value of invoices”.

Next, I need to create a calculated field within my data set that uses that parameter to determine its value.

Equation

 

So this is simply using a CASE statement against our new parameter to figure out how to populate the new field “Reporting field”.

Now, you just need to use the new field in your views and make the parameter selectable by the user. When the user changes the parameter, your views will update automatically.

An identical approach can be used for alphanumeric fields. Instead of populating with the reporting field with the number of invoices or the invoice value, you could choose to make it a string field and populate it with either region or salesperson. And you’d change the permitted values of your parameter accordingly.

One limitation of this approach is that the display formatting of the two fields cannot be different. For example, I can’t have the number of invoices display as a number (1,234) and the value of the invoices display as a high currency amount (e.g. £1,234m).

You can do this in labels and tooltips via an additional calculated field:

IF [What to report] = “Value of invoices” THEN “£” ELSE “” END

Add this new field into your label or tooltip immediately after the “Reporting field” value, and it’ll slap an “m” after the number whenever the invoice value is chosen by the user.

To get the axes to be formatted differently based on the user’s selection, you instead need two completely separate worksheets, identical apart from the value that they report. A Display toggle field will then allow you to choose which one to show in your dashboard.

More on that another approach time.

Sorting by a field in a secondary data set

One of my biggest frustrations with Tableau is the limited facility that comes when linking to a secondary data set. One of these is the inability to sort fields from the secondary data source.

Maybe I have my invoiced amounts in my primary data source by relationship reference, and a secondary data source that tells me the salesperson for each relationship reference.

And I want to report the total amount invoiced by salesperson.

I can do that quite simply. I drag the invoice amount into the body of the view from the primary data source. I then flip to the secondary data source, click the chain link to make the data sets linked, and drag the salesperson into my rows.

Now if salesperson had been in the primary data source, the sort buttons in the toolbar would be clickable. Or if you clicked the dropdown arrow on the Salesperson pill in the rows, a Sort option would have been available. We could have sorted the view by amount, showing the salesperson with the highest total invoice amount at the top.

But because the field has been brought from the secondary data set, we’re presented with this.

Grey sort

And this.

Salesperson

And so we can’t sort.

Although perhaps we can, with a little hack.

If you want to sort in descending order, then create a calculated field in your primary data source that which is simply the negative of the field being reported

Equation

Now drag that field into the rows next to your Salesperson. As it is a measure, it’ll be treated as being continuous and will be green, and will only be allowed to the right of Salesperson. And because it’s coming from a different data source, it’ll appear as an aggregate field.

Now click the new pill’s dropdown and change it to discrete. It’ll turn blue, and you can now reorder the two row fields so that the aggregate variable appears first.

Rows

And now, because you have a field from the primary data source in your rows, you can click the dropdown on the AGG([Inv Amt (neg)]) field and you’ll see a Sort menu option.

Click this and you’re presented with only two options. You can either sort by the data source or you can manually configure the sort order based on the values you see. Most of the time you’ll want to sort by data source. (It only allows low to high sorting here, not high to low. Hence our need to create a negative version of our variable.)

So by sorting in ascending order by a negative field, it’s the equivalent of sorting a positive field in descending order.

Your first column headings will be the negative versions of the values in the table. Now you’ll probably want to hide this first column, so you can simply make the text white and narrow the column to its narrowest possible width.

And there you have it.

Note: if there’s any likelihood that two values in the table will be the same, you’ll need to make them subtly different. I often do this by adding a very small but variable to the calculated field that you just created. I often do this by finding a couple of numbers that are generally quite large and fluid in the data sets and dividing 1 by their product:

1/([Sales volume] * [Outstanding amount]

If you add this to the calculated field that you calculated earlier, then the two figures will be different by a fraction of a penny, and they’ll appear as separate ungrouped items in the table.

The GCloud Dashboard

Far and away my most popular dashboard is my GCloud dashboard, with over 4,000 clicks to date. It’s gone through various iterations over the last 18 months or so, but its primary offering has pretty much settled down into something that I think is flexible and summarises quite well the underlying data. It can be found here.

The GCloud framework was set up by government in 2012 as a way of disaggregating its IT provision. Its intent was to open up provision to a wider set of small and medium-sized enterprises, and become less reliant on the traditional big IT suppliers.

And as its name suggests, its intent was to bring about a much more cloud-based approach to IT provision.

The underlying data is published monthly (usually on around the 20-something of the month for the prior month) and can be found here.

The dashboard itself uses the feature that allows the user to choose their time unit for reporting. Otherwise, it’s a relatively basic presentation of the data. Perhaps it could do with another overhaul. But even so, I think it shows how GCloud has evolved, and gives the user the flexibility to drill down on any aspect.

My friend Alan posted earlier today his interpretation of the data here, which is well worth a read. It’s lovely to see someone presenting commentary based on cuts of data drawn from your own dashboard.

DATETRUNC by decade

I recently wrote a post describing how to allow users to decide the level at which their time-based data was displayed: year, quarter, month, week, day etc.

The solution uses the DATETRUNC function. However one weakness of this is that it does not allow decade-based reporting. The DATETRUNC function only goes as high as the YEAR unit.

This evening, I had a desire to allow the user to report at either year or decade level. So I set out to tackle the problem.

In the parameter field (described in the aforementioned post), I added the ability to select Decade. And I used the following formula to define the reporting field:

if [Date unit] = “decade” then str(year([Date])-(year([Date]) % 10))+”s”
elseif [Date unit] = “year” then str(year([Date]))
end

The initial clause is invoked when when Decade is selected. The last part, featuring the % symbol, is the equivalent of the MOD function in Excel. It divides the year of the date by 10 and brings back the remainder.

So if you have a date in the year 1987, it’ll do the calculation 1987 – 7, bringing back 1980. The “s” is strung onto the end to make “1980s.

If you need the field to appear as and be treated as a date, then things get a little more tricky. Unfortunately, you can’t use the DATETRUNC feature against the parameter, as whenever Decade is selected, it’ll break, even if that clause isn’t being invoked. (Rather annoying.) And so you’ll need to use an IF THEN ELSEIF to populate based on each of the parameter’s values:

IF [parameter] = “Week” then DATETRUNC(‘week’,[Date])

ELSEIF [parameter] = “Month” then DATETRUNC(‘month’,[Date])

ELSEIF [parameter] = “Year” then DATETRUNC(‘year’,[Date])

ELSEIF [parameter] = “Decade” then date(“01/01/”+str(year([Date])-(year([Date]) % 10)))

END

Not the most elegant. But it should do the trick.