Typical Analytical

Tableau + A Little SQL Goes a Long Way

If you are not taking advantage of the ability to manipulate data outside of Tableau, you might be missing out on opportunities you never thought possible. And you might be surprised by how many things you can do with SQL and, especially, by how easy some common tasks are to do with SQL.

And with powerful and reliable open source solutions, like PostgreSQL, there are no barriers to get up and running with the power of SQL.

In minutes, you can be using SQL to assemble, aggregate, reshape and pre-process data outside of Tableau.

Why would you want to do this?

The simplest answer is that some things are easier and more efficient to do outside of Tableau.  The best way to explain it, though, is likely through some examples.

This visualization is a general example of something that is much easier to do when some of the calculations are performed outside Tableau:

Another example is something I do quite a bit of at work and that is multi-pass aggregation.  If you’ve ever received the “…cannot be further aggregated” error in Tableau, you know what I mean.  In addition to using table calculations and export/re-import workarounds, pre-aggregating using SQL is often the best solution to this limitation and one I use every day at work.

Another example is pre-processing string operations which can be brutal on Tableau performance.  Here is SQL code I used to pre-calculate instructional text for my Exploring Sci-Fi Movies in the IMDB 5000 Dataset application:

update movie set dir_mov_dyn_title = ‘MOVIES ‘ || upper_director_name_only || ‘ DIRECTED. CLICK ON MOVIE NAME TO OPEN IMDB. USE WATCH LIST IN IMDB:’

commit

I also used the following code to avoid having to use the costly CONTAINS function in a Tableau calculation to flag whether a movie was a science fiction movie:

create table movie_sci_fi_flag as
(
select movie.*, case when genres like ‘%Sci-Fi%’ then 1 else 0 end as sci_fi_flag2
from movie
)

Not so scary.  Right?

This one might be a little scarier but, to be honest, I just googled it and adjusted for my purposes:

— create a column to sort by in Tableau (use as leftmost column and delect Show Header); again to lighten the load in Tableau

create table movie as (select row_number() over (order by imdb_score desc) as rnum, moviebak.* from moviebak order by imdb_score desc)

My point is that SQL can dramatically expand what you are able to do in Tableau and how efficiently you are able to do it.  I’ll leave you some more code examples to give you a sense for how some basic and common things are done to hopefully impress upon you that SQL isn’t as scary as you think it is.  Go for it!!!!!!!!!

— create a table

create table movie (
director_name varchar(100), genres varchar(250), actor_1_name varchar(100), movie_title varchar(250), movie_imbd_link varchar(100), content_rating varchar(25), title_year varchar(4), imdb_score real)

— copy a table

create table moviebak as (select * from movie)

— delete a table

drop table movie

— populate table from csv

copy movie from ‘C:\Users\TypicalAnalytical\Downloads\movie_metadata.csv-movie_metadata (1) (subset of columns).csv’ delimiter ‘,’ csv header;

— export table to csv

copy movie to ‘C:\Users\TypicalAnalytical\Downloads\movie_postgresql.csv’ delimiter ‘,’ csv header;

— add a column

alter table movie add column dir_mov_dyn_title varchar(200)

alter table movie add column upper_director_name_only varchar(100)

— modify a column

alter table movie rename movie_imbd_link to movie_imdb_link

ALTER TABLE movie
ALTER COLUMN upper_director_name TYPE varchar(50);

— example one-shot string operations you can do in back end use to lighten load on Tableau

update movie set upper_director_name_only = upper(director_name)

commit

update movie set dir_mov_dyn_title = ‘MOVIES ‘ || upper_director_name_only || ‘ DIRECTED. CLICK ON MOVIE NAME TO OPEN IMDB. USE WATCH LIST IN IMDB:’

commit

update movie set upper_director_name_director = upper_director_name_only || ‘ (DIRECTOR)’

commit

— create a column to sort by in Tableau (use as leftmost column and delect Show Header); again to lighten the load in Tableau

create table movie as (select row_number() over (order by imdb_score desc) as rnum, moviebak.* from moviebak order by imdb_score desc)

— create a flag so that Tableau doesn’t need to perform string operations, like contains in the following example

alter table movie add column sci_fi_flag boolean

create table movie_sci_fi_flag as
(
select movie.*, case when genres like ‘%Sci-Fi%’ then 1 else 0 end as sci_fi_flag2
from movie
)

— create a table with a full list of individual words used in each beer name for use in word cloud, for example…

create table word_cloud as
(
SELECT regexp_split_to_table(word, ‘\s’) as new_word
FROM (
select replace (beer, ‘*’, ”) as word
from beer
) a
)