15  Introduction to Tableau Prep

NoteWhat This Chapter Covers

Data preparation, cleaning, reshaping, and combining raw data into an analysis-ready form, typically consumes 60–80% of an analyst’s time. Tableau Prep Builder is the dedicated data preparation tool in the Tableau ecosystem, providing a visual, code-free environment for transforming messy source data into clean, structured datasets ready for Tableau Desktop. In this chapter, you will learn the Tableau Prep interface, how to connect to data sources in Prep, how to understand and navigate the Flow canvas, and how to perform the most fundamental transformation operations. Chapters 16 and 17 will extend these skills into advanced cleaning, combining, and sharing workflows.

flowchart LR
    A[Input Step\nConnect to source] --> B[Clean Step\nFix, rename, filter]
    B --> C[Aggregate Step\nReduce granularity]
    C --> D[Join/Union Step\nCombine tables]
    D --> E[Pivot Step\nReshape wide/long]
    E --> F[Output Step\nSave as .hyper or published source]
    style A fill:#e3f2fd,stroke:#1976D2
    style B fill:#fff9c4,stroke:#F9A825
    style C fill:#f3e5f5,stroke:#7B1FA2
    style D fill:#fce4ec,stroke:#C62828
    style E fill:#e0f7fa,stroke:#0097A7
    style F fill:#e8f5e9,stroke:#388E3C


15.1 What Is Tableau Prep Builder?

NoteTableau Prep Builder: Visual Data Preparation

Tableau Prep Builder is a standalone application (included in the Creator licence) designed specifically for data wrangling. Unlike Tableau Desktop, which is optimised for analysis and visualisation, Prep is optimised for transformation: cleaning dirty data, reshaping tables, combining multiple data sources, and outputting clean datasets.

Why Prep instead of just fixing the data in SQL or Excel? - Transparency, The flow canvas shows every transformation visually, making the cleaning logic auditable and reproducible. - No code required, Business analysts without SQL skills can perform complex transformations. - Automation, Flows can be scheduled to run automatically on Tableau Server via Prep Conductor. - Integration, Output flows directly to .hyper extracts or published data sources, seamlessly feeding Tableau Desktop. - Data profiling, Prep shows a real-time data profile (value distributions, null counts, and anomaly flags) for every field at every step, which SQL and Excel do not.

NoteTableau Prep vs. Tableau Desktop: Which Tool for Which Task?
Task Use Tableau Prep Use Tableau Desktop
Fix misspelled values ,
Remove duplicate rows ,
Join multiple source tables ✓ (complex joins) ✓ (simple joins)
Pivot wide to long ✓ (limited)
Aggregate to reduce granularity ,
Build calculated fields ✓ (basic) ✓ (full LOD/Table Calc)
Create visualisations ,
Build dashboards ,
Schedule data refresh ✓ (via Prep Conductor) ✓ (via Server extract refresh)

15.2 The Tableau Prep Interface

NoteKey Interface Areas

The Tableau Prep Builder interface has five main areas:

  1. Connections panel (left), Lists all active data source connections. Click the “+” to add a new connection. Supports all the same connectors as Tableau Desktop.

  2. Flow pane (centre, top half), The visual canvas showing the complete data preparation flow as a series of connected step icons. Each step represents one transformation operation. Steps are connected by arrows showing the direction of data flow.

  3. Profile pane (centre, bottom half), Appears when a step is selected. Shows a data profile for the output of that step: a bar chart distribution for every field, null counts, and the first few rows of data.

  4. Data grid (bottom), Shows a sample of the actual row-level data at the selected step. Allows you to verify transformations at the row level.

  5. Changes panel (right), Lists every transformation applied in the selected step, in order. Each change can be undone or re-ordered independently.

[Insert screenshot of the full Tableau Prep Builder interface with all five areas labelled: Connections, Flow Pane, Profile Pane, Data Grid, and Changes Panel]


15.3 Input Steps: Connecting to Data

NoteAdding an Input Step

Every Tableau Prep flow begins with one or more Input steps. An Input step connects to a data source and brings data into the flow.

Adding an Input step: 1. In the Connections panel, click the “+” button. 2. Select the data source type (Microsoft Excel, Text File, Tableau Server, SQL Server, etc.). 3. Navigate to the file or enter connection credentials. 4. For Excel: select the specific worksheet. The Input step appears in the Flow pane. 5. Click the Input step to view its data profile in the Profile pane. 6. In the Input step settings (right panel): optionally set an incremental refresh field, sample the data (for large sources), or apply input-level filters.

[Insert screenshot of a Tableau Prep flow with a single Input step showing the Excel icon, and the Profile pane displaying the field distributions for each column]

NoteUnderstanding the Data Profile

The data profile is the most valuable feature of Tableau Prep for data quality assessment. For each field, the profile shows:

  • Bar chart distribution, For categorical fields, shows the frequency of each value. Immediately reveals misspellings, unexpected values, and data quality issues.
  • Histogram, For numeric fields, shows the distribution of values. Reveals outliers and unexpected ranges.
  • Null indicators, A grey bar at the bottom of each field’s profile shows the percentage of null values.
  • Value count, The number of distinct values in the field.

Data quality red flags visible in the profile: - A categorical field with values like “East”, “east”, “EAST” (case inconsistency, will be treated as three separate values). - A date field with some values displayed as text strings (mixed data type). - A numeric field with a very long tail (extreme outliers). - A field with >5% null values (missing data problem).


15.4 Step Types: The Transformation Toolkit

NoteThe Six Core Step Types

Tableau Prep provides six primary step types, each performing a different category of transformation:

Step Type Icon Colour Primary Operations
Input Orange Connect to a data source
Clean Green Rename fields, fix values, filter rows, split fields, add calculated fields
Aggregate Blue Group and aggregate data to reduce granularity
Join Blue (two circles) Combine two branches of the flow on a key field
Union Blue (stacked) Stack two tables with the same structure vertically
Pivot Blue (arrows) Reshape between wide and long format
Output Green Save the result as a .hyper file or published data source

Add any step by clicking the “+” button that appears when you hover over any step in the flow, then selecting the step type from the dropdown menu.

NoteHow To: Building Your First Tableau Prep Flow

Goal: Clean and output the Superstore Orders data.

  1. Open Tableau Prep Builder and connect to Sample - Superstore.xls.
  2. Drag the Orders sheet to the Flow pane, an Input step appears.
  3. Click the “+” on the Input step and select Add Step (Clean).
  4. A Clean step appears connected to the Input. Click it to see the profile.
  5. In the Profile pane, click the Ship Mode field. Observe its value distribution.
  6. If you see inconsistent values (e.g., “Standard Class” and “standard class”), click the incorrect value in the profile bar and select Group and Replace to standardise it.
  7. Click the “+” on the Clean step and select Output.
  8. In the Output step settings, choose the output type: Save to file (.hyper) or Publish as a data source to Tableau Server.
  9. Click Run Flow. Tableau Prep processes all steps and produces the clean output.

[Insert screenshot of a three-step flow: Input (orange) → Clean (green) → Output (green), with the Clean step selected and its profile showing the Ship Mode field distribution]


15.5 Clean Step Operations

NoteCore Cleaning Operations in Tableau Prep

Rename a field: Double-click the field name in the profile pane header and type the new name.

Change data type: Click the data type icon above a field (Abc, #, calendar) to change it.

Filter rows: Click the filter icon on a field header and set inclusion/exclusion conditions.

Remove a field: Click the “×” on a field header to exclude it from downstream steps (the field is hidden, not deleted from the source).

Fix values, Group and Replace: 1. Click a field in the profile pane to select it. 2. Click the Group Values icon (lozenge shapes) in the profile header. 3. Choose a grouping method: Pronunciation (groups phonetically similar values), Common Characters (groups values sharing the most characters), Spelling (groups by edit distance), or Manual. 4. Review the suggested groupings and accept or modify them. 5. All variant spellings (e.g., “N. America”, “North America”, “NORTH AMERICA”) are mapped to a single canonical value.

[Insert screenshot of the Group and Replace dialog showing three variant spellings of a region being merged into one canonical value]

TipUsing Pronunciation Grouping for Messy Data

The Pronunciation grouping method in Tableau Prep uses phonetic algorithms to identify values that sound the same but are spelled differently, especially useful for company names and person names entered by different operators. For example, “McDonalds”, “McDonald’s”, and “Mc Donalds” would all be grouped together by pronunciation. This is one of the most powerful and time-saving cleaning features in the entire Tableau ecosystem.


15.6 Aggregate Steps

NoteWhen to Use an Aggregate Step

An Aggregate step reduces the granularity of your data, it groups rows and computes summary statistics, similar to a SQL GROUP BY operation. Use an Aggregate step when:

  • Your source data is at a lower granularity than required for analysis (e.g., transaction-level data, but you only need monthly totals).
  • You want to reduce the size of the output extract for performance reasons.
  • You are preparing data for a join where the key table must be at a higher granularity.

Configuring an Aggregate step: 1. Add an Aggregate step after a Clean step. 2. In the Aggregate step settings, the left panel shows all available fields. Drag fields to either Grouped Fields (dimensions) or Aggregated Fields (measures). 3. For each aggregated measure, choose the aggregation: SUM, AVG, MIN, MAX, COUNT, COUNTD. 4. Fields not included in either panel are excluded from the output. 5. The Profile pane immediately updates to show the reduced-granularity data.


15.7 Output Steps and Scheduling

NoteSaving Flow Output

Every Tableau Prep flow must end with an Output step. Output options:

Output Type Description Use Case
Save to file (.hyper) Saves a Tableau extract to a local or network path Local use; share as part of a .twbx
Publish as data source Publishes to Tableau Server or Cloud as a reusable certified source Enterprise sharing and governance
Save to database Writes results back to a supported database table ETL pipelines and data warehouse loading
Save to CSV Exports to a plain text file Sharing with non-Tableau users

Scheduling flows: On Tableau Server with the Data Management Add-on (Prep Conductor), flows can be scheduled to run automatically. Go to Publish in Tableau Prep Builder and set a refresh schedule. Prep Conductor monitors flow health and sends alerts for failures.


15.8 Summary

NoteKey Concepts at a Glance
Component Description Key Interface
Flow canvas Visual representation of all transformation steps Centre of Prep Builder window
Input step Connects to a data source “+” in Connections panel
Profile pane Data quality distribution for every field Appears when a step is selected
Clean step Rename, fix, filter, calculate “+” > Add Step > Clean
Aggregate step Group and summarise to reduce granularity “+” > Aggregate
Group and Replace Fix misspellings and variant values Profile field header > Group Values icon
Output step Save or publish cleaned data “+” > Output
Prep Conductor Scheduled flow execution on Tableau Server Server admin interface
TipApplying This in Practice

The single most valuable habit in Tableau Prep is to inspect the data profile immediately after adding each Input step, before performing any transformations. Spend 5 minutes reviewing every field’s distribution and null count. This upfront investment consistently reveals data quality issues (misspellings, wrong data types, unexpected nulls, outlier values) that would otherwise corrupt your analysis downstream and be very difficult to diagnose in Tableau Desktop. Clean data is the foundation of trustworthy analysis.