flowchart LR
A[Input Step\nConnect to source] --> B[Clean Step\nFix, rename, filter]
B --> C[Aggregate Step\nReduce granularity]
C --> D[Join/Union Step\nCombine tables]
D --> E[Pivot Step\nReshape wide/long]
E --> F[Output Step\nSave as .hyper or published source]
style A fill:#e3f2fd,stroke:#1976D2
style B fill:#fff9c4,stroke:#F9A825
style C fill:#f3e5f5,stroke:#7B1FA2
style D fill:#fce4ec,stroke:#C62828
style E fill:#e0f7fa,stroke:#0097A7
style F fill:#e8f5e9,stroke:#388E3C
15 Introduction to Tableau Prep
Data preparation, cleaning, reshaping, and combining raw data into an analysis-ready form, typically consumes 60–80% of an analyst’s time. Tableau Prep Builder is the dedicated data preparation tool in the Tableau ecosystem, providing a visual, code-free environment for transforming messy source data into clean, structured datasets ready for Tableau Desktop. In this chapter, you will learn the Tableau Prep interface, how to connect to data sources in Prep, how to understand and navigate the Flow canvas, and how to perform the most fundamental transformation operations. Chapters 16 and 17 will extend these skills into advanced cleaning, combining, and sharing workflows.
15.1 What Is Tableau Prep Builder?
Tableau Prep Builder is a standalone application (included in the Creator licence) designed specifically for data wrangling. Unlike Tableau Desktop, which is optimised for analysis and visualisation, Prep is optimised for transformation: cleaning dirty data, reshaping tables, combining multiple data sources, and outputting clean datasets.
Why Prep instead of just fixing the data in SQL or Excel? - Transparency, The flow canvas shows every transformation visually, making the cleaning logic auditable and reproducible. - No code required, Business analysts without SQL skills can perform complex transformations. - Automation, Flows can be scheduled to run automatically on Tableau Server via Prep Conductor. - Integration, Output flows directly to .hyper extracts or published data sources, seamlessly feeding Tableau Desktop. - Data profiling, Prep shows a real-time data profile (value distributions, null counts, and anomaly flags) for every field at every step, which SQL and Excel do not.
| Task | Use Tableau Prep | Use Tableau Desktop |
|---|---|---|
| Fix misspelled values | ✓ | , |
| Remove duplicate rows | ✓ | , |
| Join multiple source tables | ✓ (complex joins) | ✓ (simple joins) |
| Pivot wide to long | ✓ | ✓ (limited) |
| Aggregate to reduce granularity | ✓ | , |
| Build calculated fields | ✓ (basic) | ✓ (full LOD/Table Calc) |
| Create visualisations | , | ✓ |
| Build dashboards | , | ✓ |
| Schedule data refresh | ✓ (via Prep Conductor) | ✓ (via Server extract refresh) |
15.2 The Tableau Prep Interface
The Tableau Prep Builder interface has five main areas:
Connections panel (left), Lists all active data source connections. Click the “+” to add a new connection. Supports all the same connectors as Tableau Desktop.
Flow pane (centre, top half), The visual canvas showing the complete data preparation flow as a series of connected step icons. Each step represents one transformation operation. Steps are connected by arrows showing the direction of data flow.
Profile pane (centre, bottom half), Appears when a step is selected. Shows a data profile for the output of that step: a bar chart distribution for every field, null counts, and the first few rows of data.
Data grid (bottom), Shows a sample of the actual row-level data at the selected step. Allows you to verify transformations at the row level.
Changes panel (right), Lists every transformation applied in the selected step, in order. Each change can be undone or re-ordered independently.
[Insert screenshot of the full Tableau Prep Builder interface with all five areas labelled: Connections, Flow Pane, Profile Pane, Data Grid, and Changes Panel]
15.3 Input Steps: Connecting to Data
Every Tableau Prep flow begins with one or more Input steps. An Input step connects to a data source and brings data into the flow.
Adding an Input step: 1. In the Connections panel, click the “+” button. 2. Select the data source type (Microsoft Excel, Text File, Tableau Server, SQL Server, etc.). 3. Navigate to the file or enter connection credentials. 4. For Excel: select the specific worksheet. The Input step appears in the Flow pane. 5. Click the Input step to view its data profile in the Profile pane. 6. In the Input step settings (right panel): optionally set an incremental refresh field, sample the data (for large sources), or apply input-level filters.
[Insert screenshot of a Tableau Prep flow with a single Input step showing the Excel icon, and the Profile pane displaying the field distributions for each column]
The data profile is the most valuable feature of Tableau Prep for data quality assessment. For each field, the profile shows:
- Bar chart distribution, For categorical fields, shows the frequency of each value. Immediately reveals misspellings, unexpected values, and data quality issues.
- Histogram, For numeric fields, shows the distribution of values. Reveals outliers and unexpected ranges.
- Null indicators, A grey bar at the bottom of each field’s profile shows the percentage of null values.
- Value count, The number of distinct values in the field.
Data quality red flags visible in the profile: - A categorical field with values like “East”, “east”, “EAST” (case inconsistency, will be treated as three separate values). - A date field with some values displayed as text strings (mixed data type). - A numeric field with a very long tail (extreme outliers). - A field with >5% null values (missing data problem).
15.4 Step Types: The Transformation Toolkit
Tableau Prep provides six primary step types, each performing a different category of transformation:
| Step Type | Icon Colour | Primary Operations |
|---|---|---|
| Input | Orange | Connect to a data source |
| Clean | Green | Rename fields, fix values, filter rows, split fields, add calculated fields |
| Aggregate | Blue | Group and aggregate data to reduce granularity |
| Join | Blue (two circles) | Combine two branches of the flow on a key field |
| Union | Blue (stacked) | Stack two tables with the same structure vertically |
| Pivot | Blue (arrows) | Reshape between wide and long format |
| Output | Green | Save the result as a .hyper file or published data source |
Add any step by clicking the “+” button that appears when you hover over any step in the flow, then selecting the step type from the dropdown menu.
Goal: Clean and output the Superstore Orders data.
- Open Tableau Prep Builder and connect to Sample - Superstore.xls.
- Drag the
Orderssheet to the Flow pane, an Input step appears. - Click the “+” on the Input step and select Add Step (Clean).
- A Clean step appears connected to the Input. Click it to see the profile.
- In the Profile pane, click the
Ship Modefield. Observe its value distribution. - If you see inconsistent values (e.g., “Standard Class” and “standard class”), click the incorrect value in the profile bar and select Group and Replace to standardise it.
- Click the “+” on the Clean step and select Output.
- In the Output step settings, choose the output type: Save to file (
.hyper) or Publish as a data source to Tableau Server. - Click Run Flow. Tableau Prep processes all steps and produces the clean output.
[Insert screenshot of a three-step flow: Input (orange) → Clean (green) → Output (green), with the Clean step selected and its profile showing the Ship Mode field distribution]
15.5 Clean Step Operations
Rename a field: Double-click the field name in the profile pane header and type the new name.
Change data type: Click the data type icon above a field (Abc, #, calendar) to change it.
Filter rows: Click the filter icon on a field header and set inclusion/exclusion conditions.
Remove a field: Click the “×” on a field header to exclude it from downstream steps (the field is hidden, not deleted from the source).
Fix values, Group and Replace: 1. Click a field in the profile pane to select it. 2. Click the Group Values icon (lozenge shapes) in the profile header. 3. Choose a grouping method: Pronunciation (groups phonetically similar values), Common Characters (groups values sharing the most characters), Spelling (groups by edit distance), or Manual. 4. Review the suggested groupings and accept or modify them. 5. All variant spellings (e.g., “N. America”, “North America”, “NORTH AMERICA”) are mapped to a single canonical value.
[Insert screenshot of the Group and Replace dialog showing three variant spellings of a region being merged into one canonical value]
The Pronunciation grouping method in Tableau Prep uses phonetic algorithms to identify values that sound the same but are spelled differently, especially useful for company names and person names entered by different operators. For example, “McDonalds”, “McDonald’s”, and “Mc Donalds” would all be grouped together by pronunciation. This is one of the most powerful and time-saving cleaning features in the entire Tableau ecosystem.
15.6 Aggregate Steps
An Aggregate step reduces the granularity of your data, it groups rows and computes summary statistics, similar to a SQL GROUP BY operation. Use an Aggregate step when:
- Your source data is at a lower granularity than required for analysis (e.g., transaction-level data, but you only need monthly totals).
- You want to reduce the size of the output extract for performance reasons.
- You are preparing data for a join where the key table must be at a higher granularity.
Configuring an Aggregate step: 1. Add an Aggregate step after a Clean step. 2. In the Aggregate step settings, the left panel shows all available fields. Drag fields to either Grouped Fields (dimensions) or Aggregated Fields (measures). 3. For each aggregated measure, choose the aggregation: SUM, AVG, MIN, MAX, COUNT, COUNTD. 4. Fields not included in either panel are excluded from the output. 5. The Profile pane immediately updates to show the reduced-granularity data.
15.7 Output Steps and Scheduling
Every Tableau Prep flow must end with an Output step. Output options:
| Output Type | Description | Use Case |
|---|---|---|
| Save to file (.hyper) | Saves a Tableau extract to a local or network path | Local use; share as part of a .twbx |
| Publish as data source | Publishes to Tableau Server or Cloud as a reusable certified source | Enterprise sharing and governance |
| Save to database | Writes results back to a supported database table | ETL pipelines and data warehouse loading |
| Save to CSV | Exports to a plain text file | Sharing with non-Tableau users |
Scheduling flows: On Tableau Server with the Data Management Add-on (Prep Conductor), flows can be scheduled to run automatically. Go to Publish in Tableau Prep Builder and set a refresh schedule. Prep Conductor monitors flow health and sends alerts for failures.
15.8 Summary
| Component | Description | Key Interface |
|---|---|---|
| Flow canvas | Visual representation of all transformation steps | Centre of Prep Builder window |
| Input step | Connects to a data source | “+” in Connections panel |
| Profile pane | Data quality distribution for every field | Appears when a step is selected |
| Clean step | Rename, fix, filter, calculate | “+” > Add Step > Clean |
| Aggregate step | Group and summarise to reduce granularity | “+” > Aggregate |
| Group and Replace | Fix misspellings and variant values | Profile field header > Group Values icon |
| Output step | Save or publish cleaned data | “+” > Output |
| Prep Conductor | Scheduled flow execution on Tableau Server | Server admin interface |
The single most valuable habit in Tableau Prep is to inspect the data profile immediately after adding each Input step, before performing any transformations. Spend 5 minutes reviewing every field’s distribution and null count. This upfront investment consistently reveals data quality issues (misspellings, wrong data types, unexpected nulls, outlier values) that would otherwise corrupt your analysis downstream and be very difficult to diagnose in Tableau Desktop. Clean data is the foundation of trustworthy analysis.