flowchart TD
A[Data Fundamentals] --> B[Number of Records\nRow count measure]
A --> C[Database Joins\nCombining tables]
A --> D[Cross-tabulation\nText tables]
C --> C1[Inner Join\nMatching rows only]
C --> C2[Left Join\nAll left + matching right]
C --> C3[Right Join\nAll right + matching left]
C --> C4[Full Outer Join\nAll rows from both]
D --> D1[Grand Totals\nSubtotals]
D --> D2[Conditional\nFormatting]
style A fill:#e3f2fd,stroke:#1976D2
style C fill:#fff9c4,stroke:#F9A825
style D fill:#e8f5e9,stroke:#388E3C
14 Number of Records and Measures, Joining Database, and Cross-tabulation
Three foundational data topics are covered in this chapter. First, you will learn the distinction between Number of Records (a row count metric) and other measures, and when each is appropriate. Second, you will master database joining in Tableau, inner joins, left joins, right joins, full outer joins, and cross joins, and understand the data quality implications of each. Third, you will learn to build cross-tabulations (text tables) in Tableau, including how to add subtotals, grand totals, and conditional formatting to transform a raw data grid into a publication-ready analytical table.
14.1 Number of Records
Number of Records is a special auto-generated measure that Tableau creates for every data source. It always returns 1 for every row in the dataset, so SUM([Number of Records]) equals the total row count in the current view. It is Tableau’s built-in equivalent of COUNT(*) in SQL.
When to use Number of Records vs. COUNTD:
| Metric | Formula | What It Counts |
|---|---|---|
SUM([Number of Records]) |
1 per row | Total rows in the table (including duplicates) |
COUNT([Order ID]) |
1 per non-null Order ID | Rows with a non-null value in the field |
COUNTD([Order ID]) |
1 per unique Order ID | Distinct orders (deduplicates) |
COUNTD([Customer ID]) |
1 per unique Customer ID | Distinct customers |
Key distinction: If an order spans three rows in the data (e.g., three products in one order), SUM([Number of Records]) returns 3 (row count), while COUNTD([Order ID]) returns 1 (distinct order count). Always think carefully about which is the correct denominator for your business question.
Order volume charts, A bar chart of SUM(Number of Records) by month shows how many order rows were placed each month. Combined with a line for SUM(Sales), this creates a volume + revenue dual-axis chart.
Data quality checks, After a join, comparing the expected row count to the actual SUM(Number of Records) reveals whether the join has created row duplications (fan trap) or unexpected data loss.
Density maps, On a geographic map, SUM(Number of Records) sized as a circle shows where the most transactions occurred.
As a denominator in rates, Calculating the average order value per region:
SUM([Sales]) / SUM([Number of Records])gives the average sales per row (be careful: if the row granularity is order-line, this is average revenue per line item, not per order).
SUM([Number of Records]) and COUNT([Order ID]) often return the same result but for different reasons. Number of Records counts all rows including those with null values in any field. COUNT([Order ID]) excludes rows where Order ID is null. In a clean dataset these match; in a real-world dataset with missing values, they may differ. Always verify which is appropriate for your specific analysis.
14.2 Database Joining in Tableau
A join combines rows from two or more tables based on a common key field (the join condition). The type of join determines which rows are included when the key values in the two tables do not match.
The four standard join types:
| Join Type | Result | Use Case |
|---|---|---|
| Inner Join | Only rows where the key matches in both tables | Use when you only want records that have matches on both sides |
| Left Join | All rows from the left table; matched rows from the right (NULLs for non-matches) | Use when the left table is the master and you want to keep all its records |
| Right Join | All rows from the right table; matched rows from the left | Equivalent to a Left Join with tables swapped |
| Full Outer Join | All rows from both tables; NULLs where there is no match | Use when you want to see all records from both tables, including unmatched ones |
In Tableau, joins are configured on the Data Source Page by dragging tables onto the canvas. The join type is selected by clicking the Venn diagram icon that appears between two joined tables.
- On the Data Source Page, drag the first table (e.g.,
Orders) to the canvas. - Drag a second table (e.g.,
Returns) to the canvas alongside the first. - Tableau automatically detects a common field and creates a join (defaulting to Inner Join). A Venn diagram appears between the two tables.
- Click the Venn diagram to open the Join dialog.
- Select the desired join type by clicking the appropriate Venn diagram option (Inner, Left, Right, Full Outer).
- Verify the join condition:
Orders.[Order ID] = Returns.[Order ID]. If Tableau selected the wrong field, click the join condition and change it. - Examine the data grid below: verify the row count is sensible and check for unexpected NULLs.
[Insert screenshot of the Data Source Page canvas showing two tables connected by a join, with the join dialog open and Left Join selected]
Scenario: Joining Orders (all orders) with Returns (only returned orders).
- Inner Join: Only returned orders appear. All non-returned orders are excluded. → Use when you only want to analyse returns.
-
Left Join: All orders appear. Returned orders show
Yesin the Returned field; non-returned orders showNULL. → Use when you want all orders with a flag for whether they were returned. This is the correct join for a “% of orders returned” analysis. - Full Outer Join: All orders from both tables appear. Useful if the Returns table might contain return records with no matching Order ID (data quality check scenario).
| Join Type | Orders Rows | Returns Rows | Result Rows | Non-returned orders? |
|---|---|---|---|---|
| Inner Join | 9,994 | 800 | 800 | No |
| Left Join | 9,994 | 800 | 9,994 | Yes (NULL in Returned) |
| Full Outer | 9,994 | 800 | 9,994 | Yes |
A “fan trap” occurs when joining a table to another that has multiple matching rows, causing the original row count to multiply. Example: joining an Orders table (one row per order) to an Order Lines table (multiple rows per order) on Order ID. The result has one row per order line, not one row per order, and any aggregation on the Orders table fields (like order-level discounts) will be summed multiple times per order.
Detection: After a join, check SUM(Number of Records). If it is higher than the expected row count of your primary table, you may have a fan trap. Solution: Use Tableau Relationships (the recommended approach for Tableau 2020.2+) instead of joins, or restructure the join to occur at the correct granularity.
14.3 Relationships vs. Joins
Tableau introduced the Relationships data model in version 2020.2 as the recommended replacement for most join use cases. Unlike joins, which merge tables into a single flat table upfront, relationships maintain separate tables and combine them only at query time, at the correct level of granularity for each table.
Key advantages of relationships over joins:
| Aspect | Join | Relationship |
|---|---|---|
| Row duplication risk | High (fan trap possible) | None (queries each table at its own granularity) |
| Aggregation accuracy | Can produce inflated totals | Always aggregates at the correct level |
| Performance | Pre-merges tables (can be slow) | Lazy evaluation, only queries needed tables |
| Setup | Requires specifying join type | Just drag tables and specify key |
Rule of thumb: Use Relationships when you have tables at different levels of granularity (orders + customers, orders + products). Use joins when you need a specific join type (e.g., full outer join to find unmatched records) or when working with a single, flat, pre-joined table.
14.4 Cross-tabulation (Text Tables)
A cross-tabulation (or crosstab, or text table) is a grid that displays measures at the intersection of two or more dimensions. It is the direct equivalent of a pivot table in Excel. Cross-tabs are appropriate when the audience needs precise numerical values, not just visual patterns, and when comparing many dimension combinations simultaneously.
In Tableau: A cross-tab is created by placing dimensions on both Rows and Columns, and a measure on the Text shelf of the Marks card (with mark type set to Text).
When to use cross-tabs: - Finance and accounting reports requiring exact figures. - Regulatory or compliance reporting with precise data. - As a supplementary view alongside a visual chart on a dashboard, providing the underlying numbers for stakeholders who want to verify the visual.
- Drag
CategoryandSub-Categoryto the Rows shelf (nested, Sub-Category inside Category). - Drag
Regionto the Columns shelf. - Drag
Salesto the Text shelf on the Marks card. Ensure the mark type is Text. - Press Ctrl+T (Windows) or use the keyboard shortcut to quickly create a text table from any selected fields.
- Format the Sales values: right-click
SUM(Sales)and select Format. Set Number format to Currency (0 decimal places), with thousands separator.
[Insert screenshot of a Tableau cross-tab showing Category > Sub-Category on Rows and Region on Columns, with Sales values formatted as currency]
- Go to Analysis > Totals.
- Select Show Row Grand Totals to add a total row at the bottom.
- Select Show Column Grand Totals to add a total column on the right.
- Select Add All Subtotals to add a subtotal row after each Category group (summing across Sub-Categories within each Category).
- To format totals differently from body cells: Format > Totals, set bold, different background colour, or larger font for the total rows.
[Insert screenshot of the same cross-tab with subtotals per Category and grand totals row/column, formatted with a bold dark background]
Tableau can apply colour to text table cells based on their values, creating a heat table or conditional format that combines the precision of a text table with the visual scanning power of a colour encoding.
- In the cross-tab view, drag a second copy of
Salesto the Colour shelf. - Click the Colour shelf and select a sequential palette (e.g., Blue-Teal: low sales = light, high sales = dark).
- To use a diverging palette (e.g., to highlight cells above/below average): create a calculated field:
Code
# Deviation from overall average sales
SUM([Sales]) - WINDOW_AVG(SUM([Sales]))- Drag this field to the Colour shelf and select an Orange-Blue diverging palette, centred at zero.
- Adjust transparency of the colour fill so the text labels remain legible.
[Insert screenshot of a conditionally formatted cross-tab where high-sales cells are dark blue and low-sales cells are light, with currency-formatted labels visible inside each cell]
14.5 Summary
| Concept | Description | Key Access |
|---|---|---|
| Number of Records | Auto-generated row count measure | Data pane (auto-generated) |
| COUNTD | Count of distinct values | Calculated field or aggregation in filter |
| Inner Join | Only matching rows from both tables | Data Source Page > Join dialog > Inner |
| Left Join | All left rows + matching right rows | Data Source Page > Join dialog > Left |
| Fan trap | Row duplication from one-to-many joins | Check SUM(Number of Records) after joining |
| Relationship | Granularity-safe multi-table model | Drag tables to canvas (no join type needed) |
| Cross-tab | Text table at intersections of dimensions | Mark type: Text + dimensions on Rows/Columns |
| Totals | Row/column grand totals and subtotals | Analysis > Totals |
| Conditional format | Colour cells by value | Drag measure to Colour shelf in text view |
The most common data modelling mistake in Tableau is using a join where a relationship should be used. Any time you are joining tables that operate at different granularities, customers and orders, orders and order lines, products and sales, use Tableau Relationships instead of joins. You will avoid fan traps, maintain correct aggregations, and produce more reliable analysis. Reserve joins for the specific cases where you need a full outer join or a cross join, scenarios that relationships do not support.