6  Distribution Plot and Common Display Charts

Note: What This Chapter Covers

Understanding how data is distributed, not just its average, is one of the most important skills in data analysis. This chapter covers histograms, frequency polygons, density plots, cumulative distribution charts, waterfall charts, Gantt charts, and bullet charts. You will learn how each display reveals a different aspect of data shape and business performance, and you will build each chart type in Tableau using the Superstore dataset.

flowchart TD
    A[Distribution and Display Charts] --> B[Shape of Data]
    A --> C[Cumulative Patterns]
    A --> D[Change Decomposition]
    A --> E[Time and Duration]
    A --> F[KPI vs Target]
    B --> B1[Histogram]
    B --> B2[Density Plot]
    B --> B3[Frequency Polygon]
    C --> C1[Cumulative Distribution <br> Pareto Chart]
    D --> D1[Waterfall Chart]
    E --> E1[Gantt Chart]
    F --> F1[Bullet Chart]
    style A fill:#e3f2fd,stroke:#1976D2
    style B fill:#f3e5f5,stroke:#7B1FA2
    style C fill:#fff9c4,stroke:#F9A825
    style D fill:#e8f5e9,stroke:#388E3C
    style E fill:#fce4ec,stroke:#C62828
    style F fill:#e0f7fa,stroke:#0097A7


6.1 Histograms: The Primary Distribution Chart

Note: What a Histogram Shows

A histogram is the primary tool for visualising the distribution of a single continuous variable. It divides the range of values into equal-width intervals (called bins) and displays the count (or frequency) of observations that fall within each bin as a bar.

Unlike a bar chart, where each bar represents a distinct category, the bars in a histogram are adjacent (no gaps) because the underlying variable is continuous.

What a histogram reveals:

  • Centre: where the distribution peaks (the mode).
  • Spread: how wide the distribution is.
  • Skewness: whether the distribution has a long tail to the left (left-skewed) or right (right-skewed).
  • Modality: whether there is one peak (unimodal), two peaks (bimodal), or more.
  • Outliers: isolated bars far from the main body of the distribution.

Note: How To: Creating a Histogram in Tableau
  1. Drag a continuous measure (e.g., Sales) to the Columns shelf.
  2. In the Show Me panel, click Histogram. Tableau automatically creates bins and draws the histogram.
  3. Alternatively, right-click Sales in the Data pane and select Create > Bins. Set the bin size (e.g., 100 for $100 intervals) and click OK. Drag the new Sales (bin) field to Columns and drag Number of Records to Rows (in Tableau 2020.2 and later, use the table's Count field instead, e.g., Orders (Count)).
  4. To adjust bin size: right-click the Sales (bin) field in the Data pane and select Edit. Change the bin size and observe how the shape changes.
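  5. To show spread on the chart: open the Analytics pane and drag Distribution Band onto the view. Tableau's distribution bands shade ranges based on percentages, percentiles, quantiles, or standard deviations; they do not draw a fitted normal curve, which would have to be computed separately.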

[Insert screenshot of a Tableau histogram of Sales with bin size $500, showing a right-skewed distribution with most orders between $0 and $500]
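
The binning step can also be reproduced outside Tableau to check what the chart should look like. The following is a minimal Python sketch, not Tableau's own implementation: `bin_counts` is a hypothetical helper, the order values are made up, and it assumes each bin is labelled by its inclusive lower edge.

```python
import math
from collections import Counter

def bin_counts(values, bin_size):
    """Count observations per equal-width bin, keyed by each bin's lower edge."""
    counts = Counter()
    for v in values:
        # Assumption: a value belongs to the bin whose lower edge is
        # the largest multiple of bin_size that does not exceed it.
        lower_edge = math.floor(v / bin_size) * bin_size
        counts[lower_edge] += 1
    return dict(sorted(counts.items()))

# Illustrative order values (not taken from Superstore)
sales = [12.5, 48.0, 99.9, 100.0, 250.0, 310.0, 480.0, 1200.0]
histogram = bin_counts(sales, bin_size=100)

# Print a rough text histogram, one row per non-empty bin
for lower, count in histogram.items():
    print(f"{lower:>6} to {lower + 100:>6}: {'#' * count} ({count})")
```

The single order at 1,200 sits in a bin far from the rest, which is how an outlier appears as an isolated bar.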

Note: Choosing the Right Bin Size

Bin size is the most important parameter in a histogram. Too few bins (large bin size) over-smooths the distribution and hides important features. Too many bins (small bin size) creates a noisy, jagged chart that is hard to interpret.

| Bin Size | Effect | Risk |
|---|---|---|
| Too large | Over-smooth; hides shape | Misses bimodal patterns |
| Just right | Reveals shape clearly | None |
| Too small | Noisy and irregular | Spiky bars suggest randomness that may not exist |

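A practical rule: start with the square root of the number of observations as your number of bins. For 1,000 observations, that gives about 32 bins. Because Tableau asks for a bin size rather than a bin count, divide the range of the data by the number of bins to get a starting bin size, then adjust from there.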
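
The square-root rule is easy to work through in code. The following is a minimal Python sketch: `sqrt_rule` is a hypothetical helper, the data is synthetic, and rounding the square root up (rather than to the nearest integer) is an assumption made here.

```python
import math

def sqrt_rule(values):
    """Return (number_of_bins, bin_size) using the square-root rule of thumb."""
    # Assumption: round the square root up to a whole number of bins
    n_bins = math.ceil(math.sqrt(len(values)))
    # Bin size is the data range divided by the number of bins
    bin_size = (max(values) - min(values)) / n_bins
    return n_bins, bin_size

# 1,000 illustrative values spread evenly from 0 to 999
values = list(range(1000))
n_bins, bin_size = sqrt_rule(values)
print(n_bins, round(bin_size, 2))
```

For these 1,000 values the rule gives 32 bins. In practice you would round the resulting bin size to a convenient number before entering it in Tableau.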

Tip: Histograms vs. Bar Charts: The Key Distinction

Students frequently confuse histograms with bar charts. The critical distinction: bars in a histogram are adjacent (continuous variable, no gaps), while bars in a bar chart are separated by small gaps (categorical variable). In Tableau, the gaps in a bar chart can be narrowed by widening the bars with the Size control on the Marks card, but this does not make it a histogram. The conceptual difference lies in the nature of the underlying variable.


6.2 Frequency Polygons and Density Plots

Note: Frequency Polygons: Comparing Distributions with Lines

A frequency polygon is a line chart drawn through the midpoints of the tops of histogram bars. It conveys the same information as a histogram but uses less ink and makes it easier to overlay multiple distributions on the same chart (multiple bar charts overlaid become unreadable quickly, while multiple lines remain distinguishable).

When to use: When you want to compare the distribution of the same variable across two or more groups (e.g., comparing the order value distribution for the East vs. West region).

In Tableau, create a frequency polygon by:

  1. Building a histogram as described above.
  2. Changing the mark type from Bar to Line.
  3. Adding a dimension (e.g., Region) to the Colour shelf to produce one line per group.
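
To see what a frequency polygon plots, the points can be computed by hand: one (bin midpoint, count) pair per bin, per group. The following is a minimal Python sketch with made-up (region, order value) pairs; `polygon_points` is a hypothetical helper, and empty bins are simply skipped rather than plotted as zero.

```python
import math
from collections import defaultdict

def polygon_points(rows, bin_size):
    """Return {group: [(bin_midpoint, count), ...]} for (group, value) rows."""
    counts = defaultdict(lambda: defaultdict(int))
    for group, value in rows:
        lower_edge = math.floor(value / bin_size) * bin_size
        counts[group][lower_edge] += 1
    # Each line passes through the midpoint of every non-empty bin
    return {
        group: [(edge + bin_size / 2, n) for edge, n in sorted(bins.items())]
        for group, bins in counts.items()
    }

# Illustrative (region, order value) pairs, not real Superstore rows
orders = [("East", 40), ("East", 60), ("East", 130), ("West", 20),
          ("West", 150), ("West", 170), ("West", 260)]
points = polygon_points(orders, bin_size=100)
for region, line in points.items():
    print(region, line)
```

Each region becomes one line on the chart, which is why several groups stay readable where overlaid bars would not.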

Note: Density Plots: A Smoothed Histogram

A density plot (kernel density estimate, or KDE) is a smoothed version of a histogram that replaces the stepped bars with a continuous curve. It shows the probability density of the variable, that is, the relative likelihood of observing a value at each point on the axis.

In Tableau, density plots are available through the Density mark type:

  1. Place a continuous measure on Columns.
  2. Change the mark type to Density.
  3. Tableau renders a heatmap of the distribution. Adjust the colour intensity using the Colour shelf.

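Tableau has no built-in mark for a classical one-dimensional density curve; the Analytics pane's Distribution Band shades percentile or standard deviation ranges rather than fitting a curve. To get a KDE curve, generate it outside Tableau (for example, in Python) and load the resulting points as a data source to plot as a line.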
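
As an illustration of the Python route, the following is a minimal sketch of a Gaussian kernel density estimate using only the standard library. The function name `gaussian_kde`, the sample values, and the default bandwidth (a normal-reference rule of thumb) are all assumptions made for this example; in practice a library such as SciPy would normally be used.

```python
import math
import statistics

def gaussian_kde(values, grid, bandwidth=None):
    """Evaluate a Gaussian kernel density estimate at each point in grid."""
    n = len(values)
    if bandwidth is None:
        # Assumed default: normal-reference rule of thumb for the bandwidth
        bandwidth = 1.06 * statistics.stdev(values) * n ** (-1 / 5)
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))
    # Sum one Gaussian bump per observation at every grid point
    return [
        norm * sum(math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values)
        for x in grid
    ]

# Illustrative order values (hypothetical, right-skewed)
sales = [20, 35, 40, 55, 60, 80, 120, 150, 300, 900]
grid = [i * 10 for i in range(-100, 250)]  # evaluation points, step of 10
density = gaussian_kde(sales, grid)

# (x, density) pairs that could be saved to CSV and plotted as a line
curve = list(zip(grid, density))
```

A larger bandwidth gives a smoother curve and a smaller one a bumpier curve, which mirrors the bin size trade-off in a histogram.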


6.3 Cumulative Distribution and Pareto Charts

Note: Cumulative Distribution: How Values Accumulate

A cumulative distribution chart (CDF chart) plots the cumulative sum or percentage of a measure as the categories are ordered from smallest to largest. It answers questions like: “What percentage of customers generate 80% of our revenue?” or “At what sales threshold do we capture 50% of our orders?”

A Pareto chart is the most common business version of the CDF: it combines a bar chart (showing individual values in descending order) with a line chart (showing the cumulative percentage). It is the visual form of the Pareto principle: the observation that roughly 80% of effects come from 20% of causes.

Note: How To: Building a Pareto Chart in Tableau
  1. Drag Sub-Category to the Columns shelf, sorted by Sales descending.
  2. Drag Sales to the Rows shelf for the bar chart component.
  3. To add the cumulative line: drag Sales to the Rows shelf a second time.
  4. Right-click the second SUM(Sales) pill and select Add Table Calculation > Running Total, then add a secondary calculation of Percent of Total.
  5. Right-click the second SUM(Sales) pill and select Dual Axis.
  6. Change the second axis mark type to Line.
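  7. Do not synchronise the axes: the left axis shows sales in currency and the right axis shows a cumulative percentage, so a shared scale would flatten the line. Instead, format the right axis as a percentage and fix its range at 0% to 100%.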
  8. Add a reference line at 80% on the cumulative axis to visually identify the top sub-categories that contribute 80% of sales.

[Insert screenshot of a Tableau Pareto chart for Sub-Category Sales, with bars and a cumulative percentage line, and a reference line at 80%]
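
The arithmetic behind the cumulative line is a descending sort followed by a running total divided by the grand total. The following is a minimal Python sketch of that calculation; `pareto_table` is a hypothetical helper and the sales figures are illustrative, not actual Superstore totals.

```python
def pareto_table(sales_by_category):
    """Return rows of (category, sales, cumulative_share), largest first."""
    ordered = sorted(sales_by_category.items(), key=lambda kv: kv[1], reverse=True)
    total = sum(sales_by_category.values())
    rows, running = [], 0.0
    for category, sales in ordered:
        running += sales  # running total, as in the table calculation
        rows.append((category, sales, running / total))
    return rows

# Illustrative figures, not actual Superstore totals
sales = {"Phones": 330, "Chairs": 320, "Storage": 220, "Tables": 200,
         "Binders": 190, "Paper": 80, "Art": 30, "Labels": 20}
rows = pareto_table(sales)

# Smallest set of top categories whose cumulative share reaches 80%
top = []
for category, _, share in rows:
    top.append(category)
    if share >= 0.80:
        break

for category, value, share in rows:
    print(f"{category:<8} {value:>5} {share:6.1%}")
```

In this made-up data, five of the eight categories are needed to reach 80%, a reminder that the 80/20 split is a rule of thumb rather than a guarantee.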

Tip: Applying Pareto Analysis in Business

The Pareto chart is one of the most frequently used tools in business analytics and quality management (Caldwell, 2021). In retail, it typically reveals that 3–5 product sub-categories generate 80% of revenue. This insight should drive decisions about inventory investment, marketing spend, and supplier negotiations. In customer analytics, the Pareto principle almost always shows that a small proportion of customers generate a disproportionate share of lifetime value.


6.4 Waterfall Charts: Decomposing Change

Note: What a Waterfall Chart Shows

A waterfall chart (also called a bridge chart or cascade chart) shows how a total value is built up or broken down through a series of positive and negative contributions. It is ideal for financial analysis, showing how net profit is derived from revenue after deducting each cost category, and for performance analysis, showing how actual results compare to budget through a series of variance drivers.

Reading a waterfall chart:

  • Bars that go up (positive) are coloured one colour (typically blue or green).
  • Bars that go down (negative) are coloured another colour (typically red or orange).
  • The first bar (starting total) and last bar (ending total) typically rest on the zero axis.
  • Intermediate bars “float” above or below their running total.

Note: How To: Creating a Waterfall Chart in Tableau
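  1. Set up your data with one row per contribution (e.g., Revenue, COGS, Operating Expenses) and its value (positive for revenue, negative for costs). Leave the ending total (e.g., Net Profit) out of the data; it is added as a grand total in the last step.
  2. Create a running total calculated field:
Code
// Waterfall running total
RUNNING_SUM(SUM([Value]))
  3. Create a Gantt bar size field equal to the negated individual value. The Gantt mark is anchored at the running total, so a negative size draws each bar back to the previous total:
Code
// Gantt bar size for waterfall (negated so the bar reaches back to the prior total)
-SUM([Value])
  4. Drag Category to Columns and the running total field to Rows.
  5. Change the mark type to Gantt Bar.
  6. Drag the size field to the Size shelf.
  7. Create a calculated colour field with an IF statement (positive = blue, negative = red) and drag it to the Colour shelf.
  8. To show the ending total as its own bar, select Analysis > Totals > Show Row Grand Totals.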

[Insert screenshot of a waterfall chart showing Revenue decomposed to Net Profit through COGS, Marketing, and Operating Expense bars]
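
The geometry of a waterfall chart is easier to see with numbers: each floating bar spans from the previous running total to the new one. The following is a minimal Python sketch of that calculation; `waterfall_bars` is a hypothetical helper and the figures are illustrative.

```python
def waterfall_bars(steps):
    """Return (label, start, end) for each step; end is the running total."""
    bars, running = [], 0
    for label, value in steps:
        start = running       # bar begins at the previous running total
        running += value      # and ends at the new running total
        bars.append((label, start, running))
    return bars

# Illustrative figures: revenue is positive, costs are negative
steps = [("Revenue", 1000), ("COGS", -400), ("Marketing", -150),
         ("Operating Expenses", -250)]
bars = waterfall_bars(steps)
bars.append(("Net Profit", 0, bars[-1][2]))  # the total bar rests on zero

for label, start, end in bars:
    direction = "up" if end >= start else "down"
    print(f"{label:<20} {start:>6} -> {end:>6} ({direction})")
```

The direction of each bar (end above or below start) is what the colour encoding distinguishes.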


6.5 Gantt Charts: Visualising Duration and Scheduling

Note: When to Use a Gantt Chart

A Gantt chart displays tasks or events as horizontal bars spanning their start and end dates on a time axis. It is the standard tool for project management visualisations, showing task duration, sequencing, and overlap at a glance.

In data analytics (beyond project management), Gantt charts are useful for:

  • Visualising customer order fulfilment time (from order date to ship date).
  • Showing the duration of customer subscription periods.
  • Displaying supply chain lead times by supplier.

Note: How To: Creating a Gantt Chart in Tableau
  1. Drag Order Date to the Columns shelf (set to exact date, not aggregated).
  2. Drag Sub-Category to the Rows shelf.
  3. Change the mark type to Gantt Bar.
  4. Create a calculated field for duration:
Code
// Days between order date and ship date
DATEDIFF('day', [Order Date], [Ship Date])
  5. Drag the duration field to the Size shelf. Each bar now spans from the order date to the ship date. If several orders share a mark, change the aggregation from SUM to AVG so that the bar length remains a duration in days.
  6. Drag Ship Mode to the Colour shelf to see how shipping method affects fulfilment time.
  7. Add a filter for a specific date range to focus on a meaningful time window.

[Insert screenshot of a Tableau Gantt chart showing order-to-ship duration by Sub-Category, coloured by Ship Mode]
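
The duration field is a plain date subtraction, which can be checked outside Tableau. The following is a minimal Python sketch with hypothetical orders; `fulfilment_days` is an illustrative helper that mirrors the day-level difference between order date and ship date.

```python
from datetime import date

def fulfilment_days(order_date, ship_date):
    """Whole days from order to shipment (the bar length on the Gantt chart)."""
    return (ship_date - order_date).days

# Illustrative orders: (sub_category, ship_mode, order_date, ship_date)
orders = [
    ("Chairs", "Standard Class", date(2023, 3, 1), date(2023, 3, 6)),
    ("Phones", "First Class", date(2023, 3, 2), date(2023, 3, 4)),
    ("Paper", "Same Day", date(2023, 3, 3), date(2023, 3, 3)),
]

# Each Gantt bar needs a start position and a length
bars = [(sub, mode, start, fulfilment_days(start, end))
        for sub, mode, start, end in orders]

for sub, mode, start, days in bars:
    print(f"{sub:<8} {mode:<15} starts {start} lasts {days} day(s)")
```

A same-day shipment has a duration of zero, so its bar has no visible length on the chart.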


6.6 Bullet Charts: KPI vs. Target

Note: The Bullet Chart: The Best KPI Visualisation

The bullet chart was designed by Stephen Few specifically to replace the dashboard gauge (speedometer) chart, which is visually dramatic but perceptually inefficient (Murray, 2020). A bullet chart encodes:

  • The primary measure: a thick, dark bar showing actual performance (e.g., actual sales).
  • The comparative measure: a short, thin perpendicular line showing the target or prior period (e.g., sales target).
  • Qualitative performance ranges: background shading divided into three to five bands representing ranges such as “poor,” “satisfactory,” and “good” performance.

The bullet chart packs more information into a smaller space than any other KPI chart type, making it ideal for dashboards where screen space is limited.

Note: How To: Building a Bullet Chart in Tableau
  1. Drag Sales to the Columns shelf (actual performance bar).
  2. Drag Region to the Rows shelf.
  3. Right-click the Sales axis and select Add Reference Line. Choose Line, set Scope to Per Cell, and set Value to the target (a constant, or a target field placed on Detail). Format the line as thick; it is drawn perpendicular to the bar.
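  4. To add qualitative ranges: right-click the axis again and select Add Reference Line > Distribution. Set the computation to Percentages of the target, for example 60 and 85, so that the shaded bands mark 0–60% of target (poor), 60–85% (satisfactory), and above 85% (good). Enable Fill Below and Fill Above so the outer ranges are shaded, and use shades of grey for the band colours.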
  5. Ensure the primary bar uses a dark, solid colour to stand out against the light grey background bands.
  6. Add a title and remove the axis label (it is already implied by the title).

[Insert screenshot of a Tableau bullet chart showing actual Sales vs. target for four Regions, with qualitative performance range bands]
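
The qualitative ranges amount to comparing the ratio of actual to target against fixed thresholds. The following is a minimal Python sketch using the example thresholds of 60% and 85% from the steps above; `performance_band` is a hypothetical helper and the regional figures are made up.

```python
def performance_band(actual, target, thresholds=(0.60, 0.85)):
    """Classify actual vs target into the qualitative bands of a bullet chart."""
    ratio = actual / target
    poor_max, satisfactory_max = thresholds
    if ratio < poor_max:
        return "poor"
    if ratio < satisfactory_max:
        return "satisfactory"
    return "good"

# Illustrative (actual, target) pairs by region, hypothetical numbers
regions = {"East": (680, 700), "West": (725, 1000), "Central": (250, 500),
           "South": (390, 400)}
summary = {name: performance_band(actual, target)
           for name, (actual, target) in regions.items()}

for name, band in summary.items():
    actual, target = regions[name]
    print(f"{name:<8} {actual / target:6.1%} of target: {band}")
```

Because the bands are defined relative to each region's own target, regions with different targets can be compared on the same chart.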

Tip: Replacing Dashboard Gauges with Bullet Charts

If your organisation currently uses gauge or speedometer charts on its dashboards, replacing them with bullet charts can improve readability and analytical value. Gauges encode a single value on a radial arc; they are visually appealing but perceptually weak. A bullet chart encodes the same value plus a target, plus context ranges, in a fraction of the space. Present both side by side to stakeholders and ask which is easier to understand quickly; in most cases the bullet chart wins.


6.7 Summary

Note: Key Concepts at a Glance

| Chart Type | Primary Use | Key Parameter |
|---|---|---|
| Histogram | Distribution shape of one continuous variable | Bin size |
| Frequency polygon | Compare distributions across groups | Colour encoding for groups |
| Density plot | Smoothed distribution shape | Bandwidth (smoothing) |
| Pareto chart | Identify top contributors (80/20 rule) | Running total % line |
| Waterfall chart | Decompose change between two totals | Positive/negative colour encoding |
| Gantt chart | Duration and scheduling | Start date + duration size field |
| Bullet chart | KPI vs. target with context | Reference line (target) + reference bands |

Tip: Applying This in Practice

The charts in this chapter are among the most powerful but least frequently used in business analytics. Most analysts default to bar charts and line charts because they are familiar. Introducing a Pareto chart to a quarterly business review or replacing gauge charts with bullet charts on an operational dashboard will immediately distinguish you as a sophisticated analyst. Choose one chart from this chapter to introduce to your next analysis project and observe the impact on stakeholder engagement.