Lesson 4.5: Advanced Aggregation with ROLLUP, CUBE, and GROUPING SETS in SQL
As reporting needs grow, regular GROUP BY is often not enough. For example, you may need to get all of the following at once:
- detail by order status and customer;
- intermediate totals by status;
- totals by customer;
- a grand total for the whole dataset.
You can write multiple queries and combine them with UNION ALL, but that is verbose and harder to maintain. For these tasks, SQL uses grouping expression modifiers: ROLLUP, CUBE, and GROUPING SETS.
More precisely: these modifiers enrich aggregated query results with rows at a higher level of aggregation (subtotals and grand total).
Important: all practical examples in this lesson use SQL Server (AdventureWorks).
Syntax note: ROLLUP, CUBE, GROUPING SETS, and GROUPING() below are shown using SQL Server syntax. In MySQL, functionality is more limited and syntax is partly different (for example, WITH ROLLUP is commonly used, while CUBE and GROUPING SETS may be unavailable in classic form).
In this lesson, we will cover:
- how
ROLLUP,CUBE, andGROUPING SETSdiffer; - how subtotal and total rows are built;
- how to distinguish generated total rows using
GROUPING().
Why This Matters
Advanced aggregation helps you:
- build multi-level reports in a single query;
- reduce SQL duplication;
- produce consistent detail, subtotals, and grand totals.
- enrich detail-level output with rows at higher aggregation levels.
Core Idea
Assume we have sales data in SalesOrderHeader with dimensions Status, CustomerID and metric TotalDue.
Regular GROUP BY returns only one grouping level. Extended grouping constructs return multiple levels at once.
ROLLUP: Hierarchical Totals
ROLLUP builds a hierarchy from right to left in the column list.
Syntax
GROUP BY ROLLUP (col1, col2, col3)
Generated levels:
(col1, col2, col3)- detail;(col1, col2)- subtotal overcol3;(col1)- subtotal overcol2andcol3;()- grand total.
Example: order amount totals by status and customer
SELECT
Status,
CustomerID,
SUM(TotalDue) AS total_amount
FROM SalesOrderHeader
GROUP BY ROLLUP (Status, CustomerID)
ORDER BY Status, CustomerID;
Result:
- rows for each
Status + CustomerIDpair; - subtotal per
Status; - overall grand total.
CUBE: All Dimension Combinations
CUBE builds aggregates for all possible combinations of listed columns.
Syntax
GROUP BY CUBE (col1, col2)
For two columns, levels are:
(col1, col2);(col1);(col2);().
For three columns, combinations are already $2^3 = 8$, so result size can grow quickly.
Example: order amount totals by status and customer across all slices
SELECT
Status,
CustomerID,
SUM(TotalDue) AS total_amount
FROM SalesOrderHeader
GROUP BY CUBE (Status, CustomerID)
ORDER BY Status, CustomerID;
Result: besides detail and grand total, you also get:
- totals per
Status; - totals per
CustomerID.
GROUPING SETS: Precise Control of Levels
GROUPING SETS lets you explicitly list only the grouping levels you need.
Syntax
GROUP BY GROUPING SETS (
(col1, col2),
(col1),
()
)
Example: only required levels, no extra combinations
SELECT
Status,
CustomerID,
SUM(TotalDue) AS total_amount
FROM SalesOrderHeader
GROUP BY GROUPING SETS (
(Status, CustomerID),
(Status),
()
)
ORDER BY Status, CustomerID;
This is equivalent to multiple GROUP BY ... UNION ALL ... queries, but is more compact and usually better optimized.
Distinguishing Total Rows with GROUPING()
In generated total rows, dimension values often become NULL. The issue is that source data can also contain real NULL values.
GROUPING(column) helps distinguish them:
0- regular value from source data;1- value generated by aggregation level.
Example with level flags
SELECT
Status,
CustomerID,
SUM(TotalDue) AS total_amount,
GROUPING(Status) AS g_status,
GROUPING(CustomerID) AS g_customer
FROM SalesOrderHeader
GROUP BY ROLLUP (Status, CustomerID)
ORDER BY Status, CustomerID;
Practical report labeling pattern:
CASE
WHEN GROUPING(Status) = 1 AND GROUPING(CustomerID) = 1 THEN 'GRAND TOTAL'
WHEN GROUPING(CustomerID) = 1 THEN 'STATUS SUBTOTAL'
ELSE 'DETAIL'
END AS row_type
When to Use What
- Use
ROLLUPwhen you need hierarchical totals (for example, year -> month -> day). - Use
CUBEwhen you need all analytical slices across dimensions. - Use
GROUPING SETSwhen you want strict control over exactly which levels are returned.
Practical Recommendations
- Always check result size:
CUBEcan increase row count significantly. - Label row types (
DETAIL,SUBTOTAL,GRAND TOTAL) for readability. - Add explicit
ORDER BYso totals appear in a predictable order. - If you need aggregate filtering, combine with
HAVING.
MySQL Example
Below is a MySQL example on the payment table using subtotals with WITH ROLLUP:
SELECT
staff_id,
customer_id,
SUM(amount) AS total_amount
FROM
payment
GROUP BY
staff_id, customer_id WITH ROLLUP
ORDER BY
GROUPING(staff_id),
staff_id,
GROUPING(customer_id),
customer_id;
In this query:
- detail is returned per
staff_id + customer_idpair; WITH ROLLUPadds subtotals perstaff_idand a grand total;ORDER BY GROUPING(...)places rows in a convenient order: details, subtotals, then grand total.
Important notes for MySQL:
WITH ROLLUPprovides hierarchical totals, but not a full equivalent ofCUBE/GROUPING SETS.- For more complex grouping combinations, you often need multiple queries with
UNION ALL. - If your MySQL version does not support
GROUPING(), sorting and total-row labeling are usually done withNULLchecks.
Practical Usage
Order amount report by status and customer with totals:
ROLLUP (Status, CustomerID)gives detail, status subtotals, and grand total.Multidimensional sales analysis:
CUBE (Status, CustomerID)gives all slice combinations across status and customer.Custom order amount report:
GROUPING SETSlets you keep only the levels you need: detail + departmental subtotal + grand total.
Key Takeaways from This Lesson
ROLLUP,CUBE, andGROUPING SETSextend standardGROUP BY.ROLLUPcreates hierarchical totals,CUBEcreates all combinations,GROUPING SETScreates only explicitly listed levels.GROUPING()is essential for correctly interpreting generated total rows.- These tools help build flexible analytical reports on order amounts in a single query.
By mastering these constructs, you can design more powerful SQL reports without long UNION ALL chains.