Grouping Rows with GROUP BY

Reverse Engineering GROUPING BY Queries

At times we are faced with a complex GROUP BY query written by someone else and figuring out the equivalent UNION ALL query can help us better understand its results. This is not as easy as it first may seem. A query like this, for example,

set null (null)

select
  grp_a
, grp_b
, nvl( grp_b, grp_a ) as nvl_grp_a_b
, nvl2( grp_b, 1, 0 ) as nvl2_grp_b
, count(*)
from
  t
GROUP BY ROLLUP ( GRP_A, GRP_B )
order by
  grp_a
, grp_b
;
 
GRP_A      GRP_B      NVL_GRP_A_ NVL2_GRP_B   COUNT(*)
---------- ---------- ---------- ---------- ----------
a1         b1         b1                  1          2
a1         b2         b2                  1          3
a1         (null)     a1                  0          5
a2         b3         b3                  1          3
a2         (null)     a2                  0          3
(null)     (null)     (null)              0          8
 

is not simply the result of unioning together three identical subqueries with different GROUP BY clauses.

set null '(null)'

select
  grp_a
, grp_b
, nvl( grp_b, grp_a ) as nvl_grp_a_b
, nvl2( grp_b, 1, 0 ) as nvl2_grp_b
, count(*)
from
  t
GROUP BY ()
UNION ALL
select
  grp_a
, grp_b
, nvl( grp_b, grp_a ) as nvl_grp_a_b
, nvl2( grp_b, 1, 0 ) as nvl2_grp_b
, count(*)
from
  t
GROUP BY ( GRP_A )
UNION ALL
select
  grp_a
, grp_b
, nvl( grp_b, grp_a ) as nvl_grp_a_b
, nvl2( grp_b, 1, 0 ) as nvl2_grp_b
, count(*)
from
  t
GROUP BY ( GRP_A, GRP_B )
order by
  grp_a
, grp_b
;
  grp_a
  *
ERROR at line 2:
ORA-00979: not a GROUP BY expression


 

As you can see, such a query produces an error because the first and second subqueries select columns that are not in the GROUP BY clause. To determine the real equivalent UNION query we can use the following algorithm.

Step 1

Replace any ROLLUP or CUBE operators with their equivalent GROUPING SETS operator. In our example the query

select
  grp_a
, grp_b
, nvl( grp_b, grp_a ) as nvl_grp_a_b
, nvl2( grp_b, 1, 0 ) as nvl2_grp_b
, count(*)
from
  t
GROUP BY ROLLUP ( GRP_A, GRP_B )
order by
  grp_a
, grp_b
;
 
GRP_A      GRP_B      NVL_GRP_A_ NVL2_GRP_B   COUNT(*)
---------- ---------- ---------- ---------- ----------
a1         b1         b1                  1          2
a1         b2         b2                  1          3
a1         (null)     a1                  0          5
a2         b3         b3                  1          3
a2         (null)     a2                  0          3
(null)     (null)     (null)              0          8
 

is replaced with

select
  grp_a
, grp_b
, nvl( grp_b, grp_a ) as nvl_grp_a_b
, nvl2( grp_b, 1, 0 ) as nvl2_grp_b
, count(*)
from
  t
GROUP BY GROUPING SETS
  (
    ()
  , ( GRP_A )
  , ( GRP_A, GRP_B )
  )
order by
  grp_a
, grp_b
;
 
GRP_A      GRP_B      NVL_GRP_A_ NVL2_GRP_B   COUNT(*)
---------- ---------- ---------- ---------- ----------
a1         b1         b1                  1          2
a1         b2         b2                  1          3
a1         (null)     a1                  0          5
a2         b3         b3                  1          3
a2         (null)     a2                  0          3
(null)     (null)     (null)              0          8
 

Step 2a

Next start with a query that groups by only the first term in the GROUPING SETS clause, which is an empty set in our example.

select
  grp_a
, grp_b
, nvl( grp_b, grp_a ) as nvl_grp_a_b
, nvl2( grp_b, 1, 0 ) as nvl2_grp_b
, count(*)
from
  t
GROUP BY () ;
 

If the SELECT list contains columns that are not in the GROUP BY clause then replace those columns with NULL. In the query above both GRP_A and GRP_B are absent from the GROUP BY clause so we replace all occurrences of these columns in the SELECT list with NULL.

column grp_a       format a6
column grp_b       format a6
column nvl_grp_a_b format a11
column nvl2_grp_b  format 999999999

select
  NULL as grp_a
, NULL as grp_b
, nvl( NULL, NULL )  as nvl_grp_a_b
, nvl2( NULL, 1, 0 ) as nvl2_grp_b
, count(*)
from
  t
GROUP BY () ;
 
GRP_A  GRP_B  NVL_GRP_A_B NVL2_GRP_B   COUNT(*)
------ ------ ----------- ---------- ----------
(null) (null) (null)               0          8
 

Step 2b

Now we repeat the first step using the second term in the GROUPING SETS clause, ( GRP_A ).

select
  grp_a
, grp_b
, nvl( grp_b, grp_a ) as nvl_grp_a_b
, nvl2( grp_b, 1, 0 ) as nvl2_grp_b
, count(*)
from
  t
GROUP BY ( GRP_A ) ;
 

This time GRP_B is in the SELECT list but it is not in the GROUP BY list. We therefore need to replace GRP_B with NULL.

select
  grp_a
, NULL as grp_b
, nvl( NULL, grp_a ) as nvl_grp_a_b
, nvl2( NULL, 1, 0 ) as nvl2_grp_b
, count(*)
from
  t
GROUP BY ( GRP_A ) ;
 
GRP_A  GRP_B  NVL_GRP_A_B NVL2_GRP_B   COUNT(*)
------ ------ ----------- ---------- ----------
a1     (null) a1                   0          5
a2     (null) a2                   0          3
 

Step 2c

For the last set in the GROUPING SETS clause all selected columns are listed in the GROUP BY clause so no further transformation is needed. We can use the original SELECT list as-is.

select
  grp_a
, grp_b
, nvl( grp_b, grp_a ) as nvl_grp_a_b
, nvl2( grp_b, 1, 0 ) as nvl2_grp_b
, count(*)
from
  t
GROUP BY ( GRP_A, GRP_B ) ;
 
GRP_A  GRP_B  NVL_GRP_A_B NVL2_GRP_B   COUNT(*)
------ ------ ----------- ---------- ----------
a1     b1     b1                   1          2
a1     b2     b2                   1          3
a2     b3     b3                   1          3
 

Step 3

The next step is to combine the three step 2 queries with UNION ALL and add an ORDER BY clause.

select
  NULL as grp_a
, NULL as grp_b
, nvl( NULL, NULL )  as nvl_grp_a_b
, nvl2( NULL, 1, 0 ) as nvl2_grp_b
, count(*)
from
  t
group by ()
UNION ALL
select
  grp_a
, NULL as grp_b
, nvl( NULL, grp_a ) as nvl_grp_a_b
, nvl2( NULL, 1, 0 ) as nvl2_grp_b
, count(*)
from
  t
group by ( grp_a )
UNION ALL
select
  grp_a
, grp_b
, nvl( grp_b, grp_a ) as nvl_grp_a_b
, nvl2( grp_b, 1, 0 ) as nvl2_grp_b
, count(*)
from
  t
group by ( grp_a, grp_b )
ORDER BY
  GRP_A
, GRP_B
;
 
GRP_A  GRP_B  NVL_GRP_A_B NVL2_GRP_B   COUNT(*)
------ ------ ----------- ---------- ----------
a1     b1     b1                   1          2
a1     b2     b2                   1          3
a1     (null) a1                   0          5
a2     b3     b3                   1          3
a2     (null) a2                   0          3
(null) (null) (null)               0          8
 

Step 4 (Optional)

Lastly we reduce expressions like nvl( NULL, NULL ) and nvl2( NULL , 1, 0 ) to simpler, equivalent terms.

select
  null as grp_a
, null as grp_b
, NULL AS NVL_GRP_A_B
, 0    AS NVL2_GRP_B
, count(*)
from
  t
group by ()
union all
select
  grp_a
, null as grp_b
, GRP_A   AS NVL_GRP_A_B
, 0       AS NVL2_GRP_B
, count(*)
from
  t
group by ( grp_a )
union all
select
  grp_a
, grp_b
, nvl( grp_b, grp_a ) as nvl_grp_a_b
, nvl2( grp_b, 1, 0 ) as nvl2_grp_b
, count(*)
from
  t
group by ( grp_a, grp_b )
order by
  grp_a
, grp_b
;
 
GRP_A  GRP_B  NVL_GRP_A_B NVL2_GRP_B   COUNT(*)
------ ------ ----------- ---------- ----------
a1     b1     b1                   1          2
a1     b2     b2                   1          3
a1     (null) a1                   0          5
a2     b3     b3                   1          3
a2     (null) a2                   0          3
(null) (null) (null)               0          8
 

Result

The end result of the last step is a query which returns the same rows as the original GROUPING SETS query, which is repeated below for your convenience.

select
  grp_a
, grp_b
, nvl( grp_b, grp_a ) as nvl_grp_a_b
, nvl2( grp_b, 1, 0 ) as nvl2_grp_b
, count(*)
from
  t
GROUP BY ROLLUP ( GRP_A, GRP_B )
order by
  grp_a
, grp_b
;
 
GRP_A  GRP_B  NVL_GRP_A_B NVL2_GRP_B   COUNT(*)
------ ------ ----------- ---------- ----------
a1     b1     b1                   1          2
a1     b2     b2                   1          3
a1     (null) a1                   0          5
a2     b3     b3                   1          3
a2     (null) a2                   0          3
(null) (null) (null)               0          8
 



Linking to SQL Snippets ™

To link to this page in Oracle Technology Network Forums or OraFAQ Forums cut and paste this code.

  • [url=http://www.sqlsnippets.com/en/topic-13124.html]SQL Snippets: Grouping Rows with GROUP BY - Reverse Engineering GROUPING BY Queries[/url]

To link to this page in HTML documents or Blogger comments cut and paste this code.

  • <a href="http://www.sqlsnippets.com/en/topic-13124.html">SQL Snippets: Grouping Rows with GROUP BY - Reverse Engineering GROUPING BY Queries</a>

To link to this page in other web sites use the following values.

  • Link Text : SQL Snippets: Grouping Rows with GROUP BY - Reverse Engineering GROUPING BY Queries
  • URL (href): http://www.sqlsnippets.com/en/topic-13124.html