- PIVOT
The meaning of pivoting a table is to turn n-rows into n-columns.
For example given a ales table with a composite primary key of year and quarter you may want to get a result that shows only one row for each year, but a separate column for each quarter.
CREATE TABLE
Sales(Year INTEGER NOT NULL,
Quarter INTEGER NOT NULL,
Results INTEGER);
CREATE UNIQUE INDEX PK_IDX ON Sales(Year, Quarter) INCLUDE(Results);
ALTER TABLE Sales ADD PRIMARY KEY (Year, Quarter);
INSERT INTO Sales VALUES
(2004, 1, 20),
(2004, 2, 30),
(2004, 3, 15),
(2004, 4, 10),
(2005, 1, 18),
(2005, 2, 40),
(2005, 3, 12),
(2005, 4, 27);
Quarter INTEGER NOT NULL,
Results INTEGER);
CREATE UNIQUE INDEX PK_IDX ON Sales(Year, Quarter) INCLUDE(Results);
ALTER TABLE Sales ADD PRIMARY KEY (Year, Quarter);
INSERT INTO Sales VALUES
(2004, 1, 20),
(2004, 2, 30),
(2004, 3, 15),
(2004, 4, 10),
(2005, 1, 18),
(2005, 2, 40),
(2005, 3, 12),
(2005, 4, 27);
The desired result is:
Year Q1 Q2 Q3 Q4
---- -- -- -- --
2004 20 30 15 10
2005 18 40 12 27
---- -- -- -- --
2004 20 30 15 10
2005 18 40 12 27
There a many ways to achieve
pivoting, but there is clearly one that is optimal which I will explain here.
Pivoting involves two events: First the payload (Results in this case) needs to be dispatch into the Q1 through Q4 columns.
This can be done with a CASE expression or DECODE (which is another syntax for CASE):
Pivoting involves two events: First the payload (Results in this case) needs to be dispatch into the Q1 through Q4 columns.
This can be done with a CASE expression or DECODE (which is another syntax for CASE):
SELECT Year,
DECODE(Quarter, 1, Results) AS Q1,
DECODE(Quarter, 2, Results) AS Q2,
DECODE(Quarter, 3, Results) AS Q3,
DECODE(Quarter, 4, Results) AS Q4
FROM Sales;
DECODE(Quarter, 1, Results) AS Q1,
DECODE(Quarter, 2, Results) AS Q2,
DECODE(Quarter, 3, Results) AS Q3,
DECODE(Quarter, 4, Results) AS Q4
FROM Sales;
YEAR Q1 Q2 Q3 Q4
----------- ----------- ----------- ----------- -----------
2004 20 - - -
2004 - 30 - -
2004 - - 15 -
2004 - - - 10
2005 18 - - -
2005 - 40 - -
2005 - - 12 -
2005 - - - 27
8 record(s) selected.
Note how DECODE injects NULL
as an implicit ELSE.
The second step is to collapse the rows to get one row per Year.
The technique of choice is a GROUP BY here.
When grouping on Year we need to use some sort of column function for the columns Q1 through Q4.
MAX or MIN() work for most types and they are very cheap. Especially since all but one value will be NULL per group:
The second step is to collapse the rows to get one row per Year.
The technique of choice is a GROUP BY here.
When grouping on Year we need to use some sort of column function for the columns Q1 through Q4.
MAX or MIN() work for most types and they are very cheap. Especially since all but one value will be NULL per group:
SELECT Year,
MAX(DECODE(Quarter, 1, Results)) AS Q1,
MAX(DECODE(Quarter, 2, Results)) AS Q2,
MAX(DECODE(Quarter, 3, Results)) AS Q3,
MAX(DECODE(Quarter, 4, Results)) AS Q4
FROM Sales
GROUP BY Year;
MAX(DECODE(Quarter, 1, Results)) AS Q1,
MAX(DECODE(Quarter, 2, Results)) AS Q2,
MAX(DECODE(Quarter, 3, Results)) AS Q3,
MAX(DECODE(Quarter, 4, Results)) AS Q4
FROM Sales
GROUP BY Year;
YEAR Q1 Q2 Q3 Q4
----------- ----------- ----------- ----------- -----------
2004 20 30 15 10
2005 18 40 12 27
2 record(s) selected.
I noted above that this
approach is the best. How do I know. Well let's look at the access plan:
Access Plan:
-----------
Total Cost: 0.0134932
Query Degree: 1
Rows
RETURN
( 1)
Cost
I/O
|
1.6
GRPBY
( 2)
0.01322
0
|
8
IXSCAN
( 3)
0.0126886
0
|
8
INDEX: SERGE
PK_IDX
Q1
-----------
Total Cost: 0.0134932
Query Degree: 1
Rows
RETURN
( 1)
Cost
I/O
|
1.6
GRPBY
( 2)
0.01322
0
|
8
IXSCAN
( 3)
0.0126886
0
|
8
INDEX: SERGE
PK_IDX
Q1
Pretty hard to beat!
- UNPIVOT
Unpivot is the inverse operation of PIVOT. Here we have several similar columns which are to be combined into one column, but different rows.
Let's reuse the same example from above and run it backwards so to speak
CREATE TABLE Sales(Year INTEGER
NOT NULL PRIMARY KEY,
Q1 INTEGER,
Q2 INTEGER,
Q3 INTEGER,
Q4 INTEGER);
INSERT INTO Sales VALUES
(2004, 20, 30, 15, 10),
(2005, 18, 40, 12, 27);
Q1 INTEGER,
Q2 INTEGER,
Q3 INTEGER,
Q4 INTEGER);
INSERT INTO Sales VALUES
(2004, 20, 30, 15, 10),
(2005, 18, 40, 12, 27);
The most efficient way to
unpivot is to do a cross join between the Sales table and a correlated VALUES
of as many rows as columns that need to be unpivoted.
SELECT Year, Quarter, Results
FROM Sales AS S,
LATERAL(VALUES (1, S.Q1),
(2, S.Q2),
(3, S.Q3),
(4, S.Q4)) AS T(Quarter, Results);
FROM Sales AS S,
LATERAL(VALUES (1, S.Q1),
(2, S.Q2),
(3, S.Q3),
(4, S.Q4)) AS T(Quarter, Results);
YEAR QUARTER RESULTS
----------- ----------- -----------
2004 1 20
2004 2 30
2004 3 15
2004 4 10
2005 1 18
2005 2 40
2005 3 12
2005 4 27
8 record(s) selected.
Again let's look at the
explain:
Access Plan:
-----------
Total Cost: 7.57432
Query Degree: 1
Rows
RETURN
( 1)
Cost
I/O
|
8
NLJOIN
( 2)
7.57432
1
/-------+-------\
2 4
TBSCAN TBSCAN
( 3) ( 4)
7.57317 4.95961e-005
1 0
| |
2 4
TABLE: SERGE TABFNC: SYSIBM
SALES GENROW
Q3 Q1
-----------
Total Cost: 7.57432
Query Degree: 1
Rows
RETURN
( 1)
Cost
I/O
|
8
NLJOIN
( 2)
7.57432
1
/-------+-------\
2 4
TBSCAN TBSCAN
( 3) ( 4)
7.57317 4.95961e-005
1 0
| |
2 4
TABLE: SERGE TABFNC: SYSIBM
SALES GENROW
Q3 Q1
Once more this is the most efficient plan. Since it only contains a single
scan.
댓글 없음:
댓글 쓰기