sql window functions ?
SQL window functions are powerful tools used to perform calculations across a set of table rows that are somehow related to the current row. They differ from standard aggregate functions in that they do not group the result set into a single row. Instead, they provide a way to execute complex queries involving ranking, running totals, moving averages, and more. Here are some of the most commonly used SQL window functions:
1. ROW_NUMBER()
Assigns a unique sequential integer to rows within a partition of a result set, starting at 1 for the first row in each partition.
SELECT
column1,
ROW_NUMBER() OVER (PARTITION BY column2 ORDER BY column3) AS row_num
FROM
table_name;
2. RANK()
Assigns a rank to each row within a partition of a result set. The rank of a row is one plus the number of ranks that come before it. Ties receive the same rank, and the next rank(s) are skipped.
SELECT
column1,
RANK() OVER (PARTITION BY column2 ORDER BY column3) AS rank
FROM
table_name;
3. DENSE_RANK()
Similar to RANK()
, but without gaps in the ranking sequence. Ties receive the same rank, and the next rank is incremented by 1.
SELECT
column1,
DENSE_RANK() OVER (PARTITION BY column2 ORDER BY column3) AS dense_rank
FROM
table_name;
4. NTILE(n)
Divides the result set into n
approximately equal parts and assigns each row a bucket number from 1 to n
.
SELECT
column1,
NTILE(4) OVER (ORDER BY column2) AS quartile
FROM
table_name;
5. LAG()
Provides access to a row at a given physical offset that comes before the current row within the result set.
SELECT
column1,
LAG(column2, 1) OVER (ORDER BY column3) AS prev_value
FROM
table_name;
6. LEAD()
Provides access to a row at a given physical offset that comes after the current row within the result set.
SELECT
column1,
LEAD(column2, 1) OVER (ORDER BY column3) AS next_value
FROM
table_name;
7. FIRST_VALUE()
Returns the first value in an ordered set of values.
SELECT
column1,
FIRST_VALUE(column2) OVER (PARTITION BY column3 ORDER BY column4) AS first_val
FROM
table_name;
8. LAST_VALUE()
Returns the last value in an ordered set of values.
SELECT
column1,
LAST_VALUE(column2) OVER (PARTITION BY column3 ORDER BY column4 ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS last_val
FROM
table_name;
9. SUM()
Calculates the sum of values in a specified window frame.
SELECT
column1,
SUM(column2) OVER (PARTITION BY column3 ORDER BY column4) AS running_total
FROM
table_name;
10. AVG()
Calculates the average of values in a specified window frame.
SELECT
column1,
AVG(column2) OVER (PARTITION BY column3 ORDER BY column4) AS running_avg
FROM
table_name;
11. MIN() and MAX()
Return the minimum and maximum values in a specified window frame.
SELECT
column1,
MIN(column2) OVER (PARTITION BY column3 ORDER BY column4) AS min_val,
MAX(column2) OVER (PARTITION BY column3 ORDER BY column4) AS max_val
FROM
table_name;
Window Specification
The OVER
clause can include three parts:
- PARTITION BY: Divides the result set into partitions to which the window function is applied.
- ORDER BY: Defines the logical order of rows within each partition.
- Window Frame: Specifies a subset of the partition over which the function operates (e.g.,
ROWS BETWEEN 1 PRECEDING AND CURRENT ROW
).
Example with a complete window specification:
SELECT
column1,
SUM(column2) OVER (PARTITION BY column3 ORDER BY column4 ROWS BETWEEN 1 PRECEDING AND CURRENT ROW) AS running_total
FROM
table_name;
These functions allow for sophisticated data analysis directly within SQL, providing a more efficient and readable way to handle complex calculations.