Window Functions

Other topics

Adding the total rows selected to every row

SELECT your_columns, COUNT(*) OVER() as Ttl_Rows FROM your_data_set
idnameTtl_Rows
1example5
2foo5
3bar5
4baz5
5quux5

Instead of using two queries to get a count then the line, you can use an aggregate as a window function and use the full result set as the window.
This can be used as a base for further calculation without the complexity of extra self joins.

Setting up a flag if other rows have a common property

Let's say I have this data:

Table items

idnametag
1exampleunique_tag
2foosimple
42barsimple
3bazhello
51quuxworld

I'd like to get all those lines and know if a tag is used by other lines

SELECT id, name, tag, COUNT(*) OVER (PARTITION BY tag) > 1 AS flag FROM items

The result will be:

idnametagflag
1exampleunique_tagfalse
2foosimpletrue
42barsimpletrue
3bazhellofalse
51quuxworldfalse

In case your database doesn't have OVER and PARTITION you can use this to produce the same result:

SELECT id, name, tag, (SELECT COUNT(tag) FROM items B WHERE tag = A.tag) > 1 AS flag FROM items A

Getting a running total

Given this data:

dateamount
2016-03-12200
2016-03-11-50
2016-03-14100
2016-03-15100
2016-03-10-250
SELECT date, amount, SUM(amount) OVER (ORDER BY date ASC) AS running
FROM operations
ORDER BY date ASC

will give you

dateamountrunning
2016-03-10-250-250
2016-03-11-50-300
2016-03-12200-100
2016-03-141000
2016-03-15100-100

Getting the N most recent rows over multiple grouping

Given this data

User_IDCompletion_Date
12016-07-20
12016-07-21
22016-07-20
22016-07-21
22016-07-22
;with CTE as
(SELECT *, 
        ROW_NUMBER() OVER (PARTITION BY User_ID 
                           ORDER BY Completion_Date DESC) Row_Num
FROM    Data)
SELECT * FORM CTE WHERE Row_Num <= n

Using n=1, you'll get the one most recent row per user_id:

User_IDCompletion_DateRow_Num
12016-07-211
22016-07-221

Finding "out-of-sequence" records using the LAG() function

Given these sample data:

IDSTATUSSTATUS_TIMESTATUS_BY
1ONE2016-09-28-19.47.52.501398USER_1
3ONE2016-09-28-19.47.52.501511USER_2
1THREE2016-09-28-19.47.52.501517USER_3
3TWO2016-09-28-19.47.52.501521USER_2
3THREE2016-09-28-19.47.52.501524USER_4

Items identified by ID values must move from STATUS 'ONE' to 'TWO' to 'THREE' in sequence, without skipping statuses. The problem is to find users (STATUS_BY) values who violate the rule and move from 'ONE' immediately to 'THREE'.

The LAG() analytical function helps to solve the problem by returning for each row the value in the preceding row:

SELECT * FROM (
 SELECT 
  t.*, 
  LAG(status) OVER (PARTITION BY id ORDER BY status_time) AS prev_status 
  FROM test t
) t1 WHERE status = 'THREE' AND prev_status != 'TWO'

In case your database doesn't have LAG() you can use this to produce the same result:

SELECT A.id, A.status, B.status as prev_status, A.status_time, B.status_time as prev_status_time
FROM Data A, Data B
WHERE A.id = B.id
AND   B.status_time = (SELECT MAX(status_time) FROM Data where status_time < A.status_time and id = A.id)
AND   A.status = 'THREE' AND NOT B.status = 'TWO'

Contributors

Topic Id: 647

Example Ids: 2121,2122,2123,8299,23832

This site is not affiliated with any of the contributors.