Friday, 20 March 2009

SQL and contiguous data ranges.

As the saying goes "work smarter, not harder", this is definitely the case for SQL. If you have a poorly performing query, then you are not only affecting that one user but diverting precious machine resources needlessly. This is doubly true if its part of the 20% of code executed 80% of the time. Here's one smart solution to a common problem, and after the umpteenth newsgroup question which could be solved by this , I thought id take a little time out to dissect the theory and give a few good examples. Steve Kass of Drew University , seems to be the man to credit for this idea, although i cant find any specific details.

The problem is given a list of numbers we need to output the covered ranges. The key is to simply generate an unique number to group your contiguous ranges by. The value of this number is completely irrelevant , it just has to be the same for each item within a range.

So lets start simply , if we have a range of numbers (1,2,3,5,6,7,8), what we need returned is:
1 - 3
5 - 8

Lets generate some test data



Drop table Contig
go
create table Contig(Id integer)
go
insert into contig values(1)
insert into contig values(2)
insert into contig values(3)
insert into contig values(5)
insert into contig values(6)
insert into contig values(7)
insert into contig values(8)



If we then execute




Select Id,row_number() over (order by Id desc)
from Contig


we get :

Id          RowNum
----------- --------------------
8 1
7 2
6 3
5 4
3 5
2 6
1 7


What we are going to do is generate our 'grouping number' by adding Id and Rownum. So now execute



with cteContigOrder(Id,RowNum)
as
(
Select Id,RowNum = row_number() over (order by Id desc) from Contig
)
Select Id,RowNum,Grouping = Id+RowNum from cteContigOrder


which returns
Id          RowNum               Grouping
----------- -------------------- --------------------
8 1 9
7 2 9
6 3 9
5 4 9
3 5 8
2 6 8
1 7 8

Looking at this its easy to see how id 1 thru 3 have a grouping of 8 and id 5 thru 8 have a grouping of 9.

So to generate our ranges we need do this :


;with cteContigOrder(Id,RowNum)
as
(
Select Id,RowNum = row_number() over (order by Id desc) from Contig
)
Select Min(Id),Max(Id) from cteContigOrder
group by Id+RowNum order by 1



which gives us

----------- -----------
1 3
5 8


exactly the ranges we wanted.

Using this it very easy to apply to Dates.



Drop Table ContigDates
go
Create Table ContigDates
(
DateCol smalldatetime
)
go
insert into ContigDates(DateCol) values('01jan2009')
insert into ContigDates(DateCol) values('02jan2009')
insert into ContigDates(DateCol) values('03jan2009')
insert into ContigDates(DateCol) values('01feb2009')
insert into ContigDates(DateCol) values('02feb2009')
insert into ContigDates(DateCol) values('03feb2009')
insert into ContigDates(DateCol) values('10feb2009')
go
with cteDateList(DateCol,Grouping)
as
(
Select DateCol,DateCol + row_number() over (order by datecol desc)
from ContigDates
)
Select Min(DateCol),Max(DateCol)
from cteDateList
group by Grouping
order by 1
go


If you have a table which itself is a range of dates , i find it simplest to 'explode' the range and then re-summarize. Here I have used a Calendar table to find all the days within the ranges.



Drop Table ContigDateRanges
go
Create Table ContigDateRanges
(
DateCollo smalldatetime,
DateColHi smalldatetime
)
go
insert into ContigDateRanges(Datecollo,Datecolhi)
values('01jan2009','03jan2009')
insert into ContigDateRanges(Datecollo,Datecolhi)
values('04jan2009','10jan2009')
insert into ContigDateRanges(Datecollo,Datecolhi)
values('01feb2009','01feb2009')
insert into ContigDateRanges(Datecollo,Datecolhi)
values('02feb2009','05feb2009')
go
with cteDateList(DateCol,Grouping)
as
(
Select Calendar.Dte,
Calendar.Dte + row_number() over (order by Calendar.Dte desc)
from ContigDateRanges,
Calendar
where Calendar.Dte between DateColLo and DateColHi
)
Select Min(DateCol),Max(DateCol)
from cteDateList
group by Grouping
order by 1

I hope that this gives you at least one more option to improve your system and to remove a few more evil cursors.

No comments:

Post a Comment