T-SQL to find consecutive increasing values

Let's say I have the following very simple schema:

Create Table MyTable (
   PrimaryKey int,
   Column1 datetime,
   Column2 int
)

I need a query that orders the data by Column1 and finds the first 10 consecutive rows where the value of Column2 in the current row is greater than the value of Column2 in the prior row.

asked Nov 8, 2011 at 17:11

And should the order on Column2 be ascending or descending? -

How many rows in the table? How many do you expect would need to be scanned before the sequence of 10 is found? -

@Martin - Assume less than 10K rows. Regarding the number of rows scanned before the sequence is found, it could be 1 or 9900. Hard to say. The data is not predictable. -

@Zohaib - Order of data in Column 2 is irrelevant. Ordering is done by column 1. -

Is column1 the same value for the sequences of column2? Can you post some sample data and expected output? -

1 Answer

Q is used to get a ranking value rn ordered by Column1. PrimaryKey is added to the ordering in case there are ties in Column1. C is a recursive CTE that loops from the top, ordered by rn, incrementing cc for each increasing value of Column2 and resetting it to 1 otherwise. It breaks out of the recursion when cc reaches 10. Finally, get the last 10 rows from C. The where clause takes care of the case when there are no 10 consecutive increasing values.

with Q as
(
  select PrimaryKey,
         Column1,
         Column2,
         row_number() over(order by Column1, PrimaryKey) as rn
  from MyTable
),
C as
(
  select PrimaryKey,
         Column1,
         Column2,
         rn,
         1 as cc
  from Q
  where rn = 1
  union all
  select Q.PrimaryKey,
         Q.Column1,
         Q.Column2,
         Q.rn,
         case 
           when Q.Column2 > C.Column2 then C.cc + 1
           else 1
         end  
  from Q
    inner join C
      on Q.rn - 1 = C.rn
  where C.cc < 10     
)
select top 10 *
from C
where 10 in (select cc from C)
order by rn desc
option (maxrecursion 0)
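
To try the query above, the table can be filled with a few rows. This is only a sketch: the values below are made up for illustration and are not from the question. It assumes MyTable already exists with the three columns described in the question, and SQL Server 2008 or later for the multi-row values list.

```sql
-- Hypothetical sample data (not from the question).
-- Ordered by Column1, Column2 drops once at PrimaryKey 2 and then
-- increases on every row through PrimaryKey 11.
insert into MyTable (PrimaryKey, Column1, Column2)
values (1,  '20111101', 5),
       (2,  '20111102', 3),   -- lower than the previous row: cc resets to 1
       (3,  '20111103', 4),
       (4,  '20111104', 6),
       (5,  '20111105', 7),
       (6,  '20111106', 9),
       (7,  '20111107', 12),
       (8,  '20111108', 13),
       (9,  '20111109', 15),
       (10, '20111110', 18),
       (11, '20111111', 20),  -- cc reaches 10 here and the recursion stops
       (12, '20111112', 21),
       (13, '20111113', 2);
```

With this data the query should return the rows with PrimaryKey 11 down to 2, because the result is ordered by rn descending. The rows with PrimaryKey 12 and 13 are never visited.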

Version 2: As Martin Smith pointed out in a comment, the above query has really bad performance. The culprit is the first CTE. The version below uses a table variable to hold the ranked rows. The primary key directive on rn creates an index that will be used in the join in the recursive part of the query. Apart from the table variable, this does the same as above.

declare @T table
(
   PrimaryKey int,
   Column1 datetime,
   Column2 int,
   rn int primary key
);

insert into @T
select PrimaryKey,
       Column1,
       Column2,
       row_number() over(order by Column1, PrimaryKey) as rn
from MyTable;

with C as
(
  select PrimaryKey,
         Column1,
         Column2,
         rn,
         1 as cc
  from @T
  where rn = 1
  union all
  select T.PrimaryKey,
         T.Column1,
         T.Column2,
         T.rn,
         case 
           when T.Column2 > C.Column2 then C.cc + 1
           else 1
         end  
  from @T as T
    inner join C
      on T.rn = C.rn + 1
  where C.cc < 10     
)
select top 10 *
from C
where 10 in (select cc from C)
order by rn desc
option (maxrecursion 0)
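
For comparison, the same result can probably be reached without recursion by using window functions. The following is a sketch of that different technique, not one of the versions described above. It assumes SQL Server 2012 or later (for lag and the windowed running sum), and the names Flagged, Grouped, Numbered, IsBreak, grp, pos and len are made up for this example. The performance difference has not been measured here.

```sql
with Flagged as
(
  -- IsBreak = 1 marks a row that starts a new run: the first row, or a
  -- row whose Column2 is not greater than the previous row's Column2.
  select PrimaryKey,
         Column1,
         Column2,
         row_number() over(order by Column1, PrimaryKey) as rn,
         case
           when Column2 > lag(Column2) over(order by Column1, PrimaryKey) then 0
           else 1
         end as IsBreak
  from MyTable
),
Grouped as
(
  -- A running sum of the break flags gives every run its own number.
  select PrimaryKey,
         Column1,
         Column2,
         rn,
         sum(IsBreak) over(order by rn rows unbounded preceding) as grp
  from Flagged
),
Numbered as
(
  -- Position of each row inside its run, and the length of the run.
  select PrimaryKey,
         Column1,
         Column2,
         rn,
         grp,
         row_number() over(partition by grp order by rn) as pos,
         count(*) over(partition by grp) as len
  from Grouped
)
-- First 10 rows of the earliest run that has at least 10 rows.
select PrimaryKey,
       Column1,
       Column2,
       rn
from Numbered
where pos <= 10
  and grp = (select min(grp) from Numbered where len >= 10)
order by rn desc
```

If no run has 10 rows, the subquery returns null and the query returns no rows, which matches the behaviour of the where clause in the recursive versions.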

answered May 23, 2017 at 13:05

Just tried this on a 10,000 row table not containing a sequence of 10 and it took about 5 minutes on my desktop PC. - Martin Smith

@MartinSmith That is not good. I will add a version that uses an intermediate temp table instead of the first CTE. It should be faster. The primary key on rn should help speed things up. If you don't mind, please let me know how it runs for you. - Mikael Eriksson

@MartinSmith Thanks, terminating my test now after 7 minutes :). - Mikael Eriksson
