How to rank results based on which where-clause conditions match (without full-text indexing)

I have a search query that is dynamically compiled using NHibernate's criteria query mechanism. The resulting SQL query might look like:

select 
    *
from
    sometable
where
(
    (
        firstname like 'chris%' or
        lastname like 'chris%'
    )
    and
    (
        firstname like 'vann%' or
        lastname like 'vann%'
    )    
)

The data in the table might look like:

FirstName         LastName
------------------------------
Chris             Smith
John              Vann
Chris             Vann

I'd like to order the results such that a row matching both sub-clauses in the where clause (i.e. firstname = Chris and lastname = Vann) is ranked higher than a row matching only one of the sub-clauses. Is this possible in standard SQL?

Edit: I greatly simplified the question to get down to the guts of the problem.

asked Jan 30 '12 at 20:01

Old question, I know, but that where statement is only going to match with both sub-clauses because of the 'and'. So just Chris Vann in your sample data (and a hypothetical Vann Chris). It sounds like you want Chris to match in either the first or last name OR Vann to match in the first or last name and then rank those assigning a match in both the highest ranking. -
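
A minimal sketch of the reading suggested in the comment above, assuming the intent is to match either term and put rows that match both terms first. The table and column names come from the question; the `matched_terms` alias is hypothetical, and matching 'chris%' against 'Chris' assumes a case-insensitive collation.

```sql
-- Sketch only: the filter joins the two term groups with OR, and
-- matched_terms counts how many of the two search terms the row matched.
select
    firstname,
    lastname,
    case when firstname like 'chris%' or lastname like 'chris%' then 1 else 0 end +
    case when firstname like 'vann%'  or lastname like 'vann%'  then 1 else 0 end
    as matched_terms
from
    sometable
where
    (firstname like 'chris%' or lastname like 'chris%')
    or
    (firstname like 'vann%' or lastname like 'vann%')
order by
    matched_terms desc;
```

Against the sample data in the question, Chris Vann would score 2 and sort first, while Chris Smith and John Vann would each score 1.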

2 Answers

This is only a starting point. You can create a calculated priority column and sort the rows by it. The column indicates how well each row matches. Here is some sample code:

create table #t (f varchar(10), l varchar(10) );

insert into #t values ('aa','ee'),('aa','ii'),('oo','ee');

select 
   *,
   case when f like 'aa%' then 1 else 0 end +
   case when l like 'aa%' then 1 else 0 end +
   case when f like 'ii%' then 1 else 0 end + 
   case when l like 'ii%' then 1 else 0 end 
   as priority
from #t
order by 
   priority desc;

Results:

f  l  priority 
-- -- -------- 
aa ii 2        
aa ee 1        
oo ee 0 

For your schema, it might be something like:

select 
    *,
    case when firstname like 'chris%' and lastname like 'vann%' then 4 else 0 end +
    case when firstname like 'chris%' and lastname not like 'vann%' then 3 else 0 end +
    case when firstname not like 'chris%' and lastname like 'vann%' then 3 else 0 end +
    ...
    as priority
from
    sometable
where
(
    (
        firstname like 'chris%' or
        lastname like 'chris%'
    )
    and
    (
        firstname like 'vann%' or
        lastname like 'vann%'
    )    
)
order by priority desc

answered Jan 31 '12 at 00:01

I thought about that, but that approach involves a whole lot of ad hoc (non-parameterized) SQL statements, something I generally try to avoid. - Chris

@Chris, yes, I know. I posted this approach because you asked for a standard SQL way to do this: "Is this possible in standard SQL?". As I said, "this is only a starting point"; perhaps it can help you reach your final solution. Regards. - dani herrera
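
On the parameterization concern raised in these comments, here is a minimal T-SQL sketch in which the search terms are variables rather than literals spliced into the statement. The names `@term1` and `@term2` are hypothetical; in application code they would be bound as query parameters instead of declared inline. It keeps the question's AND filter and only ranks the "first name, last name" order above the swapped order.

```sql
-- Sketch only: @term1 and @term2 are hypothetical parameter names.
declare @term1 varchar(50) = 'chris';
declare @term2 varchar(50) = 'vann';

select
    firstname,
    lastname,
    case
        -- terms match in the order they were given: highest priority
        when firstname like @term1 + '%' and lastname like @term2 + '%' then 2
        -- terms match in swapped order: lower priority
        when firstname like @term2 + '%' and lastname like @term1 + '%' then 1
        else 0
    end as priority
from
    sometable
where
    (firstname like @term1 + '%' or lastname like @term1 + '%')
    and
    (firstname like @term2 + '%' or lastname like @term2 + '%')
order by
    priority desc;
```

Note that any `%`, `_` or `[` characters inside the terms would still be interpreted as LIKE wildcards, so user input may need escaping before it is bound.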

Here's a T-SQL ranking that I drummed up; it seems to work pretty well.

  • It ranks using the DIFFERENCE function over each search pair (first-first, first-last, last-first, last-last), then adds weight when a substring match of a search term is found in the first or last name, with first-first and last-last matches weighted more heavily.
  • It orders further by those having exact substring matches and then having the earliest matches, then having the smallest difference between the length of the search string and the length of the first/lastname.
  • The weighting factors (* 2, * 4) in TotalRank are arbitrary and just reflect my desire to weight more heavily matches that are first-first and last-last.
  • The SQL below has a lot of extra columns demonstrating the components that go into the TotalRank column. You can obviously remove these.

DECLARE @searchFirst varchar(max) = 'chris';
DECLARE @searchLast varchar(max) = 'vann';

SELECT firstname, lastname,
    SOUNDEX(@searchFirst) as FSearchSoundEx,
    SOUNDEX(firstname) as FSoundEx,
    DIFFERENCE(firstname, @searchFirst) as FDiff,
    LEN(firstName) - LEN(@searchFirst) as FFDelta,

    SOUNDEX(lastname) as LSoundEx,
    SOUNDEX(@searchLast) as LSearchSoundEx,
    DIFFERENCE(lastName, @searchLast) as LDiff,
    LEN(lastName) - LEN(@searchLast) as LLDelta,

    PATINDEX('%' + @searchFirst + '%', firstname) as FFIndex,
    PATINDEX('%' + @searchFirst + '%', lastname) as FLIndex,
    PATINDEX('%' + @searchLast + '%', firstname) as LFIndex,
    PATINDEX('%' + @searchLast + '%', lastname) as LLIndex,

    CONVERT(BIT, PATINDEX('%' + @searchFirst + '%', firstname)) as HasFF,
    CONVERT(BIT, PATINDEX('%' + @searchFirst + '%', lastname)) as HasFL,
    CONVERT(BIT, PATINDEX('%' + @searchLast + '%', firstname)) as HasLF,
    CONVERT(BIT, PATINDEX('%' + @searchLast + '%', lastname)) as HasLL,

    DIFFERENCE(firstname, @searchFirst) * DIFFERENCE(firstname, @searchFirst) as FFDiffSq,
    DIFFERENCE(lastname, @searchFirst) * DIFFERENCE(lastname, @searchFirst) as FLDiffSq,
    DIFFERENCE(firstname, @searchLast) * DIFFERENCE(firstname, @searchLast) as LFDiffSq,
    DIFFERENCE(lastname, @searchLast) * DIFFERENCE(lastname, @searchLast) as LLDiffSq,

    DIFFERENCE(firstname, @searchFirst) * DIFFERENCE(firstname, @searchFirst)
        + DIFFERENCE(lastname, @searchFirst) * DIFFERENCE(lastname, @searchFirst)
        + DIFFERENCE(firstname, @searchLast) * DIFFERENCE(firstname, @searchLast)
        + DIFFERENCE(lastname, @searchLast) * DIFFERENCE(lastname, @searchLast) as SumDiffSquares,

    DIFFERENCE(firstname, @searchFirst) * DIFFERENCE(firstname, @searchFirst) * 2
        + DIFFERENCE(lastname, @searchFirst) * DIFFERENCE(lastname, @searchFirst)
        + DIFFERENCE(firstname, @searchLast) * DIFFERENCE(firstname, @searchLast)
        + DIFFERENCE(lastname, @searchLast) * DIFFERENCE(lastname, @searchLast) * 2
        + CONVERT(BIT, PATINDEX('%' + @searchFirst + '%', firstname)) * 4
        + CONVERT(BIT, PATINDEX('%' + @searchFirst + '%', lastname))
        + CONVERT(BIT, PATINDEX('%' + @searchLast + '%', firstname))
        + CONVERT(BIT, PATINDEX('%' + @searchLast + '%', lastname)) * 4 as TotalRank
FROM Contacts
ORDER BY TotalRank Desc, HasLL Desc, HasFF Desc, HasFL Desc, HasLF Desc, LLIndex, FFIndex, FLIndex, LFIndex, LLDelta, FFDelta
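
As a small illustration of the building blocks in the query above, the sketch below shows how PATINDEX and the BIT conversion behave on a few literal names. The sample strings and column aliases are hypothetical, and the matches assume a case-insensitive collation. DIFFERENCE is not shown with values here; it returns an integer from 0 to 4, where 4 means the SOUNDEX codes of the two strings are most alike.

```sql
-- PATINDEX returns the 1-based position of the first match, or 0 if there is none.
SELECT
    PATINDEX('%vann%', 'Vannatta') as MatchAtStart,  -- 1
    PATINDEX('%vann%', 'Savannah') as MatchInside,   -- 3
    PATINDEX('%vann%', 'Smith')    as NoMatch;       -- 0

-- Converting to BIT collapses any nonzero position to 1, which is how
-- the HasFF/HasFL/HasLF/HasLL flags in the query above are derived.
SELECT
    CONVERT(BIT, PATINDEX('%vann%', 'Savannah')) as HasMatch,  -- 1
    CONVERT(BIT, PATINDEX('%vann%', 'Smith'))    as HasNone;   -- 0
```

This is why the ORDER BY in the answer can sort on the index columns in ascending order as a tie-breaker: among rows that contain the term, a smaller position means an earlier match.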

answered Jan 14 '19 at 04:01
