I've had far too many meetings today, but I think I still have my brainware in place. While trying to improve the performance of a query, I came across the following mystery (table names and fields paraphrased):
SELECT X.ADId
FROM (
    SELECT DISTINCT A.ADId
    FROM P WITH (NOLOCK)
    INNER JOIN A WITH (NOLOCK) ON (P.ID = A.PId)
    INNER JOIN dbo.fn_A(16) AS VD ON (VD.DId = A.ADId)
    LEFT JOIN DPR ON (LDID = A.ADId)
    WHERE ((A.ADId = 1) OR ((HDId IS NOT NULL) AND (HDId = 1)))
      AND (P.PS NOT IN (5, 7))
      AND (A.ASP IN (2, 3))
) X
WHERE (dbo.fn_B(X.ADId, 16) = 1)
As you will see, the contents of the inner query are mostly irrelevant. The whole point initially was that I wanted to avoid calling fn_B() on every record, because the records contained duplicate values for ADId. So I did a SELECT DISTINCT internally and then filtered the distinct records. Sounds reasonable, right?
Here starts the mystery...
The inner query returns NO RECORDS (for the specified parameters). If I comment out the "WHERE fn_B() = 1" condition, the query runs in zero time (and returns no results). If I put it back, the query takes 6-10 seconds, again returning no results.
This seems to defy common sense, or at least MY common SQL sense :-) If the inner query returns no data, then the outer condition should never get evaluated, right?
Of course I took the time to check the actual execution plans, saved them and compared them very carefully. They are 99% identical, with nothing unusual to notice, or so I think.
I fooled around with some CTEs to get the query results in the first CTE, and then pass it to a second CTE that had some conditions guaranteed to filter no records, then evaluate the fn_B() call outside all CTEs, but the behavior was exactly the same.
Other variations, like using the old query (which might call fn_B() multiple times with the same value), behaved the same way. If I remove the condition, I get no records in zero time. If I put it back, I get no records in 10 seconds.
Any ideas, anyone?
Thanks for your time :-)
PS1: I tried to reproduce the situation on tempdb using a simple query but I couldn't make it happen. It only happens on my actual tables.
PS2: This query is called inside another function so putting the results in a temporary table and then further filtering them is also out of the question.
asked May 3 '12 at 17:05
Just as a note, the optimizer does not read a query the same way you do. Even when you think that a certain order should take place, or that short-circuiting might make the most sense, the optimizer still might evaluate CTEs / subqueries in an order you might not expect. A workaround you might try is selecting the first query into a #temp table and then running the function filter on the #temp table. This should force the order of evaluation even if it is completely unintuitive and much less elegant.
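The #temp table idea from this comment could look roughly like the sketch below. It reuses the paraphrased names from the question; the temp table name #Candidates is made up for illustration. Note that this only works in a plain batch or stored procedure, since the question's PS2 says the query lives inside a function, where this approach is not available.

```sql
-- Step 1: materialize the distinct ADId values first.
-- #Candidates is a hypothetical name, not from the original query.
SELECT DISTINCT A.ADId
INTO #Candidates
FROM P WITH (NOLOCK)
INNER JOIN A WITH (NOLOCK) ON (P.ID = A.PId)
INNER JOIN dbo.fn_A(16) AS VD ON (VD.DId = A.ADId)
LEFT JOIN DPR ON (LDID = A.ADId)
WHERE ((A.ADId = 1) OR ((HDId IS NOT NULL) AND (HDId = 1)))
  AND (P.PS NOT IN (5, 7))
  AND (A.ASP IN (2, 3));

-- Step 2: fn_B() can now only be evaluated against rows that
-- actually exist in #Candidates (none, for the parameters in the question).
SELECT C.ADId
FROM #Candidates AS C
WHERE dbo.fn_B(C.ADId, 16) = 1;

DROP TABLE #Candidates;
```

Because the two statements are separate, the optimizer cannot move the fn_B() call into the first query's plan.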
Also, while it may perform slower, I am curious what happens if you run the query without the NOLOCK, or in RCSI instead. Different locking semantics may be tripping up the optimizer.
We submitted the issue to Microsoft support for SQL Server R2 (I must comment on their amazing response times and overall service procedures). We gave them a copy of our DB that reproduces the issue, along with our workaround. They reproduced it themselves, and after a couple of days this is the answer we got back:
I have analyzed both execution plans and would kindly ask if the workaround would be acceptable to use in production? The main reason behind it, is that a function does not have, as indexes have, statistics. And this lack of data makes the optimizer choose sometimes a not so good execution plan. If you already found a workaround it is best to implement this. The index changes we tried did not improve the execution.
This is quite a diplomatic way to say "yeah, the optimizer messes things up with your query, so please use the workaround". If you wanna call it a bug, call it a bug, it doesn't matter.
Just for the record, the workaround was to put the call to fn_B() in the SELECT list of a query one level above the SELECT DISTINCT, and then filter on its result in the WHERE clause of the outermost query. Kind of weird, but it does the trick.
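The exact production query isn't shown here, so the following is only a sketch of the shape of that workaround, using the paraphrased names from the question. The alias Y and the column name BResult are made up for illustration.

```sql
SELECT Y.ADId
FROM (
    -- fn_B() moves from the WHERE clause into the SELECT list,
    -- one level above the SELECT DISTINCT.
    SELECT X.ADId,
           dbo.fn_B(X.ADId, 16) AS BResult  -- BResult is a hypothetical alias
    FROM (
        SELECT DISTINCT A.ADId
        FROM P WITH (NOLOCK)
        INNER JOIN A WITH (NOLOCK) ON (P.ID = A.PId)
        INNER JOIN dbo.fn_A(16) AS VD ON (VD.DId = A.ADId)
        LEFT JOIN DPR ON (LDID = A.ADId)
        WHERE ((A.ADId = 1) OR ((HDId IS NOT NULL) AND (HDId = 1)))
          AND (P.PS NOT IN (5, 7))
          AND (A.ASP IN (2, 3))
    ) X
) Y
-- The outermost query filters on the already computed column.
WHERE Y.BResult = 1;
```

Logically this is the same query as the original; it only changed the plan the optimizer picked in our case, so check the actual execution plan on your own data before relying on it.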