Las tablas asociadas se analizan por completo cuando se hace referencia a ellas con una instrucción 'Incluir' utilizando Entity Framework

I'm having problems with Includes when pulling data out of the entity framework. I posted a very similar question yesterday aquí, which has the same sample I'm about to use. I'm using Entity Framework v4.0.

I have the following simple model, a table containing list of Page Forms (~200). Each form has one or more Fields (~4000 total), and each field has may have some Parameters (~16000 total).

Modelo de entidad

I use the following code to pull data out:

EntityConnection myConnection = new EntityConnection("name=myModel");

if(conn.State != ConnectionState.Open) {
    conn.Open();
}
ObjectContext context = new ObjectContext("name=myModel");
context.ContextOptions.LazyLoadingEnabled = false;

ObjectQuery<PageForm> myObjectSet = context.CreateObjectSet<PageForm>()
                                           .Include("FormFields.FormFieldParameters");

IQueryable<PageForm> myFilteredObjectSet = myObjectSet.Where(c => c.FormID == 1);

List<PageForm> myReturnValue = myFilteredObjectSet.toList();

Which generates the following sql query

SELECT 
[Project1].[FormID] AS [FormID], 
[Project1].[FormName] AS [FormName], 
[Project1].[C2] AS [C1], 
[Project1].[FormID1] AS [FormID1], 
[Project1].[FieldID] AS [FieldID], 
[Project1].[FieldName] AS [FieldName], 
[Project1].[C1] AS [C2], 
[Project1].[FieldParamID] AS [FieldParamID], 
[Project1].[Value] AS [Value], 
[Project1].[FieldID1] AS [FieldID1]
FROM ( 
    SELECT 
    [Extent1].[FormID] AS [FormID], 
    [Extent1].[FormName] AS [FormName], 
    [Join1].[FieldID] AS [FieldID], 
    [Join1].[FieldName] AS [FieldName], 
    [Join1].[FormID] AS [FormID1], 
    [Join1].[FieldParamID] AS [FieldParamID], 
    [Join1].[Value] AS [Value], 
    [Join1].[FieldID1] AS [FieldID1], 
    CASE WHEN ([Join1].[FieldID] IS NULL) THEN CAST(NULL AS int) WHEN ([Join1].[FieldParamID] IS NULL) THEN CAST(NULL AS int) ELSE 1 END AS [C1], 
    CASE WHEN ([Join1].[FieldID] IS NULL) THEN CAST(NULL AS int) ELSE 1 END AS [C2]
    FROM  [dbo].[PageForm] AS [Extent1]
    LEFT OUTER JOIN (
        SELECT 
        [Extent2].[FieldID] AS [FieldID], 
        [Extent2].[FieldName] AS [FieldName], 
        [Extent2].[FormID] AS [FormID], 
        [Extent3].[FieldParamID] AS [FieldParamID], 
        [Extent3].[Value] AS [Value], 
        [Extent3].[FieldID] AS [FieldID1]
        FROM  [dbo].[FormField] AS [Extent2]
        LEFT OUTER JOIN [dbo].[FormFieldParameter] AS [Extent3] ON [Extent2].[FieldID] = [Extent3].[FieldID] 
    ) AS [Join1] ON [Extent1].[FormID] = [Join1].[FormID]
    WHERE 1 = [Extent1].[FormID]
) AS [Project1]
ORDER BY [Project1].[FormID] ASC, [Project1].[C2] ASC, [Project1].[FieldID] ASC, [Project1].[C1] ASC

Now I'm interested in this part of the query:

LEFT OUTER JOIN (
    SELECT 
    /**/
    FROM  [dbo].[FormField] AS [Extent2]
    LEFT OUTER JOIN [dbo].[FormFieldParameter] AS [Extent3] ON [Extent2].[FieldID] = [Extent3].[FieldID] 
) AS [Join1] ON [Extent1].[FormID] = [Join1].[FormID]

This seems to query the entire FormField and FormFieldParameter tables, with no filtering applied. So, what I think is happening, is that using .Include("FormField.FormFieldParameter") is akin to saying - "And return all data from these tables in the resulting query". What I really want is "Return only the data from these tables which relates to the filtered PageForm table".

Is there a way to do this? Sorry if this question sounds too simple, or similar to my previous question, but I'm really having trouble getting my head around the internals of Entity Framework.

EDIT 1:

If I change the above sample to the following which includes a where clause, then the query is orders of magnitude faster (roughly 10x faster).

LEFT OUTER JOIN (
    SELECT 
    /**/
    FROM  [dbo].[FormField] AS [Extent2]
    LEFT OUTER JOIN [dbo].[FormFieldParameter] AS [Extent3] ON [Extent2].[FieldID] = [Extent3].[FieldID] 
    WHERE 1 = [Extent2].[FormID]
) AS [Join1] ON [Extent1].[FormID] = [Join1].[FormID]

EDIT 2:

Some more information. I've figured out that If i add some funky filtering to the initial query, I can force a much more efficient query to be ran.

IQueryable<PageForm> myFilteredObjectSet = myObjectSet
                                          .Where(c => c.FormID == 1)
                                          .Where(a => a.FormFields
                                                       .Where(c => c.FormFieldParameters
                                                                    .Any(d => d.FieldID == c.FieldID))
                                                       .Any(b => b.FormID == 1)
                                           );

This causes the query to filter correctly, and runs much faster. However, I'm certain this isn't the best way go about this, as the nested Where/Any statements very quickly become a nightmare to deal with when you have more than a handful of Includes. There must be a better way?

preguntado el 31 de julio de 12 a las 13:07

That workaround doesn't eager load your relations. Even if you add Include it will still not filter includes because that feature is not supported. -

It appears to be eagerly loading everything ok for me, and the sql profiler I have running only shows a single call being made - which is when I call .ToList() on the Queryable object. -

1 Respuestas

The filtering is applied - your query contains WHERE 1 = [Extent1].[FormID]. It really loads only date you have requested. The query engine in the database will evaluate your query correctly and it will use some optimization techniques to filter records efficiently.

Respondido 31 Jul 12, 13:07

I'm sorry, but I'm sure this is not correct. If I manually update the query to include an inner where clause, the query runs much faster. I'll update my question to show this. - John

Well, it now becomes completely different question because your former question was about returning wrong data and new question is about different performance. Yes EF queries may have much worse performance but the only workaround is not using Linq and EF and writing your own SQL queries. - Ladislav Mrnka

I think you misunderstand - the data that I'm getting back in exactly as expected, in both situations. The issue is that the query generated is selecting from the entire 'FormField' table, when it should only be selected from the table where FormID == 1. I know there is an overhead in using EF, but this is such a large performance hit, that I am sure I must be doing something wrong. - John

See my question for a workaround, which works, but is ugly as sin. I'm sure there must be a better way of doing this. - John

Unfortunately you are not doing anything wrong. You can try to use .NET 4.5 - there can be some performance improvements, report it to EM conectar as a bug, report it to UserVoice de datos as request for improvement or open design discussion on EF CodePlex. You can also download EF source code from codeplex and try to fix it yourselves. - Ladislav Mrnka

No es la respuesta que estás buscando? Examinar otras preguntas etiquetadas or haz tu propia pregunta.