A nice feature of SQL Server's query planner is that if you do not use any columns from a joined table, and the join does not affect the cardinality of the query, it can be eliminated altogether. You can create a view with lots of columns from lots of tables, but if someone selects only a few columns from the view, most of the joined tables don't need to be accessed at all.
The property of "does not affect the cardinality" can be guaranteed by an ordinary join (also called inner join) to another table along a foreign key relationship, or by a left join (also called left outer join) to another table on a unique set of columns. In both cases, the join could be removed without changing the number of rows returned.
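To make the two cases concrete, here is a minimal sketch. The tables dbo.Customer and dbo.CustomerOrder are hypothetical names invented for illustration, not part of my real schema. I am assuming the foreign key is trusted and its column is not nullable, which as far as I know is what the optimizer needs before it will drop the inner join.

```sql
-- Hypothetical tables, for illustration only.
-- Permanent tables are used because foreign keys are not enforced on temp tables.
create table dbo.Customer (
    customer_id int not null primary key,
    name        varchar(50) not null
);

create table dbo.CustomerOrder (
    order_id    int not null primary key,
    customer_id int not null
        constraint fk_order_customer foreign key references dbo.Customer (customer_id)
);
go

-- Case 1: inner join along a foreign key from a non-nullable column.
-- Every order matches exactly one customer, so the join cannot add or remove rows.
select o.order_id
from dbo.CustomerOrder o
join dbo.Customer c
  on c.customer_id = o.customer_id;

-- Case 2: left join on a unique column (here the primary key of Customer).
-- Each order matches at most one customer, and a left join keeps unmatched rows.
select o.order_id
from dbo.CustomerOrder o
left join dbo.Customer c
  on c.customer_id = o.customer_id;
```

Neither query selects a column from dbo.Customer, so in both cases I would expect the plan to touch dbo.CustomerOrder only, though that is worth confirming on your own server and version.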
However, I have a view that left joins to some other views, which are fairly complex. I can see that the left join does not affect the cardinality, because I know that the columns specify at most one row in the underlying view. But the query planner doesn't know that, and drags in lots of tables from the underlying view to the query plan. (For now, I can't simplify the underlying view, make it an indexed view, replace it with a table or whatever.)
The question is more general than just views. For any left join, to a view or table, how can I make sure it can be eliminated from the query if no columns are picked? I have found a solution which I will post as a reply.
1 Answer
The reason the left join cannot be eliminated is that it might increase the number of rows found. (It cannot decrease the number of rows, since if no row is found in the joined table it's taken as a single row of nulls, but it's possible more than one matching row might be found.)
But you can convert it to an outer apply and inside that subquery say select top 1. This guarantees that at most one row will be found. And because an outer apply always returns at least one row (perhaps with all nulls), we have guaranteed a single matching row, so the query's cardinality is unchanged and, if none of its columns are needed, the outer apply could be removed altogether without changing the set of rows returned. I didn't expect the query planner would be smart enough to spot this, but it works!
Let me give an example using just tables.
create table #num (n int not null)
insert into #num values (0), (1), (2), (3), (4), (5), (6), (7), (8), (9)
create unique clustered index idx on #num (n)
select n as e into #even from #num where n % 2 = 0
go
select n
from #num
left join #even
on n = e
We are only using a column from #num. The query joins to #even but does not use any columns from it. As there's no unique index on #even, the query planner doesn't know that the left join won't change the count of rows (there might be a number that appears twice in #even), so it has to include it in the plan.
We could fix that with
create unique clustered index idx on #even (e)
After creating that index, the final query plan is simpler, because the left join can be eliminated.
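If you want to check this without looking at a plan diagram, one option is to turn on I/O statistics and see which tables are actually read. This is only a sketch, and it assumes #num and #even from above still exist in the current session.

```sql
-- Assumes #num and #even from above exist in the current session.
set statistics io on;

select n
from #num
left join #even
on n = e;

set statistics io off;
```

The messages output lists one line per table that was read. If the join has been eliminated, I would expect to see a line for #num but none for a table whose name begins with #even (temp table names are shown with a system-generated suffix).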
But suppose we could not create such an index. Perhaps the values aren't unique in all cases, but we happen to know they will be unique in the particular case of this query. Or perhaps instead of a table we are left joining to a view, where we know there is still a unique key to join on, but the query planner isn't smart enough to look through the view definition and spot this. In that case you can still promise that at most one row will be found, by converting the left join to an outer apply, and using select top 1:
select n
from #num
outer apply (select top 1 e from #even where e = n) e1
This also gives the simple query plan. Of course, if you later add to the select clause and start pulling back the column e, it has to join to #even once more. But the resulting plan is not significantly different from that of an ordinary left join. So by doing this trick with outer apply (select top 1 ...), you can make a view or query fragment that performs as well as possible whether you are pulling back columns from the joined table or not.
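One caveat worth spelling out: the rewrite is only equivalent to the left join if the "at most one match" assumption really holds. The sketch below uses a hypothetical temp table #dup (not part of the example above) that deliberately contains a duplicate, and assumes #num still exists in the session.

```sql
-- Hypothetical table with a duplicate value, to show what happens
-- when the "at most one matching row" assumption is false.
create table #dup (e int not null);
insert into #dup values (2), (2), (4);

-- left join: n = 2 matches two rows, so it appears twice (11 rows in total)
select n
from #num
left join #dup
on n = e;

-- outer apply with top 1: n = 2 appears once (10 rows in total)
select n
from #num
outer apply (select top 1 e from #dup where e = n) d;
```

So the outer apply version silently drops the extra matches instead of multiplying rows. That is exactly what lets the planner remove it, but it also means you should only use it where a single match is what you want.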
(The plan diagrams above are generated using Plan Explorer as they look prettier than the ones from SSMS.)