Archive

You are currently browsing the Log et al – Peter Curd blog archives for February, 2010.

Feb

9

Using IN on concatenated keys in Microsoft SQL Server 2008 is very slow

By pcurd

During the migration of T-SQL code from Microsoft SQL Server 2000 to Microsoft SQL Server 2008 I found a few procedures that were running very slow. They all used the IN keyword to filter records based on the content of a concatenated key (i.e. a key using two or more fields added to each other).

I have written a quick script to test the performance of three different ways to do this (IN, EXISTS and LEFT JOIN) and the difference is staggering.

Using IN on a concatenated key for this test is in the order of 200 times slower.

The basic summary is that the technique:

select * from #performancetesttable tt
where
tt.field1 + tt.field2 not in (select tt2.field1 + tt2.field2 from #performancetesttable2 tt2)

is much much slower on Microsoft SQL Server 2008 than:

select * from #performancetesttable tt
left join #performancetesttable2 tt2 on tt.field1 + tt.field2 = tt2.field1 + tt2.field2
where tt2.field1 is null

even though in Microsoft SQL Server 2000 the performance is about the same.

The hardware difference on the two machines is greatly in the Microsoft SQL Server 2008 favour – being a dual six core machine with 32GB RAM as opposed to dual dual core machine with 4GB of RAM (only 2GB used by Microsoft SQL Server 2000)

Using IN – Microsoft SQL Server 2008 takes 20250 ms.
Using IN – Microsoft SQL Server 2000 takes 106 ms.

Using EXISTS – Microsoft SQL Server 2008 takes 103 ms.
Using EXISTS – Microsoft SQL Server 2000 takes 46 ms.

Using LEFT JOIN – Microsoft SQL Server 2008 takes 23 ms.
Using LEFT JOIN – Microsoft SQL Server 2000 takes 33 ms.

Read more for the SQL script to test this for yourself