I have a table containing 10 years worth of ‘package scans’. Somebody scans a package and it records the date and username. Let’s pretend for now that retaining 10 years of data actually has a purpose.
I have a page to show a summary for the past week, so clearly I only want to read 1 week’s worth of data.
Here’s the query, to be run in SSMS twice, once with a hardcoded recent date and again with an old date in 2013. It’s originally a parameterized query, but in SSMS I’m replacing
@p0 with the date:
SELECT (t0).(VerifyDate), (t0).(PackageId), (t0).(Username) FROM (dbo).(PackageVerification) AS (t0) INNER JOIN (dbo).(Package) AS (t1) ON (t1).(PackageId) = (t0).(PackageId) WHERE ((t1).(PackageStatus) <> 99) AND ((t0).(VerifyDate) > @p0) ORDER BY (t0).(VerifyDate) DESC
Before I execute it, I’d like to introduce my date index.
Now my date index is not on my
PackageVerification table, but instead is on a ‘helper view’ which performs the same join seen above. The query above is able to magically use this indexed view because I have SCHEMABINDING enabled.
CREATE NONCLUSTERED INDEX (IX_Helper_PackageVerification_USER_SCAN_HISTORY) ON (dbo).(Helper_PackageVerification) ( (VerifyDate) DESC, (PackageStatus) ASC ) INCLUDE ( (VerifyDateDate), (Username) )
When I run the query in SSMS with an old and new date it uses scan or seek as expected. The threshold seems to be somewhere around 2015. So anything remotely recent should definitely be using a seek. Here’s the results of that:
When I run it as a parameterized query from my application I always get a full scan, which for some reason uses a parallelized plan.
At least it’s using my helper index.
I’m actually not sure why I don’t get parameter sniffing for this. I always pass a very recent date so I would have thought it may have preferred a scan but I’m fine with it choosing the above plan given the circumstances. There’s a million+ rows and it takes about 150ms.
Incidentally this is a SQL Azure database with 2vCores. Parameter sniffing is enabled and parameterization is set to simple.
If I change the query and run my application using
OPTION (RECOMPILE) I do get the desired SEEK and a very good performance of just a few ms. The recompile time seems to be negligible and frankly this is perfectly fine performance I can use.
When I look in query store I can verify OPTION RECOMPILE uses the seek for a recent date, and scan for an old date! Awesome.
However, and I’ve never tried this before – I thought how about improving it even further with
OPTION (OPTIMIZE FOR @p0 = '4/1/2021').
I was expecting this to also use the seek, but without the need for recompilation every time. I’d just periodically change the date passed to OPTIMIZE FOR – maybe to the beginning of the previous month.
However, this is the query in the query store.
And it goes and does a full scan of all 1+ million rows when setting the date parameter to 4/7/21!
So now I’m lost. I’ve tried to read about everything I can on the subject but haven’t come across this issue. RECOMPILE works, but OPTIMIZE FOR doesn’t seem to do anything when I’m expecting it to effectively simulate running the query in SSMS with hardcoded values.