Actual Difference Between EXISTS(SELECT 1 …), EXISTS(SELECT * …), and EXISTS(SELECT column …)

(last updated: 2020-01-25 @ 12:45 EST / 2020-01-25 @ 17:45 UTC )

Many of us SQL Server professionals are aware that it’s a “best practice” to avoid "SELECT * ..." queries. If you were not aware of this before, you are now (if you want to better understand why, there are plenty of articles / posts out there explaining it). Within the context of this advice there is a debate as to whether or not "SELECT * ..." should be avoided in an EXISTS operator. The theory is that the EXISTS operator is looking at rows instead of columns and isn’t bring back any data from any column(s), so here it should be ok to use "SELECT * ...". In trying to avoid the use of "*", people get creative and instead specify expressions such as:

  • 1
  • NULL
  • TOP 1 1 (this one seems like an odd choice given that the benefit of the EXISTS operator is that it stops processing at the first row it encounters)
  • {column_name}
  • others…

Several (many?) others have written about how the evidence seems to indicate that what you specify in the SELECT list within the EXISTS operator is not truly evaluated, so using "*" should be fine, or even preferred. Two such articles are:

Both of those authors do an excellent job showing that the EXISTS operator ignores the items in the SELECT list:

  1. They both show that the execution plans look the same.
  2. Brad also shows that invalid expressions (i.e. expressions that should generate an error, such as "1 / 0") do not throw any errors.
  3. Viacheslav also compares the performance between the various options and finds no difference among them.

In the past, I myself have preferred to use the “divide-by-zero” expression "1 / 0" as proof that there is no reason to not use "*". But, revisiting the issue now, it occurs to me that all of these proofs are really only inferring that "*" and "1" (and "TOP 1", "1 / 0", etc) are being handled the same way by SQL Server. To be fair, seeing that the execution plans look the same is pretty convincing, and is really close to being direct proof. But, what if two execution plans look the same, but have slight differences that are not visible in the graphical execution plans? I’m not sure how possible or likely that is.

Thankfully, we don’t need to wonder about any of this. We can get SQL Server to tell us, directly, that all of these variations are indeed the same, as far as it’s concerned.

What can SQL Server show us?

Methodology

How do we know whether or not the execution plans are truly the same? By inspecting the execution plan XML.

The execution plan XML has the following structure:

<ShowPlanXML>
  <BatchSequence>
    <Batch>
      <Statements>
        <StmtSimple ...
           ...  >

Within the <StmtSimple> element / node, there are several attributes, but the two we are interested in are:

  • QueryHash
  • QueryPlanHash

If two queries have the same QueryHash, then they are considered to be the exact same query. The expectation is that since each variation is slightly different text, each test should have a different value for QueryHash (a query’s hash is based on the actual bytes of the query text, hence even a difference in casing — "A" vs "a" — would result in a different hash value).

Likewise, if two execution plans have the same QueryPlanHash value, then they are considered to be the exact same execution plan.

Make Testing a Little Easier

In order to view the execution plan XML, after each execution we will need to:

  1. click the “Execution plan” tab
  2. right-click in the “Execution plan” area (to bring up the context menu)
  3. move the mouse cursor to “Show Execution Plan XML” and click on it

Not complicated, but doing this repeatedly over many tests gets a little but annoying. Fortunately, we can create a shortcut key for “Show Execution Plan XML” such that we will only need to click on the “Execution plan” tab (i.e. Step 1) and then hit the shortcut key. It would be nice to also skip Step 1, but there doesn’t seem to be a way to do that. For general instructions on how to create the shortcut key, please see my post:

SSMS Tip #2: Shortcut Keys for SQLCMD Mode, Open Containing Folder, Copy Full Path, etc

For this particular shortcut:

  • the command to find is: OtherContextMenus.ExecutionPlanContext.ShowExecutionPlanXML
  • create the shortcut either in “Global” (the default), or in “SQL Query Editor” (if you want to use the same shortcut in another context to do something else)
  • shortcut keys that I used (was previously unused): Control +

Are We Going to Test Anything or What?

Ok, fine. Let’s get on with it.

Initial Setup

USE [tempdb];

-- DROP TABLE dbo.[Nuthin];
CREATE TABLE dbo.Nuthin ([a] INT, [b] INT);
INSERT INTO dbo.Nuthin ([a], [b]) VALUES (999, -999);

We added a row to the table to make sure that the query optimizer doesn’t skip anything as an optimization if it knows that there are no rows and hence the EXISTS is guaranteed to return false.

In order to view the execution plans, you must enable “Include Actual Execution Plan”:

  • right-click in the query editor and select “Include Actual Execution Plan” from the context menu

    OR

  • type Control + M

Testing, Part 1

The first test is a commonly used proof that the SELECT list is not evaluated as it should produce a “divide-by-zero” error as shown here:

SELECT 1/0 FROM sys.objects;
/*
Msg 8134, Level 16, State 1, Line XXXXX
Divide by zero error encountered.
*/

Yet, as you will see, no error is produced. For each of the tests, I viewed the execution plan XML, copied the QueryHash and QueryPlanHash values, and pasted them below the query. In each case, the QueryHash value is different (as expected), but the QueryPlanHash value is the same. This proves that there is no functional difference between any of these options (meaning, "SELECT 1 ..." is not better than "SELECT * ..." in this context because they are the same, as far as SQL Server is concerned, and that’s the only opinion that matters here).

SELECT 'bob' WHERE EXISTS (SELECT 1/0 FROM dbo.Nuthin);
-- QueryHash="0xABC20B0FB2DF9C3E" QueryPlanHash="0xA4D6E870A12967AA"

SELECT 'bob' WHERE EXISTS (SELECT * FROM dbo.Nuthin);
-- QueryHash="0x91691B2FCD26DBD9" QueryPlanHash="0xA4D6E870A12967AA"

SELECT 'bob' WHERE EXISTS (SELECT TOP 1 1 FROM dbo.Nuthin);
-- QueryHash="0xFA5DEF9D8794F783" QueryPlanHash="0xA4D6E870A12967AA"

SELECT 'bob' WHERE EXISTS (SELECT (SELECT 1) FROM dbo.Nuthin);
-- QueryHash="0xAC803197792327AF" QueryPlanHash="0xA4D6E870A12967AA"

SELECT 'bob'
WHERE EXISTS (SELECT CASE
                       WHEN DATEDIFF(DAY, GETDATE(), '2020-01-22') > 3
                         THEN 22
                       ELSE 33
                     END FROM dbo.Nuthin);
-- QueryHash="0xA679E8825CC576BD" QueryPlanHash="0xA4D6E870A12967AA"

Does the "*" get expanded into the column list?

SELECT 'bob'
WHERE EXISTS (
     SELECT *
     FROM (SELECT *, 1/0 AS [nope]
           FROM   dbo.Nuthin) y
          );
-- QueryHash="0x26C7EEF1114519EC" QueryPlanHash="0xA4D6E870A12967AA"

The query above indicates that even if "*" is being expanded, it’s not being done in such a way that evaluates any of the data or values for the columns.

Is the column list evaluated or verified in any way? Based on the query below, we can see that if a column is referenced, it must at least exist:

SELECT 'bob' WHERE EXISTS (SELECT [c] FROM dbo.[Nuthin]);
/*
Data Provider: Msg 207, Level 16, State 1, Line XXXXX
Invalid column name 'c'
*/

Permissions?

Richard.Bielawski, in a forum post related to Viacheslav’s article, said:

If you don’t have access to every single column in the table then any value except a specific column that you do have access to will cause a security failure.
 
All generic values such as *, 1 or null cause all columns to be checked. If you name a specific column, that’s the only column that will be checked.

Is that true? To test, we will create a restricted User that can SELECT from the Table, but is denied access to one particular column.

CREATE USER [Denied] WITHOUT LOGIN;
GRANT SELECT ON dbo.[Nuthin] TO [Denied];
DENY SELECT ON dbo.[Nuthin]([b]) TO [Denied];

The restricted User cannot view execution plans, so disable those by typing Control + M one more time.

We then impersonate that restricted User:

EXECUTE AS USER = N'Denied';
SELECT CURRENT_USER AS [User];
-- Denied

We verify that the restricted User cannot SELECT column "[b]", but does have permission to read column "[a]":

SELECT * FROM dbo.[Nuthin];
SELECT 1 FROM dbo.[Nuthin];
SELECT NULL FROM dbo.[Nuthin];
/*
Msg 230, Level 14, State 1, Line XXXXX
The SELECT permission was denied on the column 'b' of the object 'Nuthin',
     database 'tempdb', schema 'dbo'.
*/
SELECT [a] FROM dbo.[Nuthin];
-- 999  (success)

Great. Now we see if column permissions are checked or not:

SELECT 'bob' WHERE EXISTS (SELECT * FROM dbo.[Nuthin]);
SELECT 'bob' WHERE EXISTS (SELECT NULL FROM dbo.[Nuthin]);
SELECT 'bob' WHERE EXISTS (SELECT 1 FROM dbo.[Nuthin]);
SELECT 'bob' WHERE EXISTS (SELECT 1/0 FROM dbo.[Nuthin]);
SELECT 'bob' WHERE EXISTS (SELECT [b] FROM dbo.[Nuthin]);
/*
Msg 230, Level 14, State 1, Line XXXXX
The SELECT permission was denied on the column 'b' of the object 'Nuthin',
  database 'tempdb', schema 'dbo'.
*/

SELECT 'bob' WHERE EXISTS (SELECT [a] FROM dbo.[Nuthin]);
-- bob  (success)

The queries shown above prove that permissions are indeed being checked for all columns when no columns are referenced.

BUT, there’s more to the story. It seems that the generic options do still work, just so long as you reference a column that the User has permissions to somewhere in the query:

SELECT 'bob' WHERE EXISTS (SELECT *, [a] FROM dbo.[Nuthin]);
SELECT 'bob' WHERE EXISTS (SELECT * FROM dbo.[Nuthin] WHERE [a] = 0);

SELECT 'bob' WHERE EXISTS (SELECT NULL, [a] FROM dbo.[Nuthin]);
SELECT 'bob' WHERE EXISTS (SELECT NULL FROM dbo.[Nuthin] WHERE [a] = 0);

SELECT 'bob' WHERE EXISTS (SELECT 1, [a] FROM dbo.[Nuthin]);
SELECT 'bob' WHERE EXISTS (SELECT 1 FROM dbo.[Nuthin] WHERE [a] = 0);

SELECT 'bob' WHERE EXISTS (SELECT 1 / 0, [a] FROM dbo.[Nuthin]);
SELECT 'bob' WHERE EXISTS (SELECT 1 / 0 FROM dbo.[Nuthin] WHERE [a] = 0);

And, just to be clear, referencing the restricted column in the SELECT list, even if a permitted column is also referenced in the subquery, will raise an error:

SELECT 'bob' WHERE EXISTS (SELECT [b], [a] FROM dbo.[Nuthin]);
SELECT 'bob' WHERE EXISTS (SELECT [b] FROM dbo.[Nuthin] WHERE [a] = 0);
/*
Msg 230, Level 14, State 1, Line XXXXX
The SELECT permission was denied on the column 'b' of the object 'Nuthin',
  database 'tempdb', schema 'dbo'.
*/

Hence, while it is true that column permissions are sometimes checked and can be a problem, it is fairly easy to avoid. And, on a practical level, how often do the queries being used within an EXISTS operator not reference any columns at all? Meaning, this is definitely something to be aware of, but possibly not something that most people will encounter.

Please note that this behavior is not specific to the EXISTS operator given that the following two statements succeed:

SELECT 1, [a] FROM dbo.[Nuthin];
SELECT 1 FROM dbo.[Nuthin] WHERE [a] = 0
SELECT NULL, [a] FROM dbo.[Nuthin];
SELECT NULL FROM dbo.[Nuthin] WHERE [a] = 0

Now we need to stop impersonating the restricted User:

REVERT;
SELECT CURRENT_USER AS [User];
-- dbo

One More Thing

We are no longer impersonating the restricted User, so we can re-enable retrieving actual execution plans by typing Control + M.

We’re almost done, but I did notice something else in a few of the execution plans that was interesting. Remember how the QueryHash value will be different if even a single byte of the query text is different? Well, as you can see below, those are definitely three different queries, yet they all have the exact same QueryHash value:

SELECT 'bob' WHERE EXISTS (SELECT [a] FROM dbo.[Nuthin]);
-- StatementText="SELECT 'bob' WHERE EXISTS (SELECT [a] FROM dbo.[Nuthin])"
--       QueryHash="0x17771D953AFFBFA7" QueryPlanHash="0xA4D6E870A12967AA"

SELECT 'bob' WHERE EXISTS (SELECT [b] FROM dbo.[Nuthin]);
-- StatementText="SELECT 'bob' WHERE EXISTS (SELECT [b] FROM dbo.[Nuthin])"
--       QueryHash="0x17771D953AFFBFA7" QueryPlanHash="0xA4D6E870A12967AA"

SELECT 'bob' WHERE EXISTS (SELECT [a], [b] FROM dbo.[Nuthin]);
-- StatementText="SELECT 'bob' WHERE EXISTS (SELECT [a], [b]
--                FROM dbo.[Nuthin])"
--       QueryHash="0x17771D953AFFBFA7" QueryPlanHash="0xA4D6E870A12967AA"

Now that’s not something you see every day 😺.

Final Thought

My preference is to use "SELECT *" as it makes debugging easier. Assumming there are no variables being used in the EXISTS subquery, I can highlight the entire subquery within the "EXISTS(...)" and hit F5 or Control + E to execute it to see what, if anything, is returned. If it’s a "1" or some other constant then, if any rows are returned, they are all of that constant, which doesn’t tell me nearly as much as the actual column values. And, if it’s an invalid expression, then it doesn’t work at all and can’t even clue me into whether or not rows are returned.

7 thoughts on “Actual Difference Between EXISTS(SELECT 1 …), EXISTS(SELECT * …), and EXISTS(SELECT column …)”

  1. Nice writing, thanks.

    The only, but important news for me was about column security checking.

    BTW, I think for an EXISTS subquery the security on the columns should not be checked if the column is referenced (either implicitly by * or explicitly) only in the SELECT list as it is not sent back to the caller. So SELECT * should work if there’s just a single column I have SELECT permission on. When the column is referenced in a WHERE clause, that’s another story, there checking the security on the column is a requirement, I agree.

    1. Hi there, and thanks. Regarding the idea of SELECT * (or, I suppose, even SELECT {constant} not checking permissions on any columns no matter what: that would be more consistent for how EXISTS subqueries typically work, and would lead to less confusion. For clarity, I added some queries to the post to show that a) any reference of the restricted column causes an error, and b) the permissions behavior works the same even outside of an EXISTS operator (except for SELECT * as that does try to read from the restricted column). Still, if you feel strongly that SELECT * and SELECT {constant} shouldn’t check the column permissions when no permitted column is present in the query, or even that SELECT {restrictedColumn} should not error, you can submit it as a bug to https://feedback.azure.com/forums/908035-sql-server

  2. Awesome as always Solomon. I am glad to see you take on the column level permission aspect of this. Interesting that it checks column permissions in this situation.

    1. Thanks, Sean! And yes, I was a bit surprised about the column-level permissions checking behavior as well, but felt a little better after seeing that it was the same behavior outside of the EXISTS(), so there’s at least some consistency with that particular behavior, even if it’s inconsistent with how everything else works within an EXISTS().

  3. Great post! Since it’s been a couple years, you may have learned this by now. But a cool trick you can use is SET STATISTICS XML ON. This will output the actual execution plan after every statement as a dataset in XML. Much easier than all the clicking.

    1. Hey Chad. Thank you for that compliment and feedback. I’m glad you like the post. No, I actually had not thought of using SET STATISTICS XML ON;, though that is an interesting idea. I did try it out and perhaps I’m missing something, but I don’t see how that approach saves any clicks when doing the research. Clicking on the returned XML execution plan opens up a graphical execution plan, not the formatted XML. So, when returning the “actual execution plan” (as I described in the post), you need to click the “Execution plan” tab, whereas when using SET STATISTICS XML ON; (as you suggested) you need to click on the returned XML. Both actions take you to the same place: the graphical execution plan. Hence, they seem fairly equivalent, though it now gives me an idea for a SQLCLR stored procedure to execute a query batch and capture the XML execution plans (just as SSMS is currently doing when using “Include Actual Execution Plan”) and return them as a single result set that can be captured, or at least manipulated to extract whatever one wants from the XML (e.g. using .value() ).

      Take care, Solomon…

      P.S. If I am missing something and your approach truly is fewer clicks, please do let me know!

  4. Great article as always, Solomon. But I think the information about the query hash is not correct. Handles are supposed to be unique but hashes group similar queries together.

Leave a Reply to zikatoCancel reply