In our last article we said that in SQL queries, IN and EXISTS are interchangeable. Many of us assume therefore, that NOT IN and NOT EXISTS are also interchangeable.
A big mistake.
See how NOT IN and NOT EXISTS behave differently in this small example.
Writing a query to find the number of employees in emp table who are not managers. The logic used is: get count where the employee id is not present in as mgr id in the same table.
The NOT IN version:
1
2
3
4
5
6
7
8
9
10
11
12
| SQL> -- Using NOT IN SQL> -- Count employees who SQL> -- are not managers SQL> select count (*) 2 from emp e1 3 where e1.empno not in 4 ( select e2.mgr 5 from emp e2); COUNT (*) ---------- 0 |
OK, that says there are no such employees. Let’s try it with NOT EXISTS now.
The NOT EXISTS version:
1
2
3
4
5
6
7
8
9
10
11
12
13
| SQL> -- Using NOT EXISTS SQL> -- Count employees who SQL> -- are not managers SQL> select count (*) 2 from emp e1 3 where not exists 4 ( select 'Y' 5 from emp e2 6 where e2.mgr = e1.empno) COUNT (*) ---------- 8 |
Whoa! 8 such employees! Which answer is correct?
Let’s verify with the help of a flag against each employee – yes_mgr or no_mgr. We calculate this flag with help of the CASE statement.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
| SQL> select e.empno 2 , e.mgr 3 , case when 4 ( select 'Y' from emp m 5 where m.mgr = e.empno 6 and rownum = 1) is null 7 then 'no_mgr' 8 else 'yes_mgr' 9 end is_mgr 10 from emp e; EMPNO MGR IS_MGR ---------- ---------- ------- 7369 7902 no_mgr 7499 7698 no_mgr 7521 7698 no_mgr 7566 7839 yes_mgr 7654 7698 no_mgr 7698 7839 yes_mgr 7782 7839 yes_mgr 7788 7566 yes_mgr 7839 yes_mgr 7844 7698 no_mgr 7876 7788 no_mgr 7900 7698 no_mgr 7902 7566 yes_mgr 7934 7782 no_mgr 14 rows selected. |
It turns out that there actually are eight employees who are flagged no_mgr, so the NOT EXISTS version is all right.
Why does NOT IN give a “wrong” result?!
Well, it isn’t a wrong result, that’s how it is supposed to work! The column ‘mgr’ in the emp table is nullable. As per Oracle’s treatment of NULLs:
- where 3 in (3, null) is TRUE as 3 is in (3).
- where 3 not in (1, null) is UNKNOWN as it is not known whether 3 is in null or not.
So, the NOT IN condition does not equate to true if the value being matched has nulls.
Conclusion
- NOT IN and NOT EXISTS are not identical if the subquery can potentially return nulls.
- Use NOT IN only if you are 100% certain that the subquery will not return NULLs.
No comments:
Post a Comment