Saturday, October 26, 2013

The Special DUAL Table

DUAL is a special one-row, one-column table in Oracle’s data dictionary.

Of what use is DUAL?

DUAL comes in handy when you want to select just one row through a query. Oracle SQL structure requires you to have a FROM <table> clause, but some queries don’t need a table – if you want to know the current system date, for example, or the answer for (3+1)*5. DUAL is useful for queries you’d write for such cases:
1
select user from dual;
1
select (3+1)*5 from dual;

But why DUAL? Won’t any table with a rownum < 2 filter work equally well?

Yes, why not simply:
1
select user from scott.emp where rownum < 2;
This query gives you the same result but DUAL has an edge over any other table in the query – the Oracle optimizer recognizes DUAL as a special tableand prepares the best execution plan for it.

What will happen if I insert more rows or delete from DUAL?

The Oracle optimizer trusts DUAL to have exactly one row – no more, no less. If you mess up its data you can lead to unpredictable behavior in the database. Don’t try it!

If DUAL is supposed to have only one row, why is it called DUAL? Why not SINGLE?!

According to this old article on the history of Oracle’s DUAL table, DUAL was originally not meant to be seen itself but instead used inside a view that was expected to be queried. The idea was that you could do a JOIN to the DUAL table and create two rows in the result for every one row in your table. In that context, the name DUAL seemed fine.

How to find duplicate records in a table

duplicate-recordsA usual situation we come across in development: a table missed a unique key and allowed duplicate rows to be entered unchecked.  Now we want to find and delete those duplicates.
In this article, we’ll see how to achieve this.
Consider the table dup_emp with these values:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
SQL> select *
  from dup_emp
  order by empno;
 
     EMPNO ENAME                 SAL
---------- -------------- ----------
         1 Adam                  400
         2 Sandy                 300
         2 Ted                   450
         3 Mark                  450
         4 Alan                  450
         4 Carol                 200
         4 Peter                 250
         5 David                 350
 
8 rows selected.
As you can see, there are two entries for empno 2 and three entries for empno 4. We want each empno to correspond to a single employee only.
The query below finds all duplicate records i.e. all empnos with more than one entry in the table:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
SQL> select *
  from
  3  (select d.*
  4        , count(*) over
  5          (partition by empno) cnt
  6   from dup_emp d
  7  )
  where cnt > 1;
 
     EMPNO ENAME   SAL        CNT
---------- ------ ---- ----------
         2 Sandy   300          2
         2 Ted     450          2
         4 Alan    450          3
         4 Carol   200          3
         4 Peter   250          3

Deleting the duplicate records

The usual need is to delete all but one such row. The first thing is to know the deciding factor -out of the duplicates, which one is to be retained?
The next SQL retains the row with the minimum rowid:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
SQL> delete from dup_emp a
  where rowid >
  3    (select min(rowid)
  4     from dup_emp b
  5     where a.empno = b.empno);
 
3 rows deleted.
 
SQL> select *
  from dup_emp;
 
     EMPNO ENAME   SAL
---------- ------ ----
         1 Adam    400
         2 Sandy   300
         3 Mark    450
         4 Alan    450
         5 David   350
Change the WHERE condition according to your needs.