In this article, we’ll see how to achieve this.
Consider the table dup_emp with these values:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
| SQL> select * 2 from dup_emp 3 order by empno; EMPNO ENAME SAL ---------- -------------- ---------- 1 Adam 400 2 Sandy 300 2 Ted 450 3 Mark 450 4 Alan 450 4 Carol 200 4 Peter 250 5 David 350 8 rows selected. |
As you can see, there are two entries for empno 2 and three entries for empno 4. We want each empno to correspond to a single employee only.
The query below finds all duplicate records i.e. all empnos with more than one entry in the table:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
| SQL> select * 2 from 3 ( select d.* 4 , count (*) over 5 (partition by empno) cnt 6 from dup_emp d 7 ) 8 where cnt > 1; EMPNO ENAME SAL CNT ---------- ------ ---- ---------- 2 Sandy 300 2 2 Ted 450 2 4 Alan 450 3 4 Carol 200 3 4 Peter 250 3 |
Deleting the duplicate records
The usual need is to delete all but one such row. The first thing is to know the deciding factor -out of the duplicates, which one is to be retained?
The next SQL retains the row with the minimum rowid:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
| SQL> delete from dup_emp a 2 where rowid > 3 ( select min (rowid) 4 from dup_emp b 5 where a.empno = b.empno); 3 rows deleted. SQL> select * 2 from dup_emp; EMPNO ENAME SAL ---------- ------ ---- 1 Adam 400 2 Sandy 300 3 Mark 450 4 Alan 450 5 David 350 |
Change the WHERE condition according to your needs.
This comment has been removed by the author.
ReplyDeleteThanks for the post :-) This Qs was asked to me in interview
ReplyDelete