Removing a server: removing duplicate rows

Wednesday, March 7, 2012

removing duplicate rows

I am not able to figure out a script that removes duplicates if for the same
SSNs, the startdate and a code is the same.
sample data
SSN startdate code
123456789 11-12-1980 2540
456789123 14-5-1965 1236
123456789 02-09-1980 7456
789456132 06-07-1950 2589
123456789 11-12-1980 2540
What I want deleted is here is :
123456789 11-12-1980 2540 (just one of the two records)
Any help would be great
thanksAs long as you don=B4t have any identifier which can differ between the
rows there are onl workarounds for that:
1=2E Include a identifier to differ (like an identity column), then use
the follwing code:
DELETE YourTable FROM YourTable T
INNER JOIN
(
SELECT identifiercolumn from YourTable
GROUP BY SSN,startdate,code
HAVING COUNT(*) > 1
) SuQUery
ON SubQuery.identifiercolumn =3D T.identifiercolumn
2=2E (Preferable one) Copy one of the duples to a temp table, delete all
duplicates and reinsert the data from the temp table.
3=2E Avoid duplicates !
HTH, Jens Suessmeyer.|||Jens
There is a sequence number in the table that is unique. But It will not
matter which one we delete I would prefer the number that has a greater valu
e
(as in the seqnumber). Also I am pretty new to programming in SQL so if
possible can u actually show me how to crerate an identifier Copy one of the
duples to a temp table, delete all duplicates and reinsert the data from the
temp table.And avoid duplicates !
Thanks for your help
Amit
"Jens" wrote:

> As long as you don′t have any identifier which can differ between the
> rows there are onl workarounds for that:
> 1. Include a identifier to differ (like an identity column), then use
> the follwing code:
> DELETE YourTable FROM YourTable T
> INNER JOIN
> (
> SELECT identifiercolumn from YourTable
> GROUP BY SSN,startdate,code
> HAVING COUNT(*) > 1
> ) SuQUery
> ON SubQuery.identifiercolumn = T.identifiercolumn
> 2. (Preferable one) Copy one of the duples to a temp table, delete all
> duplicates and reinsert the data from the temp table.
> 3. Avoid duplicates !
> HTH, Jens Suessmeyer.
>|||Hi There,
Having a table with all column values identical for two or more rows
means that there are no constraints . Are you trying to manage history
and usable data in the same table?
I Think this query might help you.
Delete from YourTableName Where RowID NOT IN (Select MIN(ROWID) From
YourTableName Group By FieldList,......)
The group by will have all the fields except ROWID (or Unique
identifier)
I still say Aviod duplicates and have constraints on the table . Manage
history and usable data in different table (if this is the case).
With Warm regards
Jatinder SIngh

Wednesday, March 7, 2012

removing duplicate rows

No comments:

Post a Comment

Removing a server

Blog Archive

About Me