Wednesday, March 7, 2012

removing duplicate rows

Hi,

Please give the DML to SELECT the rows avoiding the duplicate rows. Since there is a text column in the table, I couldn't use aggregate function, group by (OR) DISTINCT for processing.

Table :

create table test(col1 int, col2 text)
go
insert into test values(1, 'abc')
go
insert into test values(2, 'abc')
go
insert into test values(2, 'abc')
go
insert into test values(4, 'dbc')
go

Please advise,

Thanks,
MiraJTry converting the columns to, for instance, varchar by using CAST() and then concatenating them to one string. Before going on with DISTINCT or whatever needed.|||I want the output to be in the same datatype. Hence casting will not solve my purpose.|||nabucco's suggestion works:

select cast(col2 as text) col2
from (
select distinct cast(col2 AS varchar(100)) col2 from test
) x

col2
--
abc
dbc|||nabucco's suggestion works:

select cast(col2 as text) col2
from (
select distinct cast(col2 AS varchar(100)) col2 from test
) x

col2
--
abc
dbc

How will u solve the issue if the text data is bigger than 8000 ?|||Just a thought ...

Can use substring and then concatenate every 8000 characters. Will it work !|||I'm afraid not. You can not concatenate text datatype

create table test(col1 int, col2 text)
go
insert into test values(1, 'abc')
go
select col2 + col2 from test

(1 row(s) affected)

Server: Msg 403, Level 16, State 1, Line 2
Invalid operator for data type. Operator equals add, type equals text.|||Go thru this sample script,it may help u.

Use Northwind
go
select * into t from Employees
go
--inserting duplicate records----
insert into t
(
LastName,
FirstName,
Title,
TitleOfCourtesy,
BirthDate,
HireDate,
Address,
City,
Region,
PostalCode,
Country,
HomePhone,
Extension,
Photo,
Notes,
ReportsTo,
PhotoPath
)
select
LastName,
FirstName,
Title,
TitleOfCourtesy,
BirthDate,
HireDate,
Address,
City,
Region,
PostalCode,
Country,
HomePhone,
Extension,
Photo,
Notes,
ReportsTo,
PhotoPath
from t where EmployeeID in (2,1,4,5,8)

go
--------delete duplicate------
-- Use this query to findout the maximum lenth in text field.here maximum is 21000 bytes, so I used substring function up to that level(eg:substring(Photo,16001,8000)) in query.

--SELECT max(datalength(Photo)) FROM Employees
while (0=0)
begin
delete t from t t1 inner join

(
select max(E1ID) as E1ID,max(E2ID) as E2ID,sumvalue from
(
SELECT
E1.EmployeeID as E1ID
,E2.EmployeeID as E2ID, (E1.EmployeeID +E2.EmployeeID) as sumvalue
FROM
t E1 INNER JOIN t E2 ON
substring(E1.Photo, 1, 8000) = substring(E2.Photo, 1, 8000)
AND substring(E1.Photo, 8001, 8000) = substring(E2.Photo, 8001, 8000)
AND substring(E1.Photo, 16001, 8000) = substring(E2.Photo, 16001, 8000)
) as tm
group by sumvalue
having count(*)>=2
) tm1
on tm1.E1ID=t1.EmployeeID
if @.@.rowcount=0 break
end
go
--Unique records----
select * from t

No comments:

Post a Comment