Showing posts with label dml. Show all posts
Showing posts with label dml. Show all posts

Wednesday, March 7, 2012

removing duplicate rows

Hi,
Please give the DML to SELECT the rows avoiding the duplicate rows. Since there is a text column in the table, I couldn't use aggreate function, group by (OR) DISTINCT for processing.
Table :
create table test(col1 int, col2 text)
go
insert into test values(1, 'abc')
go
insert into test values(2, 'abc')
go
insert into test values(2, 'abc')
go
insert into test values(4, 'dbc')
go
Please advise,
Thanks,
Smithanot very efficient - and prone to possible truncation of col2 -

select distinct col1, cast(col2 as varchar(8000)) from test|||Thanks. I need the output to be with the same datatype, since I need to create temp tables using the selected data(using SELECT INTO)|||again not very efficient:

select temp.col1, cast(temp.newcol2 as text) col2
into newtable
from
(select distinct col1, cast(col2 as varchar(8000)) newcol2 from test) temp

removing duplicate rows

Hi,

Please give the DML to SELECT the rows avoiding the duplicate rows. Since there is a text column in the table, I couldn't use aggregate function, group by (OR) DISTINCT for processing.

Table :

create table test(col1 int, col2 text)
go
insert into test values(1, 'abc')
go
insert into test values(2, 'abc')
go
insert into test values(2, 'abc')
go
insert into test values(4, 'dbc')
go

Please advise,

Thanks,
MiraJTry converting the columns to, for instance, varchar by using CAST() and then concatenating them to one string. Before going on with DISTINCT or whatever needed.|||I want the output to be in the same datatype. Hence casting will not solve my purpose.|||nabucco's suggestion works:

select cast(col2 as text) col2
from (
select distinct cast(col2 AS varchar(100)) col2 from test
) x

col2
--
abc
dbc|||nabucco's suggestion works:

select cast(col2 as text) col2
from (
select distinct cast(col2 AS varchar(100)) col2 from test
) x

col2
--
abc
dbc

How will u solve the issue if the text data is bigger than 8000 ?|||Just a thought ...

Can use substring and then concatenate every 8000 characters. Will it work !|||I'm afraid not. You can not concatenate text datatype

create table test(col1 int, col2 text)
go
insert into test values(1, 'abc')
go
select col2 + col2 from test

(1 row(s) affected)

Server: Msg 403, Level 16, State 1, Line 2
Invalid operator for data type. Operator equals add, type equals text.|||Go thru this sample script,it may help u.

Use Northwind
go
select * into t from Employees
go
--inserting duplicate records----
insert into t
(
LastName,
FirstName,
Title,
TitleOfCourtesy,
BirthDate,
HireDate,
Address,
City,
Region,
PostalCode,
Country,
HomePhone,
Extension,
Photo,
Notes,
ReportsTo,
PhotoPath
)
select
LastName,
FirstName,
Title,
TitleOfCourtesy,
BirthDate,
HireDate,
Address,
City,
Region,
PostalCode,
Country,
HomePhone,
Extension,
Photo,
Notes,
ReportsTo,
PhotoPath
from t where EmployeeID in (2,1,4,5,8)

go
--------delete duplicate------
-- Use this query to findout the maximum lenth in text field.here maximum is 21000 bytes, so I used substring function up to that level(eg:substring(Photo,16001,8000)) in query.

--SELECT max(datalength(Photo)) FROM Employees
while (0=0)
begin
delete t from t t1 inner join

(
select max(E1ID) as E1ID,max(E2ID) as E2ID,sumvalue from
(
SELECT
E1.EmployeeID as E1ID
,E2.EmployeeID as E2ID, (E1.EmployeeID +E2.EmployeeID) as sumvalue
FROM
t E1 INNER JOIN t E2 ON
substring(E1.Photo, 1, 8000) = substring(E2.Photo, 1, 8000)
AND substring(E1.Photo, 8001, 8000) = substring(E2.Photo, 8001, 8000)
AND substring(E1.Photo, 16001, 8000) = substring(E2.Photo, 16001, 8000)
) as tm
group by sumvalue
having count(*)>=2
) tm1
on tm1.E1ID=t1.EmployeeID
if @.@.rowcount=0 break
end
go
--Unique records----
select * from t