Hi,
I want to delete duplicate rows using PROC SQL (I know it is possible with PROC SORT and NODUPKEY).
For example, in the data below I want to remove duplicates on the basis of name and age.
Input -
id name age company
1 aik 26 tcs
2 aik 29 infosys
3 bik 23 wns
4 bik 23 tcs
5 cik 30 infosys
6 cik 28 wns
Output -
id name age company
1 aik 26 tcs
2 aik 29 infosys
3 bik 23 wns
5 cik 30 infosys
6 cik 28 wns
Use Select Distinct to get unique records.
proc sql;
create table want as
select distinct id, name, age, company
from have;
quit;
Hi Reeza,
Your query returns distinct rows; I am looking for distinct name and age combinations only.
Thanks
Nikunj
For duplicates, do you care which company gets attached?
You can basically group by those columns and summarize the others; a data step with first./last. processing gives you more control.
proc sql;
create table want as
select min(id) as ID, name, age, min(company) as company
from have
group by name, age
order by id, name, age;
quit;
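For reference, the data step route mentioned above might look roughly like the sketch below. It assumes the input table is called have, as in the thread; have_sorted is just a placeholder name for the sorted copy. Unlike the MIN() summary, which picks id and company independently of each other, this keeps one whole row per name/age group (the one with the lowest id), so the company always comes from the same row as the id.

```sas
/* Sort so rows with the same name and age are adjacent,
   lowest id first within each group.
   have_sorted is a placeholder name, not from the thread. */
proc sort data=have out=have_sorted;
  by name age id;
run;

/* Keep only the first row of each name/age group. */
data want;
  set have_sorted;
  by name age;
  if first.age;
run;
```

With the sample data this should keep ids 1, 2, 3, 5 and 6. To keep the last row of each group instead, swap first.age for last.age.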
data have;
input id name $ age company $;
cards;
1 aik 26 tcs
2 aik 29 infosys
3 bik 23 wns
4 bik 23 tcs
5 cik 30 infosys
6 cik 28 wns
;
run;
proc sql;
select * from have
group by name, age
having id=min(id);
quit;
Xia Keshan