Hi everyone
I have tried to remove duplicates from my data set using sql distinct, max, etc but unfortunately it did not work.
Have:
COD_A | COD_B |
---|---|
11002 | 200 |
11002 | 3010 |
11002 | 15400 |
Want:
COD_A | COD_B |
---|---|
11002 | 200 |
It doesn’t matter with cod B will be chose in table want, what I really need is to update from tree observations, to just one but keeping cod_b in the result.
I have tried many SQL solutions, like group by, distinct etc...
Thanks in advance
Regards
By datastep:
data have;
attrib cod_a cod_b format=$10.;
cod_a="11002"; codb="200"; output;
cod_a="11002"; codb="3010"; output;
cod_a="11002"; codb="15400"; output;
run;
proc sort data=have out=want;
by cod_a cod_b;
run;
data want;
set want;
by cod_a;
if first.cod_a then output;
run;
Have you tried:
proc sql;
create table want as
select distinct cod_a,min(cod_b) as cod_b
from have
;
quit;
Thanks it works!!!
By datastep:
data have;
attrib cod_a cod_b format=$10.;
cod_a="11002"; codb="200"; output;
cod_a="11002"; codb="3010"; output;
cod_a="11002"; codb="15400"; output;
run;
proc sort data=have out=want;
by cod_a cod_b;
run;
data want;
set want;
by cod_a;
if first.cod_a then output;
run;
Thanks! It works perfectly!
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Check out this tutorial series to learn how to build your own steps in SAS Studio.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.