I have a data set with multiple variables and two IDs: the first ID identifies the account owner, and the second ID identifies the user and their records. Example:
a 1
a 1
a 1
a 2
a 2
a 3
a 3
b 1
b 1
b 2
.
z 5
For every account owner (a through z), I want to keep all records belonging to their FIRST user. Example:
a 1
a 1
a 1
b 1
b 1
etc.
Any thoughts on an efficient piece of code to do this?
Thanks
A non-SQL version:
proc sort data = have ;
  by id1 id2 ;
run ;
data want ;
  set have ;
  by id1 id2 ;
  if first.id1 ;  /* subsetting IF: keeps the first observation of each id1 group */
run ;
How about:
data have;
  input id1 $ id2;
  cards;
a 1
a 1
a 1
a 2
a 2
a 3
a 3
b 1
b 1
b 2
;
proc sql;
  create table want as
  select * from have
  group by id1
  having id2 = min(id2);  /* keep rows whose id2 is the lowest within each id1 */
quit;
proc print data = want;
run;
Linlin
This worked REALLY well... Thanks!
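Expanding on Steve's code so that it keeps every record for the first id2 within each id1, not just the first record:
data want ;
  set have ;
  by id1 id2 ;
  retain _id1 _id2 ;
  /* at the start of each id1 group, remember its first id2 */
  if first.id1 then do ;
    _id1 = id1 ;
    _id2 = id2 ;
  end ;
  /* keep only rows that match the remembered owner and user */
  if _id1 = id1 and _id2 = id2 ;
  drop _id: ;
run ;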
If you really want the first user as it appears in the data, and not the user with the lowest ID, then Linlin's SQL won't give you the correct result. Code like the following would do:
data have;
input ida $ idb $;
datalines;
b 1
b 1
b 2
a 2
a 2
a 1
a 1
a 1
a 3
a 3
;
run;
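data want(drop=_:);
  set have;
  by ida notsorted;  /* groups are contiguous but need not be in sort order */
  retain _r_idb;
  if first.ida then _r_idb = idb;  /* remember the first user of this owner */
  if idb = _r_idb then output;     /* keep every row for that user */
run;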
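To see the difference on the unsorted sample above, one could run Linlin's min() query against the same data. This is only a sketch: it assumes the variables are named ida and idb as in that sample, and want_min is a made-up table name used here for illustration.
proc sql;
  /* want_min is a hypothetical name; idb is character here, so min() compares it as text */
  create table want_min as
  select * from have
  group by ida
  having idb = min(idb);
quit;
proc print data = want_min;
run;
For owner a this keeps the three rows with idb = 1, whereas the data step above keeps the two rows with idb = 2, because 2 is the first user that appears for a in the data.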