I have a set with multiple vars and two ID's: the first ID identifies the account owner, and the second ID identifies the user and records. EX:
a 1
a 1
a 1
a 2
a 2
a 3
a 3
b 1
b 1
b 2
.
z 5
I wish to collect all records with all account owners (a -....-z) and their FIRST user and records. EX:
a 1
a 1
a 1
b 1
b 1
etc.
Any thoughts on an efficient piece of code to do this?
Thanks
A non-sql version:
proc sort data = have ;
by id1 id2 ;
run ;
data want ;
set have ;
by id1 id2 ;
if first.id1 ;
run ;
how about:
data have;
input id1 $ id2;
cards;
a 1
a 1
a 1
a 2
a 2
a 3
a 3
b 1
b 1
b 2
;
proc sql;
create table want as
select * from have
group by id1
having id2=min(id2);
quit;
proc print;run;
Linlin
This worked REALLY well... Thanks!
Expanding on Steve's code.
data want ;
set have ;
by id1 id2 ;
retain _id1 _id2;
if first.id1 then do;
_id1 = id1;
_id2 = id2;
end;
if _id1 = id1 and _id2 = id2;
drop _id:;
run ;
If you really want the first user and not the user with the lowest id then Linlin's SQL won't give you the correct result. Some code like below would do:
data have;
input ida $ idb $;
datalines;
b 1
b 1
b 2
a 2
a 2
a 1
a 1
a 1
a 3
a 3
;
run;
data want(drop=_:);
set have;
by ida notsorted;
retain _r_idb;
if first.ida then _r_idb=idb;
if idb=_r_idb then output;
run;
A non-sql version:
proc sort data = have ;
by id1 id2 ;
run ;
data want ;
set have ;
by id1 id2 ;
if first.id1 ;
run ;
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.