I have a set with multiple vars and two ID's: the first ID identifies the account owner, and the second ID identifies the user and records. EX:
a 1
a 1
a 1
a 2
a 2
a 3
a 3
b 1
b 1
b 2
.
z 5
I wish to collect all records with all account owners (a -....-z) and their FIRST user and records. EX:
a 1
a 1
a 1
b 1
b 1
etc.
Any thoughts on an efficient piece of code to do this?
Thanks
A non-sql version:
proc sort data = have ;
by id1 id2 ;
run ;
data want ;
set have ;
by id1 id2 ;
if first.id1 ;
run ;
how about:
data have;
input id1 $ id2;
cards;
a 1
a 1
a 1
a 2
a 2
a 3
a 3
b 1
b 1
b 2
;
proc sql;
create table want as
select * from have
group by id1
having id2=min(id2);
quit;
proc print;run;
Linlin
This worked REALLY well... Thanks!
Expanding on Steve's code.
data want ;
set have ;
by id1 id2 ;
retain _id1 _id2;
if first.id1 then do;
_id1 = id1;
_id2 = id2;
end;
if _id1 = id1 and _id2 = id2;
drop _id:;
run ;
If you really want the first user and not the user with the lowest id then Linlin's SQL won't give you the correct result. Some code like below would do:
data have;
input ida $ idb $;
datalines;
b 1
b 1
b 2
a 2
a 2
a 1
a 1
a 1
a 3
a 3
;
run;
data want(drop=_:);
set have;
by ida notsorted;
retain _r_idb;
if first.ida then _r_idb=idb;
if idb=_r_idb then output;
run;
A non-sql version:
proc sort data = have ;
by id1 id2 ;
run ;
data want ;
set have ;
by id1 id2 ;
if first.id1 ;
run ;
Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.
Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.