Hi everyone,
I have sample data below. I need to keep all the rows that have the first date for each ID. For example, ID 2 I want to keep both rows that have '01/25/2016' and ditch the other 2. My full data have any number of rows so this needs a general solution.
Any help is appreciated. Thanks!
DATA have;
input id date mmddyy10. var1 $;
format date mmddyy10.;
CARDS;
1 02/20/2016 A
1 02/25/2016 B
2 01/25/2016 A
2 01/25/2016 B
2 01/29/2016 B
2 02/04/2016 A
3 03/10/2016 A
3 03/10/2016 A
3 03/10/2016 B
3 03/22/2016 B
;
RUN;
DATA want;
input id date mmddyy10. var1 $;
format date mmddyy10.;
CARDS;
1 02/20/2016 A
2 01/25/2016 A
2 01/25/2016 B
3 03/10/2016 A
3 03/10/2016 A
3 03/10/2016 B
RUN;
You can simplify this.
data want;
set have;
by id;
retain fdate;
if first.id then fdate=date;
if date=fdate;
drop fdate;
run;
One way:
data want; set have; by id; retain fdate; if first.id then do; fdate=date; output; end; else if date=fdate then output; drop fdate; run;
BY group creates variables first. and last. data that are numeric 1/0 which can be used to test if the current record is the first or last of a group for each variable on the BY statement. The values are accessed using First.variablename or Last.variablename, pay attention to that dot, it is critical. So when the first value of ID for a group is encountered store the value of DATE into a Retained variable, Fdate in this case, that will persist across data step boundaries.
The explicit Output writes the First record for the ID and the Output in the Else clause only writes output to the set when the value of Date matches that of the retained value.
General hint: If your problem has anything to do with "first" anything than likely BY group processing with the First. or Last. variables
You can simplify this.
data want;
set have;
by id;
retain fdate;
if first.id then fdate=date;
if date=fdate;
drop fdate;
run;
proc sql;
create table want as
select * from have
group by id
having date = min(date)
;
quit;
DATA have;
input id date mmddyy10. var1 $;
format date mmddyy10.;
CARDS;
1 02/20/2016 A
1 02/25/2016 B
2 01/25/2016 A
2 01/25/2016 B
2 01/29/2016 B
2 02/04/2016 A
3 03/10/2016 A
3 03/10/2016 A
3 03/10/2016 B
3 03/22/2016 B
;
data want;
set have;
by id date notsorted;
if first.id then n=0;
n+first.date;
if n=1;
drop n;
run;
Don’t miss the livestream kicking off May 7. It’s free. It’s easy. And it’s the best seat in the house.
Join us virtually with our complimentary SAS Innovate Digital Pass. Watch live or on-demand in multiple languages, with translations available to help you get the most out of every session.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.