Good evening all!
I am looking to keep only the first observation following a date in a separate variable column. I want to keep only the observation in which the "RxStart" is the observation most immediately following the "firstDay" variable. I truly appreciate any and all help! Thank you!!
Here is an example of what I have:
ID | RxStart | RxEnd | firstDay |
1 | 18141 | 18171 | 18018 |
1 | 18079 | 18109 | 18018 |
1 | 17949 | 18064 | 18018 |
1 | 18079 | 18094 | 18018 |
1 | 18019 | 18033 | 18018 |
1 | 18141 | 18171 | 18018 |
2 | 18079 | 18109 | 17957 |
2 | 17929 | 17959 | 17957 |
2 | 17988 | 18018 | 17957 |
And here is what I would want:
ID | RxStart | RxEnd | firstDay |
1 | 18019 | 18033 | 18018 |
2 | 17988 | 18018 | 17957 |
Simple approach with sorting and a data step:
data have;
infile cards dlm='09'x;
input ID RxStart RxEnd firstDay;
cards;
1 18141 18171 18018
1 18079 18109 18018
1 17949 18064 18018
1 18079 18094 18018
1 18019 18033 18018
1 18141 18171 18018
2 18079 18109 17957
2 17929 17959 17957
2 17988 18018 17957
;
run;
proc sort data=have;
by id rxstart;
run;
data want;
set have;
by id;
retain flag;
if first.id then flag = 1;
if flag and rxstart >= firstday
then do;
output;
flag = 0;
end;
drop flag;
run;
proc print data=want noobs;
run;
Result:
Rx first ID Start RxEnd Day 1 18019 18033 18018 2 17988 18018 17957
Note how presenting example data in a data step makes it easy to recreate for testing (just copy/paste & submit).
data have;
input ID RxStart RxEnd firstDay;
cards;
1 18141 18171 18018
1 18079 18109 18018
1 17949 18064 18018
1 18079 18094 18018
1 18019 18033 18018
1 18141 18171 18018
2 18079 18109 17957
2 17929 17959 17957
2 17988 18018 17957
;
proc sql;
create table want(drop=m) as
select *, ifn(rxstart-firstday>0,rxstart-firstday,.) as m
from have
group by id
having m= min(m) ;
quit;
data want;
do until(last.id);
set have;
by id;
m=ifn(rxstart-firstday>0,rxstart-firstday,.) ;
m1=min(m,m1);
end;
do until(last.id);
set have;
by id;
m=ifn(rxstart-firstday>0,rxstart-firstday,.) ;
if m1=m then output;
end;
drop m:;
run;
Simple approach with sorting and a data step:
data have;
infile cards dlm='09'x;
input ID RxStart RxEnd firstDay;
cards;
1 18141 18171 18018
1 18079 18109 18018
1 17949 18064 18018
1 18079 18094 18018
1 18019 18033 18018
1 18141 18171 18018
2 18079 18109 17957
2 17929 17959 17957
2 17988 18018 17957
;
run;
proc sort data=have;
by id rxstart;
run;
data want;
set have;
by id;
retain flag;
if first.id then flag = 1;
if flag and rxstart >= firstday
then do;
output;
flag = 0;
end;
drop flag;
run;
proc print data=want noobs;
run;
Result:
Rx first ID Start RxEnd Day 1 18019 18033 18018 2 17988 18018 17957
Note how presenting example data in a data step makes it easy to recreate for testing (just copy/paste & submit).
Thank you so much for both solutions!! This worked like a charm. Appreciate your time and help.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 16. Read more here about why you should contribute and what is in it for you!
ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.
Find more tutorials on the SAS Users YouTube channel.