Good evening all!
I am looking to keep only the first observation following a date in a separate variable column. I want to keep only the observation in which the "RxStart" is the observation most immediately following the "firstDay" variable. I truly appreciate any and all help! Thank you!!
Here is an example of what I have:
ID | RxStart | RxEnd | firstDay |
1 | 18141 | 18171 | 18018 |
1 | 18079 | 18109 | 18018 |
1 | 17949 | 18064 | 18018 |
1 | 18079 | 18094 | 18018 |
1 | 18019 | 18033 | 18018 |
1 | 18141 | 18171 | 18018 |
2 | 18079 | 18109 | 17957 |
2 | 17929 | 17959 | 17957 |
2 | 17988 | 18018 | 17957 |
And here is what I would want:
ID | RxStart | RxEnd | firstDay |
1 | 18019 | 18033 | 18018 |
2 | 17988 | 18018 | 17957 |
Simple approach with sorting and a data step:
data have;
infile cards dlm='09'x;
input ID RxStart RxEnd firstDay;
cards;
1 18141 18171 18018
1 18079 18109 18018
1 17949 18064 18018
1 18079 18094 18018
1 18019 18033 18018
1 18141 18171 18018
2 18079 18109 17957
2 17929 17959 17957
2 17988 18018 17957
;
run;
proc sort data=have;
by id rxstart;
run;
data want;
set have;
by id;
retain flag;
if first.id then flag = 1;
if flag and rxstart >= firstday
then do;
output;
flag = 0;
end;
drop flag;
run;
proc print data=want noobs;
run;
Result:
Rx first ID Start RxEnd Day 1 18019 18033 18018 2 17988 18018 17957
Note how presenting example data in a data step makes it easy to recreate for testing (just copy/paste & submit).
data have;
input ID RxStart RxEnd firstDay;
cards;
1 18141 18171 18018
1 18079 18109 18018
1 17949 18064 18018
1 18079 18094 18018
1 18019 18033 18018
1 18141 18171 18018
2 18079 18109 17957
2 17929 17959 17957
2 17988 18018 17957
;
proc sql;
create table want(drop=m) as
select *, ifn(rxstart-firstday>0,rxstart-firstday,.) as m
from have
group by id
having m= min(m) ;
quit;
data want;
do until(last.id);
set have;
by id;
m=ifn(rxstart-firstday>0,rxstart-firstday,.) ;
m1=min(m,m1);
end;
do until(last.id);
set have;
by id;
m=ifn(rxstart-firstday>0,rxstart-firstday,.) ;
if m1=m then output;
end;
drop m:;
run;
Simple approach with sorting and a data step:
data have;
infile cards dlm='09'x;
input ID RxStart RxEnd firstDay;
cards;
1 18141 18171 18018
1 18079 18109 18018
1 17949 18064 18018
1 18079 18094 18018
1 18019 18033 18018
1 18141 18171 18018
2 18079 18109 17957
2 17929 17959 17957
2 17988 18018 17957
;
run;
proc sort data=have;
by id rxstart;
run;
data want;
set have;
by id;
retain flag;
if first.id then flag = 1;
if flag and rxstart >= firstday
then do;
output;
flag = 0;
end;
drop flag;
run;
proc print data=want noobs;
run;
Result:
Rx first ID Start RxEnd Day 1 18019 18033 18018 2 17988 18018 17957
Note how presenting example data in a data step makes it easy to recreate for testing (just copy/paste & submit).
Thank you so much for both solutions!! This worked like a charm. Appreciate your time and help.
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.
Find more tutorials on the SAS Users YouTube channel.