Hello,
I have the following example data. I would like to write a conditional statement that encompasses the following:
If last.enrolid and med_name = N;
and lag(svcdate_new) gt svcdate_orig
then I would like to replace the lag(svcdate_new) with the svcdate_orig of the "N" medication.
The "N" medication will always be the last medication for each unique patient.
For the example below, since 7/12/14 (the lag svcdate_new) is greater than 6/19/14 svcdate_orig of the "N" medication, then I would like to replace the 7/12/14 with 6/19/14.
data have(label='Prescription Data');
/*length id $7;*/
infile datalines truncover;
input patient_id:$4. med_name:$10. svcdate_orig:MMDDYY8. svcdate_new:MMDDYY8.;
format svcdate_orig:MMDDYY8. svcdate_new:MMDDYY8.;
Datalines;
1 B 1/13/14 1/13/14
1 B 2/10/14 2/12/14
1 B 2/16/14 2/22/14
1 B 3/7/14 3/14/14
1 B 4/3/14 4/13/14
1 B 5/3/14 5/13/14
1 B 5/27/14 6/12/14
1 B 6/6/14 6/27/14
1 B 6/16/14 7/12/14
1 N 6/19/14 7/13/14
;
proc sort data = have;
by patient_id svcdate_orig;
run;
What I want the code to do (in red below):
Datalines;
1 B 1/13/14 1/13/14
1 B 2/10/14 2/12/14
1 B 2/16/14 2/22/14
1 B 3/7/14 3/14/14
1 B 4/3/14 4/13/14
1 B 5/3/14 5/13/14
1 B 5/27/14 6/12/14
1 B 6/6/14 6/27/14
1 B 6/16/14 6/19/14
1 N 6/19/14 7/13/14
Thank you for any insights you may have!
data have(label='Prescription Data');
/*length id $7;*/
infile datalines truncover;
input patient_id:$4. med_name:$10. svcdate_orig:MMDDYY8. svcdate_new:MMDDYY8.;
format svcdate_orig:MMDDYY8. svcdate_new:MMDDYY8.;
Datalines;
1 B 1/13/14 1/13/14
1 B 2/10/14 2/12/14
1 B 2/16/14 2/22/14
1 B 3/7/14 3/14/14
1 B 4/3/14 4/13/14
1 B 5/3/14 5/13/14
1 B 5/27/14 6/12/14
1 B 6/6/14 6/27/14
1 B 6/16/14 7/12/14
1 N 6/19/14 7/13/14
;
proc sort data = have;
by patient_id svcdate_orig;
run;
data temp;
set have ;
by patient_id;
if last.patient_id and med_name='N' and
lag(patient_id)=patient_id and lag(svcdate_new) gt svcdate_orig then
new=svcdate_orig;
format new mmddyy8.;
run;
data want(drop=_: new);
merge temp temp(firstobs=2 keep=patient_id new rename=(patient_id=_pid ));
if patient_id=_pid and not missing(new) then svcdate_new=new;
run;
Here is what I want the code to do (the above did not come out correctly!) - in red below
1 |
B |
1/13/14 |
1/13/14 |
1 |
B |
2/10/14 |
2/12/14 |
1 |
B |
2/16/14 |
2/22/14 |
1 |
B |
3/7/14 |
3/14/14 |
1 |
B |
4/3/14 |
4/13/14 |
1 |
B |
5/3/14 |
5/13/14 |
1 |
B |
5/27/14 |
6/12/14 |
1 |
B |
6/6/14 |
6/27/14 |
1 |
B |
6/16/14 |
6/19/14 |
1 |
N |
6/19/14 |
7/13/14 |
Hello @alaxman,
@alaxman wrote:
The "N" medication will always be the last medication for each unique patient.
If med_name="N" reliably implies last.patient_id (in a data step with by patient_id ...), you can use a look-ahead merge like this:
data want(drop=_:);
merge have have(firstobs=2 keep=patient_id med_name svcdate_orig
rename=(patient_id=_pid med_name=_mn svcdate_orig=_so));
if patient_id=_pid & _mn='N' & svcdate_new gt _so then svcdate_new=_so;
run;
Otherwise, you would need to look one observation further ahead to ensure that the med_name="N" observation is also the last of the patient:
data want(drop=_:); merge have have(firstobs=2 keep=patient_id med_name svcdate_orig rename=(patient_id=_pid med_name=_mn svcdate_orig=_so)) have(firstobs=3 keep=patient_id rename=(patient_id=_pid2)); if patient_id=_pid~=_pid2 & _mn='N' & svcdate_new gt _so then svcdate_new=_so; run;
data have(label='Prescription Data');
/*length id $7;*/
infile datalines truncover;
input patient_id:$4. med_name:$10. svcdate_orig:MMDDYY8. svcdate_new:MMDDYY8.;
format svcdate_orig:MMDDYY8. svcdate_new:MMDDYY8.;
Datalines;
1 B 1/13/14 1/13/14
1 B 2/10/14 2/12/14
1 B 2/16/14 2/22/14
1 B 3/7/14 3/14/14
1 B 4/3/14 4/13/14
1 B 5/3/14 5/13/14
1 B 5/27/14 6/12/14
1 B 6/6/14 6/27/14
1 B 6/16/14 7/12/14
1 N 6/19/14 7/13/14
;
proc sort data = have;
by patient_id svcdate_orig;
run;
data temp;
set have ;
by patient_id;
if last.patient_id and med_name='N' and
lag(patient_id)=patient_id and lag(svcdate_new) gt svcdate_orig then
new=svcdate_orig;
format new mmddyy8.;
run;
data want(drop=_: new);
merge temp temp(firstobs=2 keep=patient_id new rename=(patient_id=_pid ));
if patient_id=_pid and not missing(new) then svcdate_new=new;
run;
@Ksharp Thanks very much, your code works well! What if I wanted to add an additional condition where I also want the lag(svcdate_orig) to = 0. All other conditions are the same. Would you just add this requirement to this code or create a new data step accordingly? thanks so much!
You want this condition be 'AND' or 'OR' ?
Assuming you want 'OR' . Here is:
data temp;
set have ;
by patient_id;
if last.patient_id and med_name='N' and
lag(patient_id)=patient_id and
(lag(svcdate_new) gt svcdate_orig or lag(svcdate_orig)= 0 ) then
new=svcdate_orig;
format new mmddyy8.;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.