Hello,
I have the following example data. I would like to write a conditional statement that encompasses the following:
If last.enrolid and med_name = N;
and lag(svcdate_new) gt svcdate_orig
then I would like to replace the lag(svcdate_new) with the svcdate_orig of the "N" medication.
The "N" medication will always be the last medication for each unique patient.
For the example below, since 7/12/14 (the lag svcdate_new) is greater than 6/19/14 svcdate_orig of the "N" medication, then I would like to replace the 7/12/14 with 6/19/14.
data have(label='Prescription Data');
/*length id $7;*/
infile datalines truncover;
input patient_id:$4. med_name:$10. svcdate_orig:MMDDYY8. svcdate_new:MMDDYY8.;
format svcdate_orig:MMDDYY8. svcdate_new:MMDDYY8.;
Datalines;
1 B 1/13/14 1/13/14
1 B 2/10/14 2/12/14
1 B 2/16/14 2/22/14
1 B 3/7/14 3/14/14
1 B 4/3/14 4/13/14
1 B 5/3/14 5/13/14
1 B 5/27/14 6/12/14
1 B 6/6/14 6/27/14
1 B 6/16/14 7/12/14
1 N 6/19/14 7/13/14
;
proc sort data = have;
by patient_id svcdate_orig;
run;
What I want the code to do (in red below):
Datalines;
1 B 1/13/14 1/13/14
1 B 2/10/14 2/12/14
1 B 2/16/14 2/22/14
1 B 3/7/14 3/14/14
1 B 4/3/14 4/13/14
1 B 5/3/14 5/13/14
1 B 5/27/14 6/12/14
1 B 6/6/14 6/27/14
1 B 6/16/14 6/19/14
1 N 6/19/14 7/13/14
Thank you for any insights you may have!
data have(label='Prescription Data');
/*length id $7;*/
infile datalines truncover;
input patient_id:$4. med_name:$10. svcdate_orig:MMDDYY8. svcdate_new:MMDDYY8.;
format svcdate_orig:MMDDYY8. svcdate_new:MMDDYY8.;
Datalines;
1 B 1/13/14 1/13/14
1 B 2/10/14 2/12/14
1 B 2/16/14 2/22/14
1 B 3/7/14 3/14/14
1 B 4/3/14 4/13/14
1 B 5/3/14 5/13/14
1 B 5/27/14 6/12/14
1 B 6/6/14 6/27/14
1 B 6/16/14 7/12/14
1 N 6/19/14 7/13/14
;
proc sort data = have;
by patient_id svcdate_orig;
run;
data temp;
set have ;
by patient_id;
if last.patient_id and med_name='N' and
lag(patient_id)=patient_id and lag(svcdate_new) gt svcdate_orig then
new=svcdate_orig;
format new mmddyy8.;
run;
data want(drop=_: new);
merge temp temp(firstobs=2 keep=patient_id new rename=(patient_id=_pid ));
if patient_id=_pid and not missing(new) then svcdate_new=new;
run;
Here is what I want the code to do (the above did not come out correctly!) - in red below
1 |
B |
1/13/14 |
1/13/14 |
1 |
B |
2/10/14 |
2/12/14 |
1 |
B |
2/16/14 |
2/22/14 |
1 |
B |
3/7/14 |
3/14/14 |
1 |
B |
4/3/14 |
4/13/14 |
1 |
B |
5/3/14 |
5/13/14 |
1 |
B |
5/27/14 |
6/12/14 |
1 |
B |
6/6/14 |
6/27/14 |
1 |
B |
6/16/14 |
6/19/14 |
1 |
N |
6/19/14 |
7/13/14 |
Hello @alaxman,
@alaxman wrote:
The "N" medication will always be the last medication for each unique patient.
If med_name="N" reliably implies last.patient_id (in a data step with by patient_id ...), you can use a look-ahead merge like this:
data want(drop=_:);
merge have have(firstobs=2 keep=patient_id med_name svcdate_orig
rename=(patient_id=_pid med_name=_mn svcdate_orig=_so));
if patient_id=_pid & _mn='N' & svcdate_new gt _so then svcdate_new=_so;
run;
Otherwise, you would need to look one observation further ahead to ensure that the med_name="N" observation is also the last of the patient:
data want(drop=_:); merge have have(firstobs=2 keep=patient_id med_name svcdate_orig rename=(patient_id=_pid med_name=_mn svcdate_orig=_so)) have(firstobs=3 keep=patient_id rename=(patient_id=_pid2)); if patient_id=_pid~=_pid2 & _mn='N' & svcdate_new gt _so then svcdate_new=_so; run;
data have(label='Prescription Data');
/*length id $7;*/
infile datalines truncover;
input patient_id:$4. med_name:$10. svcdate_orig:MMDDYY8. svcdate_new:MMDDYY8.;
format svcdate_orig:MMDDYY8. svcdate_new:MMDDYY8.;
Datalines;
1 B 1/13/14 1/13/14
1 B 2/10/14 2/12/14
1 B 2/16/14 2/22/14
1 B 3/7/14 3/14/14
1 B 4/3/14 4/13/14
1 B 5/3/14 5/13/14
1 B 5/27/14 6/12/14
1 B 6/6/14 6/27/14
1 B 6/16/14 7/12/14
1 N 6/19/14 7/13/14
;
proc sort data = have;
by patient_id svcdate_orig;
run;
data temp;
set have ;
by patient_id;
if last.patient_id and med_name='N' and
lag(patient_id)=patient_id and lag(svcdate_new) gt svcdate_orig then
new=svcdate_orig;
format new mmddyy8.;
run;
data want(drop=_: new);
merge temp temp(firstobs=2 keep=patient_id new rename=(patient_id=_pid ));
if patient_id=_pid and not missing(new) then svcdate_new=new;
run;
@Ksharp Thanks very much, your code works well! What if I wanted to add an additional condition where I also want the lag(svcdate_orig) to = 0. All other conditions are the same. Would you just add this requirement to this code or create a new data step accordingly? thanks so much!
You want this condition be 'AND' or 'OR' ?
Assuming you want 'OR' . Here is:
data temp;
set have ;
by patient_id;
if last.patient_id and med_name='N' and
lag(patient_id)=patient_id and
(lag(svcdate_new) gt svcdate_orig or lag(svcdate_orig)= 0 ) then
new=svcdate_orig;
format new mmddyy8.;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.