Hello,
I'm trying to do some data manipulation where I ignore all other drug prescriptions once the patient has been prescribed the 'NALTREXONE HCL' drug and am stuck. This is the initial data:
Patient ID | svcdate | daysupp | prodnme |
123 | 9/29/14 | 7 | ZUBSOLV |
123 | 10/2/14 | 30 | ZUBSOLV |
123 | 10/9/14 | 30 | NALTREXONE HCL |
I do some data manipulation to begin with, where I recompute the Date based on Days of Supply. For example, in row #2, the Date when the patient should've gotten the drug refill should have been Date (from row #1) + Days of Supply (from row #1) and hence the Date in row #2 should be (9/29/14 + 7) = 10/6/14. I'm able to do this successfully and the data looks like:
Patient ID | svcdate | daysupp | prodnme |
123 | 9/29/14 | 7 | ZUBSOLV |
123 | 10/6/14 | 30 | ZUBSOLV |
123 | 10/9/14 | 30 | NALTREXONE HCL |
As the final step, I want to re-compute the Days of Supply for row #2 based on when the patient was prescribed the new drug. For example, the patient was prescribed 'NALTREXONE HCL' on 10/9/14 which means that they only took ZUBSOLV from 10/6/14 to 10/9/14. Hence, I want to recompute the Days of Supply for row #2 as Date (row #3) - Date (row #2) which is 10/9/14 - 10/6/14 = 3 and my final data to look like:
Patient ID | svcdate | daysupp | prodnme |
123 | 9/29/14 | 7 | ZUBSOLV |
123 | 10/6/14 | 3 | ZUBSOLV |
123 | 10/9/14 | 30 | NALTREXONE HCL |
This is the code I have and it is not working:
data pers_3M_12M_30DAY;
set pers_3M_12M_30DAY;
if prodnme in ("VIVITROL", "REVIA", "NALTREXONE HCL", "NALTREXONE")
and svcdate_new le lag(svcdate_new)+ lag(daysupp)
then lag(daysupp) = svcdate_new - lag(svcdate_new);
run;
Thanks!
If I'm understanding you correctly, you're wanting to update the 2nd record based on the contents of the 3rd record. As far as I'm aware, the LAG function won't let you go back and update record 2 when you've already gone ahead to record 3.
There are (at least) a couple of ways to address this:
1. You can save each record in RETAINed fields (or a Hash table) and not write that record out until after you read the following record. This is referred to as Deferred Output processing. It's a bit of a pain, but it does work.
2. You can read the next (third) record while processing the second record. This is referred to as Read Ahead processing. @LeonidBatkhan has a good post on the subject: https://blogs.sas.com/content/sgf/2017/11/01/hopping-for-the-best-calculations-across-sas-dataset-ob...
If it were me, I'd probably go with #2. Deferred Output is a little kludgey.
Why don't you give one of those a shot? If those ideas don't help, post a reply, and let's see what we can do.
Jim
Like this?
data HAVE;
input PATIENT_ID SVCDATE mmddyy8. DAYSUPP PRODNME :& $20. ;
cards;
123 09/29/14 7 ZUBSOLV
123 10/02/14 30 ZUBSOLV
123 10/09/14 30 NALTREXONE HCL
run;
data WANT;
merge HAVE
HAVE(firstobs=2 rename=(PATIENT_ID=P SVCDATE=S DAYSUPP=D PRODNME=N));
SVCDATE_NEW=max(SVCDATE, lag(SVCDATE)+ lag(DAYSUPP));
if PATIENT_ID=P & N='NALTREXONE HCL' then DAYSUPP_NEW=min(DAYSUPP, S-SVCDATE_NEW);
format SVCDATE: date9.;
run;
Obs | PATIENT_ID | SVCDATE | DAYSUPP | PRODNME | SVCDATE_NEW | DAYSUPP_NEW |
---|---|---|---|---|---|---|
1 | 123 | 29SEP2014 | 7 | ZUBSOLV | 29SEP2014 | . |
2 | 123 | 02OCT2014 | 30 | ZUBSOLV | 06OCT2014 | 3 |
3 | 123 | 09OCT2014 | 30 | NALTREXONE HCL | 01NOV2014 | . |
> this code won't work
I don't understand what this means. The code and the result are as I ran them.
If I'm understanding you correctly, you're wanting to update the 2nd record based on the contents of the 3rd record. As far as I'm aware, the LAG function won't let you go back and update record 2 when you've already gone ahead to record 3.
There are (at least) a couple of ways to address this:
1. You can save each record in RETAINed fields (or a Hash table) and not write that record out until after you read the following record. This is referred to as Deferred Output processing. It's a bit of a pain, but it does work.
2. You can read the next (third) record while processing the second record. This is referred to as Read Ahead processing. @LeonidBatkhan has a good post on the subject: https://blogs.sas.com/content/sgf/2017/11/01/hopping-for-the-best-calculations-across-sas-dataset-ob...
If it were me, I'd probably go with #2. Deferred Output is a little kludgey.
Why don't you give one of those a shot? If those ideas don't help, post a reply, and let's see what we can do.
Jim
Hey Jim,
I'm not an expert in SAS. Would you be able to provide an example?
Excellent! Good job!
The hop macro might be a bit of overkill for this application, but it's a nice piece of code to having a working example of. You never know when it might be handy to have a read ahead.
I think the issue is that @alaxman wanted record two on the original file in effect updated in place. I don't think he wanted any additional rows or columns, at least that's my understanding.
While @alaxman was sorting out the hop macro, I coded an alternative (see code below my "signature"). Basically, I figure out what the revisions should be, create a little dataset to contain said revisions, and then re-join the revisions dataset with the original, updating row 2 which gives me the following results which I think is what @alaxman was after. Note that DaySupp on record 2 has been updated.
My code is below. Note that this code is a bit on the primitive side. It doesn't for example (yet) take into account that there might be more than one Patient_ID present in the data. I'll leave the code as is in case someone might find an alternative technique interesting (and since I've already written it), but my main point is the results set, above, with the updated DaySupp on row 2.
Jim
DATA Drug_Data;
INFILE Datalines DSD DLM='09'x MISSOVER;
INPUT Patient_ID $
SvcDate : ANYDTDTE8.
DaySupp
ProdNme : $25.
;
FORMAT SvcDate mmddyys10.;
Datalines;
123 9/29/14 7 ZUBSOLV
123 10/2/14 30 ZUBSOLV
123 10/9/14 30 NALTREXONE HCL
;
RUN;
DATA Partial_Update_Drug_Data Revisions(KEEP=Patient_ID SvcDate DaySupp);
DROP _:;
SET Drug_Data;
LENGTH _Prior_ID $8;
LENGTH _Prior_Drug $25.;
RETAIN _Prior_ID ' ';
RETAIN _Prior_Date 0;
RETAIN _Prior_Supp 0;
RETAIN _Prior_Drug ' ';
LENGTH _Save_Date 8;
LENGTH _Save_Supp 8;
IF ProdNme = _Prior_Drug THEN
DO;
SvcDate = INTNX('DAYS', _Prior_Date, _Prior_Supp);
END;
ELSE
IF ProdNme IN ("VIVITROL", "REVIA", "NALTREXONE HCL", "NALTREXONE") THEN
DO;
_Save_Date = SvcDate;
_Save_Supp = DaySupp;
DaySupp = INTCK('DAYS', _Prior_Date, SvcDate);
IF DaySupp ^= _Prior_Supp THEN
DO;
SvcDate = _Prior_Date;
OUTPUT Revisions;
SvcDate = _Save_Date;
DaySupp = _Save_Supp;
END;
END;
_Prior_ID = Patient_ID;
_Prior_Date = SvcDate;
_Prior_Supp = DaySupp;
_Prior_Drug = ProdNme;
OUTPUT Partial_Update_Drug_Data;
RUN;
PROC SQL NOPRINT;
CREATE TABLE Full_Update_Drug_Data AS
SELECT Data.Patient_ID
,Data.SvcDate
,COALESCE(Rev.DaySupp, Data.DaySupp) AS DaySupp
,Data.ProdNme
FROM Partial_Update_Drug_Data Data
LEFT JOIN Revisions Rev
ON Data.Patient_ID = Rev.Patient_ID
AND Data.SvcDate = Rev.SvcDate
;
QUIT;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.