Hi SAS Pros,
I am having this dataset. My goal is to know the dates that a patient used drug B while using A, which is the overlapping use, among all the prescriptions from 1/1/2017 to 4/30/2018.
ID | No_of_Prescription | Start_taking_Date | Day_Supply | last_taking_date | drug |
1 | 1 | 1/2/2017 | 3 | 1/4/2017 | A |
1 | 2 | 1/4/2017 | 30 | 2/2/2017 | B |
1 | 3 | 1/4/2017 | 30 | 2/2/2017 | A |
1 | 4 | 1/4/2017 | 30 | 2/2/2017 | A |
1 | 5 | 4/7/2017 | 10 | 4/16/2017 | B |
1 | 6 | 4/7/2017 | 10 | 4/16/2017 | A |
1 | 7 | 4/14/2017 | 30 | 5/13/2017 | A |
1 | 8 | 4/17/2017 | 30 | 5/16/2017 | A |
1 | 9 | 5/9/2017 | 30 | 6/7/2017 | B |
1 | 10 | 5/16/2017 | 30 | 6/14/2017 | A |
1 | 11 | 1/17/2018 | 60 | 3/17/2018 | B |
1 | 12 | 1/23/2018 | 30 | 2/21/2018 | A |
1 | 13 | 2/13/2018 | 30 | 3/14/2018 | A |
1 | 14 | 2/15/2018 | 30 | 3/16/2018 | B |
1 | 15 | 3/14/2018 | 30 | 4/12/2018 | A |
1 | 16 | 3/14/2018 | 30 | 4/12/2018 | A |
1 | 17 | 4/1/2018 | 30 | 4/30/2018 | B |
2 | 3 | 2/9/2017 | 30 | 3/10/2017 | B |
2 | 4 | 2/9/2017 | 30 | 3/10/2017 | B |
2 | 5 | 2/16/2017 | 30 | 3/17/2017 | A |
2 | 6 | 3/13/2017 | 30 | 4/11/2017 | B |
2 | 7 | 3/13/2017 | 30 | 4/11/2017 | B |
2 | 8 | 3/17/2017 | 30 | 4/15/2017 | A |
2 | 9 | 4/10/2017 | 30 | 5/9/2017 | B |
2 | 10 | 4/15/2017 | 30 | 5/14/2017 | A |
2 | 11 | 5/14/2017 | 352 | 4/30/2018 | B |
2 | 12 | 5/14/2017 | 30 | 6/12/2017 | A |
I know this project is a little complicated. I appreciate any help with it! (I have asked a similar question before, but this one is different, if you happened to see my previous post, please ignore the previous ones.)
Best regards,
C
Cynthia,
You are quite right about the ID issue. The program as it was would create data for ID's across a continuous range. If you only had ID's 1 and 13, the program would have created "dummy" data for non-existent ID's 2 through 12. Your idea of using a Hash table is exactly right -- and exactly how I solved the problem yesterday after last we spoke. See new code, below.
I have a new version of the program, and it will only process ID's that actually exist in the input data, using a hash table to control which ID's are processed. The new program is also adjusted to only consider three extra days after the Last_Taking_Date but not before the Start_Taking_Date. Please see the code below, and, below the code, the side by side Definition 1 and Definition 2 results.
I'm not sure what kind of macro variable problem you're having. Make sure you have the following in your code:
%LET Nte1 = NOTE:;
%LET Nte2 = NOTE-;
These are in my autoexec.sas and won't show up in the code I post below. If you're still having a problem after you make sure the above two %LET statements are in your code, then post the log messages. I would need to see exactly what SAS is complaining about.
The names of the drugs should not matter. The program will use whatever is in the data.
Jim
**------------------------------------------------------------------------------**;
** Debug option. Asterisk is no debugging. Blank for debugging. **;
%LET Db = ;
** Start producing results as of this month **;
%LET Start_Of_Period = '01JAN2017'd;
** Stop producing results as of this month **;
%LET End_Of_Period = '30APR2018'd;
** Extend the end of the overlap period by this number of days. **;
%LET Extra_Days = 3;
**------------------------------------------------------------------------------**;
*LIBNAME DrugDate XLSX "&Path\SampleData\test_CW_20200921_M.xlsx";
LIBNAME DrugDate XLSX "&Path\SampleData\test_CW_20200926_M.xlsx";
**------------------------------------------------------------------------------**;
DATA WORK.Rx_Dates;
SET DrugDate.'#LN00067'n;
RUN;
**------------------------------------------------------------------------------**;
DATA Rx_Period_Evaluated (KEEP=ID
Drug_Date Both)
ID_Table(KEEP=ID)
A_Dates (KEEP=ID Drug_Date Start_Taking_Date Last_Taking_Date)
B_Dates (KEEP=ID Drug_Date Start_Taking_Date Last_Taking_Date)
;
LENGTH ID 8;
LENGTH Drug_Date 8.;
FORMAT Drug_Date YYMMDDS10.;
FORMAT Save_Drug_Date YYMMDDS10.;
LENGTH End_of_Taking 8.;
FORMAT End_of_Taking MMDDYYS10.;
LENGTH Start_of_Period 8.;
FORMAT Start_of_Period MMDDYYS10.;
LENGTH End_of_Period 8.;
FORMAT End_of_Period MMDDYYS10.;
LENGTH Save_Start_Date 8.;
FORMAT Save_Start_Date MMDDYYS10.;
LENGTH Save_End_Date 8.;
FORMAT Save_End_Date MMDDYYS10.;
LENGTH Temp_Start_Date 8.;
FORMAT Temp_Start_Date MMDDYYS10.;
LENGTH Temp_End_Date 8.;
FORMAT Temp_End_Date MMDDYYS10.;
FORMAT A_Start_Date MMDDYYS10.;
FORMAT A_End_Date MMDDYYS10.;
FORMAT B_Start_Date MMDDYYS10.;
FORMAT B_End_Date MMDDYYS10.;
RETAIN Extra_Days 0;
**------------------------------**;
** Define hash tables. **;
**------------------------------**;
IF _N_ = 1 THEN
DO;
Extra_Days = &Extra_Days;
CALL MISSING (Drug_Date);
DECLARE HASH Hsh_ID_Table (ORDERED: 'A', MULTIDATA: 'N', DATASET: 'WORK.Rx_Dates');
Hsh_ID_Table.DefineKey ('ID');
Hsh_ID_Table.DefineData ('ID');
Hsh_ID_Table.DefineDone ();
DECLARE HITER Hit_ID_Table('Hsh_ID_Table');
DECLARE HASH Hsh_A_Dates (ORDERED: 'A', MULTIDATA: 'N');
Hsh_A_Dates.DefineKey ('ID', 'Drug_Date');
Hsh_A_Dates.DefineData ('ID', 'Drug_Date', 'Start_Taking_Date', 'Last_Taking_Date');
Hsh_A_Dates.DefineDone ();
DECLARE HITER Hit_A_Dates('Hsh_A_Dates');
DECLARE HASH Hsh_B_Dates (ORDERED: 'A', MULTIDATA: 'N');
Hsh_B_Dates.DefineKey ('ID', 'Drug_Date');
Hsh_B_Dates.DefineData ('ID', 'Drug_Date', 'Start_Taking_Date', 'Last_Taking_Date');
Hsh_B_Dates.DefineDone ();
DECLARE HITER Hit_B_Dates('Hsh_B_Dates');
ID_Cnt = Hsh_ID_Table.num_items;
&Db PUTLOG "&Nte1 The number of IDs loaded is " ID_Cnt=;
END;
**------------------------------**;
** Load drug A hash table. **;
**------------------------------**;
&Db PUTLOG "&Nte2 ";
&Db PUTLOG "&Nte1 Loading Drug A";
DO WHILE(NOT End_Of_A);
SET WORK.Rx_Dates
END = End_Of_A;
WHERE Drug = 'A';
Save_Start_Date = Start_Taking_Date;
Save_End_Date = Last_Taking_Date;
&Db PUTLOG "&Nte2 Start processing " ID= Save_Start_Date= Save_End_Date= _RC_A=;
Drug_Date = INTNX('MONTH', Start_Taking_Date, 0);
End_of_Taking = INTNX('MONTH', Last_Taking_Date, 1) - 1;
&Db PUTLOG "&Nte2 " ID= Drug_Date= End_of_Taking=;
DO WHILE(Drug_Date <= End_of_Taking);
&Db PUTLOG "&Nte2 " ID= "Iterate DO WHILE " Drug_Date= Start_Taking_Date= Last_Taking_Date= End_of_Taking= _RC_A=;
LINK Chk_Dates;
_RC_A = Hsh_A_Dates.ADD();
IF _RC_A > 0 THEN
DO;
&Db PUTLOG "&Nte2 " ID= " already exists " Drug_Date= Start_Taking_Date= Last_Taking_Date= End_of_Taking= _RC_A=;
Needs_Update = 0;
LINK Replace_A;
END;
ELSE
DO;
&Db PUTLOG "&Nte2 " ID= " Add Successful " Drug_Date= Start_Taking_Date= Last_Taking_Date= End_of_Taking= _RC_A=;
END;
&Db PUTLOG "&Nte2 ";
Drug_Date = INTNX('MONTH', Drug_Date, 1);
Start_Taking_Date = Save_Start_Date;
Last_Taking_Date = Save_End_Date;
END;
END;
**------------------------------**;
** Load drug B hash table. **;
**------------------------------**;
&Db PUTLOG "&Nte2 ";
&Db PUTLOG "&Nte1 Loading Drug B";
DO WHILE(NOT End_Of_B);
SET WORK.Rx_Dates
END = End_Of_B;
WHERE Drug = 'B';
Save_Start_Date = Start_Taking_Date;
Save_End_Date = Last_Taking_Date;
&Db PUTLOG "&Nte2 Start processing " ID= Save_Start_Date= Save_End_Date= _RC_B=;
Drug_Date = INTNX('MONTH', Start_Taking_Date, 0);
End_of_Taking = INTNX('MONTH', Last_Taking_Date, 1) - 1;
&Db PUTLOG "&Nte2 " ID= Drug_Date= End_of_Taking=;
DO WHILE(Drug_Date <= End_of_Taking);
&Db PUTLOG "&Nte2 " ID= "Iterate DO WHILE " Drug_Date= Start_Taking_Date= Last_Taking_Date= End_of_Taking= _RC_B=;
LINK Chk_Dates;
_RC_B = Hsh_B_Dates.ADD();
IF _RC_B > 0 THEN
DO;
&Db PUTLOG "&Nte2 " ID= " already exists " Drug_Date= Start_Taking_Date= Last_Taking_Date= End_of_Taking= _RC_B=;
Needs_Update = 0;
LINK Replace_B;
END;
ELSE
DO;
&Db PUTLOG "&Nte2 " ID= " Add Successful " Drug_Date= Start_Taking_Date= Last_Taking_Date= End_of_Taking= _RC_B=;
END;
&Db PUTLOG "&Nte2 ";
Drug_Date = INTNX('MONTH', Drug_Date, 1);
Start_Taking_Date = Save_Start_Date;
Last_Taking_Date = Save_End_Date;
END;
END;
**--------------------------------------------------------------**;
** Initialize the start and end of the period of interest. **;
**--------------------------------------------------------------**;
Start_Of_Period = INTNX('MONTH', &Start_Of_Period, 0);
End_Of_Period = INTNX('MONTH', &End_Of_Period, 1) - 1;
**--------------------------------------------------------------**;
** Create SAS data sets from the hash tables for QC purposes. **;
**--------------------------------------------------------------**;
_RC_ID = Hit_ID_Table.FIRST();
DO WHILE (_RC_ID = 0);
OUTPUT ID_Table;
_RC_ID = Hit_ID_Table.NEXT();
END;
_RC_A = Hit_A_Dates.FIRST();
DO WHILE (_RC_A = 0);
OUTPUT A_Dates;
_RC_A = Hit_A_Dates.NEXT();
END;
_RC_B = Hit_B_Dates.FIRST();
DO WHILE (_RC_B = 0);
OUTPUT B_Dates;
_RC_B = Hit_B_Dates.NEXT();
END;
**--------------------------------------------------------------**;
** Evaluate Drug A and Drug B tables month by month by ID. **;
**--------------------------------------------------------------**;
&Db PUTLOG "&Nte2 ";
&Db PUTLOG "&Nte1 Evaluating Drug A and Drug B tables";
_RC_ID = Hit_ID_Table.FIRST();
DO WHILE (_RC_ID = 0);
Drug_Date = Start_Of_Period;
DO WHILE(Drug_Date <= End_Of_Period);
Both = 0;
CALL MISSING(A_Start_Date, A_End_Date);
CALL MISSING(B_Start_Date, B_End_Date);
_RC_A = Hsh_A_Dates.FIND();
IF _RC_A = 0 THEN
DO;
A_Start_Date = Start_Taking_Date;
A_End_Date = Last_Taking_Date;
END;
ELSE
DO;
CALL MISSING(A_Start_Date, A_End_Date);
END;
_RC_B = Hsh_B_Dates.FIND();
IF _RC_B = 0 THEN
DO;
B_Start_Date = Start_Taking_Date;
B_End_Date = Last_Taking_Date;
END;
ELSE
DO;
CALL MISSING(B_Start_Date, B_End_Date);
END;
&Db PUTLOG "&Nte1 " ID= Drug_Date= End_Of_Period= _RC_A= _RC_B= A_Start_Date= A_End_Date= B_Start_Date= B_End_Date=;
LINK Chk_Date_Match;
IF Extra_Days > 0 AND
Both = 0 THEN
DO;
IF _RC_A > 0 THEN
DO;
LINK Chk_Prior_A;
LINK Chk_Date_Match;
END;
IF _RC_B > 0 AND
Both = 0 THEN
DO;
LINK Chk_Prior_B;
LINK Chk_Date_Match;
END;
IF Both = 0 THEN
DO;
LINK Chk_Prior_A;
LINK Chk_Date_Match;
END;
IF Both = 0 THEN
DO;
LINK Chk_Prior_B;
LINK Chk_Date_Match;
END;
END;
*&Db PUTLOG "&Nte2 Both months not populated " ID= Drug_Date= End_Of_Period= Both= _RC_A= _RC_B=;
OUTPUT Rx_Period_Evaluated;
Drug_Date = INTNX('MONTH', Drug_Date, 1);
END;
_RC_ID = Hit_ID_Table.NEXT();
END;
STOP;
*********;
Replace_A:
*********;
_RC_A = Hsh_A_Dates.FIND();
IF _RC_A > 0 THEN
DO;
&Db PUTLOG "&Nte2 " ID= " Error on FIND " Drug_Date= Start_Taking_Date= Last_Taking_Date= End_of_Taking= _RC_A=;
END;
LINK Updt_Dates;
IF Needs_Update THEN
DO;
_RC_A = Hsh_A_Dates.REPLACE();
IF _RC_A > 0 THEN
DO;
&Db PUTLOG "&Nte2 " ID= " Error on replace " Drug_Date= Start_Taking_Date= Last_Taking_Date= End_of_Taking= _RC_A=;
END;
ELSE
DO;
&Db PUTLOG "&Nte2 " ID= " Replace successful " Drug_Date= Start_Taking_Date= Last_Taking_Date= End_of_Taking= _RC_A=;
END;
END;
ELSE
DO;
&Db PUTLOG "&Nte2 " ID= " No update needed " Drug_Date= Start_Taking_Date= Last_Taking_Date= End_of_Taking= _RC_A=;
END;
******;
RETURN;
******;
*********;
Replace_B:
*********;
_RC_B = Hsh_B_Dates.FIND();
IF _RC_B > 0 THEN
DO;
PUTLOG "&Err1 " ID= " Error on FIND " Drug_Date= Start_Taking_Date= Last_Taking_Date= End_of_Taking= _RC_B=;
END;
LINK Updt_Dates;
IF Needs_Update THEN
DO;
_RC_B = Hsh_B_Dates.REPLACE();
IF _RC_B > 0 THEN
DO;
PUTLOG "&Err1 " ID= " Error on replace " Drug_Date= Start_Taking_Date= Last_Taking_Date= End_of_Taking= _RC_B=;
END;
ELSE
DO;
&Db PUTLOG "&Nte2 " ID= " Replace successful " Drug_Date= Start_Taking_Date= Last_Taking_Date= End_of_Taking= _RC_B=;
END;
END;
ELSE
DO;
&Db PUTLOG "&Nte2 " ID= " No update needed " Drug_Date= Start_Taking_Date= Last_Taking_Date= End_of_Taking= _RC_B=;
END;
******;
RETURN;
******;
*********;
Chk_Dates:
*********;
&Db PUTLOG "&Nte2 Chk_Dates Begin " ID= Start_Taking_Date= Last_Taking_Date= Save_Start_Date= Save_End_Date=;
IF Start_Taking_Date < Drug_Date THEN
Start_Taking_Date = Drug_Date;
IF Last_Taking_Date > INTNX('MONTH', Drug_Date, 1) - 1 THEN
Last_Taking_Date = INTNX('MONTH', Drug_Date, 1) - 1;
&Db PUTLOG "&Nte2 Chk_Dates End " ID= Start_Taking_Date= Last_Taking_Date= Save_Start_Date= Save_End_Date=;
******;
RETURN;
******;
**********;
Updt_Dates:
**********;
Temp_Start_Date = Start_Taking_Date;
Temp_End_Date = Last_Taking_Date;
IF Save_Start_Date < Start_Taking_Date THEN
DO;
&Db PUTLOG "&Nte2 Update start before " ID= Start_Taking_Date= Last_Taking_Date= Save_Start_Date= Save_End_Date=;
Start_Taking_Date = Save_Start_Date;
LINK Chk_Dates;
IF Start_Taking_Date < Temp_Start_Date THEN
Needs_Update = 1;
&Db PUTLOG "&Nte2 Update start after " ID= Start_Taking_Date= Last_Taking_Date= Save_Start_Date= Save_End_Date= Needs_Update=;
END;
IF Save_End_Date > Last_Taking_Date THEN
DO;
&Db PUTLOG "&Nte2 Update end before " ID= Start_Taking_Date= Last_Taking_Date= Save_Start_Date= Save_End_Date=;
Last_Taking_Date = Save_End_Date;
LINK Chk_Dates;
IF Last_Taking_Date > Temp_End_Date THEN
Needs_Update = 1;
&Db PUTLOG "&Nte2 Update end after " ID= Start_Taking_Date= Last_Taking_Date= Save_Start_Date= Save_End_Date= Needs_Update=;
END;
******;
RETURN;
******;
***********;
Chk_Prior_A:
***********;
** Save keys. **;
Save_Drug_Date = Drug_Date;
** Get start date from prior end date. **;
Drug_Date = INTNX('MONTH', Drug_Date, -1);
_RC_A = Hsh_A_Dates.FIND();
IF _RC_A = 0 THEN
DO;
&Db PUTLOG "&Nte2 Prior A found " ID= Drug_Date= Start_Taking_Date= Last_Taking_Date= A_Start_Date= A_End_Date= Extra_Days=;
A_Start_Date = Last_Taking_Date + Extra_Days;
END;
ELSE
DO;
&Db PUTLOG "&Nte2 Prior A not found " ID= Drug_Date= Start_Taking_Date= Last_Taking_Date= A_Start_Date= A_End_Date=;
CALL MISSING(A_Start_Date);
END;
Drug_Date = Save_Drug_Date;
******;
RETURN;
******;
***********;
Chk_Prior_B:
***********;
** Save keys. **;
Save_Drug_Date = Drug_Date;
** Get start date from prior end date. **;
Drug_Date = INTNX('MONTH', Drug_Date, -1);
_RC_B = Hsh_B_Dates.FIND();
IF _RC_B = 0 THEN
DO;
&Db PUTLOG "&Nte2 Prior B found " ID= Drug_Date= Start_Taking_Date= Last_Taking_Date= B_Start_Date= B_End_Date= Extra_Days=;
B_Start_Date = Last_Taking_Date + Extra_Days;
END;
ELSE
DO;
&Db PUTLOG "&Nte2 Prior B not found " ID= Drug_Date= Start_Taking_Date= Last_Taking_Date= B_Start_Date= B_End_Date=;
CALL MISSING(B_Start_Date);
END;
Drug_Date = Save_Drug_Date;
******;
RETURN;
******;
**************;
Chk_Date_Match:
**************;
IF NOT NMISS(B_Start_Date, A_Start_Date, B_End_Date, Extra_Days) AND
(B_Start_Date <= A_Start_Date <= B_End_Date + Extra_Days) THEN
DO;
Both = 1;
&Db PUTLOG "&Nte2 Both. A_Start between B_Start and B_Stop " ID= A_Start_Date= A_End_Date= B_Start_Date= B_End_Date= Extra_Days=;
END;
ELSE
IF NOT NMISS(A_Start_Date, B_Start_Date, A_End_Date, Extra_Days) AND
(A_Start_Date <= B_Start_Date <= A_End_Date + Extra_Days) THEN
DO;
Both = 1;
&Db PUTLOG "&Nte2 Both. B_Start between A_Start and A_End " ID= A_Start_Date= A_End_Date= B_Start_Date= B_End_Date= Extra_Days=;
END;
ELSE
IF NOT NMISS(B_Start_Date, A_End_Date, B_End_Date, Extra_Days) AND
(B_Start_Date <= A_End_Date <= B_End_Date + Extra_Days) THEN
DO;
Both = 1;
&Db PUTLOG "&Nte2 Both. A_End between B_start and B_stop " ID= A_Start_Date= A_End_Date= B_Start_Date= B_End_Date= Extra_Days=;
END;
ELSE
IF NOT NMISS(A_Start_Date, B_End_Date, A_End_Date, Extra_Days) AND
(A_Start_Date <= B_End_Date <= A_End_Date + Extra_Days) THEN
DO;
Both = 1;
&Db PUTLOG "&Nte2 Both. B_End between A_Start and A_End " ID= A_Start_Date= A_End_Date= B_Start_Date= B_End_Date= Extra_Days=;
END;
ELSE
DO;
&Db PUTLOG "&Nte2 No match on dates " ID= A_Start_Date= A_End_Date= B_Start_Date= B_End_Date=;
END;
******;
RETURN;
******;
RUN;
**------------------------------------------------------------------------------**;
Hi, @CynthiaWei,
I'm not sure if I completely understood all of your requirements, but I have written a program that you should be able to adjust to fit as your needs dictate.
The program sets the "Both" variable to "1" for any month where both drug A and drug B were taken any time during the month. If either drug A and drug B were not taken any time during the month, the "Both" variable is set to "0."
Details of the program:
The code is below, and, following the code, are the results.
Jim
%LET Start_Of_Period = '01JAN2017'd;
%LET End_Of_Period = '30APR2018'd;
DATA Rx_Dates;
INFILE Datalines4
DLM = '09'x
FIRSTOBS = 2
DSD
;
INPUT
ID $
No_of_Prescription $
Start_Taking_Date : ANYDTDTE10.
Day_Supply $
Last_Taking_Date : ANYDTDTE10.
Drug $
;
FORMAT Start_Taking_Date MMDDYYS10.;
FORMAT Last_Taking_Date MMDDYYS10.;
DATALINES4;
ID No_of_Prescription Start_Taking_Date Day_Supply Last_Taking_Date Drug
1 1 1/2/2017 3 1/4/2017 A
1 2 1/4/2017 30 2/2/2017 B
1 3 1/4/2017 30 2/2/2017 A
1 4 1/4/2017 30 2/2/2017 A
1 5 4/7/2017 10 4/16/2017 B
1 6 4/7/2017 10 4/16/2017 A
1 7 4/14/2017 30 5/13/2017 A
1 8 4/17/2017 30 5/16/2017 A
1 9 5/9/2017 30 6/7/2017 B
1 10 5/16/2017 30 6/14/2017 A
1 11 1/17/2018 60 3/17/2018 B
1 12 1/23/2018 30 2/21/2018 A
1 13 2/13/2018 30 3/14/2018 A
1 14 2/15/2018 30 3/16/2018 B
1 15 3/14/2018 30 4/12/2018 A
1 16 3/14/2018 30 4/12/2018 A
1 17 4/1/2018 30 4/30/2018 B
2 3 2/9/2017 30 3/10/2017 B
2 4 2/9/2017 30 3/10/2017 B
2 5 2/16/2017 30 3/17/2017 A
2 6 3/13/2017 30 4/11/2017 B
2 7 3/13/2017 30 4/11/2017 B
2 8 3/17/2017 30 4/15/2017 A
2 9 4/10/2017 30 5/9/2017 B
2 10 4/15/2017 30 5/14/2017 A
2 11 5/14/2017 352 4/30/2018 B
2 12 5/14/2017 30 6/12/2017 A
;;;;
RUN;
**------------------------------------------------------------------------------**;
DATA Rx_Period_Evaluated;
KEEP ID Drug_Date Both;
LENGTH ID $8;
LENGTH Drug_Date 8.;
FORMAT Drug_Date MMDDYYS10.;
LENGTH End_of_Taking 8.;
FORMAT End_of_Taking MMDDYYS10.;
LENGTH Start_of_Period 8.;
FORMAT Start_of_Period MMDDYYS10.;
LENGTH End_of_Period 8.;
FORMAT End_of_Period MMDDYYS10.;
LENGTH Work_ID 8.;
RETAIN Max_ID 0;
IF _N_ = 1 THEN
DO;
CALL MISSING(Drug_Date);
DECLARE HASH Hsh_A_Dates();
Hsh_A_Dates.DefineKey('ID', 'Drug_Date');
Hsh_A_Dates.DefineDone();
CALL MISSING(Drug_Date);
DECLARE HASH Hsh_B_Dates();
Hsh_B_Dates.DefineKey('ID', 'Drug_Date');
Hsh_B_Dates.DefineDone();
END;
** Load drug A hash table. **;
DO WHILE(NOT End_Of_A);
SET Rx_Dates
END = End_Of_A;
WHERE Drug = 'A';
IF INPUT(ID, 3.) > Max_ID THEN
Max_ID = INPUT(ID, 3.);
Drug_Date = INTNX('MONTH', Start_Taking_Date, 0);
End_of_Taking = INTNX('MONTH', Last_Taking_Date, 1) - 1;
DO WHILE(Drug_Date <= End_of_Taking);
_RC_A = Hsh_A_Dates.ADD();
Drug_Date = INTNX('MONTH', Drug_Date, 1);
END;
END;
** Load drug B hash table. **;
DO WHILE(NOT End_Of_B);
SET Rx_Dates
END = End_Of_B;
WHERE Drug = 'B';
IF INPUT(ID, 3.) > Max_ID THEN
Max_ID = INPUT(ID, 3.);
Drug_Date = INTNX('MONTH', Start_Taking_Date, 0);
End_of_Taking = INTNX('MONTH', Last_Taking_Date, 1) - 1;
DO WHILE(Drug_Date <= End_of_Taking);
_RC_B = Hsh_B_Dates.ADD();
Drug_Date = INTNX('MONTH', Drug_Date, 1);
END;
END;
Start_Of_Period = INTNX('MONTH', &Start_Of_Period, 0);
End_Of_Period = INTNX('MONTH', &End_Of_Period, 1) - 1;
DO Work_ID = 1 TO Max_ID;
ID = STRIP(PUT(Work_ID, 3.));
Drug_Date = Start_Of_Period;
Iteration = 0;
DO WHILE(Drug_Date <= End_Of_Period);
Iteration + 1;
IF Hsh_A_Dates.FIND() = 0 AND
Hsh_B_Dates.FIND() = 0 THEN
DO;
Both = 1;
END;
ELSE
DO;
Both = 0;
END;
OUTPUT;
Drug_Date = INTNX('MONTH', Drug_Date, 1);
END;
END;
STOP;
RUN;
**------------------------------------------------------------------------------**;
Results:
Hi Jim,
Thank you so much for your code. The result is exactly what I want to create.
I have tried your code and SAS reads the correct number of observation where drug='A' and drug='B', but has 0 observation for the final dataset. Following is what log provided, just so you know that I change dataset to a simpler one to have a easy way to find whether the code is working. I am attaching this dataset here below. Is it because I changed character/numeric variables?
SAS Log:
NOTE: Numeric values have been converted to character values at the places given by: (Line):(Column).
631:19 632:47 646:19 647:47
NOTE: Character values have been converted to numeric values at the places given by: (Line):(Column).
660:41
NOTE: There were 6 observations read from the data set WORK.TEST.
WHERE Drug='A';
NOTE: There were 7 observations read from the data set WORK.TEST.
WHERE Drug='B';
NOTE: The data set WORK.RX_PERIOD_EVALUATED has 0 observations and 3 variables.
NOTE: DATA statement used (Total process time):
real time 0.04 seconds
cpu time 0.03 seconds
Thank you so much!
Best regards,
C
Hi, @CynthiaWei,
Yes, you are correct. A few adjustments were required to accommodate a numeric ID. A numeric ID is actually easier to work with in this case and made the program a little simpler.
Here is the revised code:
%LET Start_Of_Period = '01JAN2017'd;
%LET End_Of_Period = '30APR2018'd;
LIBNAME DrugDate XLSX '.\SampleData\test_CW_20200921_M.xlsx';
DATA Rx_Dates;
SET DrugDate.'#LN00067'n;
RUN;
**------------------------------------------------------------------------------**;
DATA Rx_Period_Evaluated;
KEEP ID Drug_Date Both;
LENGTH ID 8;
LENGTH Drug_Date 8.;
FORMAT Drug_Date MMDDYYS10.;
LENGTH End_of_Taking 8.;
FORMAT End_of_Taking MMDDYYS10.;
LENGTH Start_of_Period 8.;
FORMAT Start_of_Period MMDDYYS10.;
LENGTH End_of_Period 8.;
FORMAT End_of_Period MMDDYYS10.;
RETAIN Max_ID 0;
IF _N_ = 1 THEN
DO;
CALL MISSING(Drug_Date);
DECLARE HASH Hsh_A_Dates();
Hsh_A_Dates.DefineKey('ID', 'Drug_Date');
Hsh_A_Dates.DefineDone();
CALL MISSING(Drug_Date);
DECLARE HASH Hsh_B_Dates();
Hsh_B_Dates.DefineKey('ID', 'Drug_Date');
Hsh_B_Dates.DefineDone();
END;
** Load drug A hash table. **;
DO WHILE(NOT End_Of_A);
SET Rx_Dates
END = End_Of_A;
WHERE Drug = 'A';
IF ID > Max_ID THEN
DO;
Max_ID = ID;
END;
Drug_Date = INTNX('MONTH', Start_Taking_Date, 0);
End_of_Taking = INTNX('MONTH', Last_Taking_Date, 1) - 1;
DO WHILE(Drug_Date <= End_of_Taking);
_RC_A = Hsh_A_Dates.ADD();
Drug_Date = INTNX('MONTH', Drug_Date, 1);
END;
END;
** Load drug B hash table. **;
DO WHILE(NOT End_Of_B);
SET Rx_Dates
END = End_Of_B;
WHERE Drug = 'B';
IF ID > Max_ID THEN
DO;
Max_ID = ID;
END;
Drug_Date = INTNX('MONTH', Start_Taking_Date, 0);
End_of_Taking = INTNX('MONTH', Last_Taking_Date, 1) - 1;
DO WHILE(Drug_Date <= End_of_Taking);
_RC_B = Hsh_B_Dates.ADD();
Drug_Date = INTNX('MONTH', Drug_Date, 1);
END;
END;
Start_Of_Period = INTNX('MONTH', &Start_Of_Period, 0);
End_Of_Period = INTNX('MONTH', &End_Of_Period, 1) - 1;
** Evaluate Drug A and Drug B tables month by month by ID. **;
DO ID = 1 TO Max_ID;
Drug_Date = Start_Of_Period;
DO WHILE(Drug_Date <= End_Of_Period);
_RC_A = Hsh_A_Dates.FIND();
_RC_B = Hsh_B_Dates.FIND();
IF Hsh_A_Dates.FIND() = 0 AND
Hsh_B_Dates.FIND() = 0 THEN
DO;
Both = 1;
END;
ELSE
DO;
Both = 0;
END;
OUTPUT;
Drug_Date = INTNX('MONTH', Drug_Date, 1);
END;
END;
STOP;
RUN;
The results based on the new data are as follows:
I have manually checked the results, and they appear to be correct. If there are any apparent discrepancies, let's go over them.
Jim
Hi Jim,
I really appreciate your prompt reply a lot!
Your code is working this time!!! I just found a typo that the first prescription for ID=3 should be 8/3/2017 to 9/1/2017 other than 8/3/2018 to 9/1/2018. So sorry for this mistake. Did you use my second definition to calculate the variable Both, that is if drug A and drug B are taken even in 3 days interval (including the scenario when they are actually taken on the same day), it is still considered as an overlapping use of both A and B? If not, that is you are using the first definition, shouldn't it be all 0 for the 16 months period for ID=3?
Best regards,
C
I am using the first definition.
Essentially, I am going month by month. If drug A and drug B were both taken any time in a given month, then I am marking them as a "1".
It sounds like I need to go down to the day level and not just the month, yes?
And do you need definition 1 or definition 2? Or do you need a program that can switch back and forth?
Jim
Hi Jim,
I am sincerely grateful for your continuous advice and help on this project! I appreciate it a lot!
To get back to the question. Yes, we need to go down to the day level. For ID=3, the 1st prescription (drug B) from 8/3/2017 to 9/1/2017 (please update the typo, it should be 2017 but 2018) has No overlapping use with the 2nd prescription (drug A) from 9/3/2017 to 10/2/2017 according to the definition 1. Having both drug A and drug B taking any time in a given month doesn't necessarily make any overlapping use happened. Having both of them in any given day does.
However, it was an overlapping use based on the definition 2 since ID=3 started taking drug A (9/3/2017) within 3 days after the last day of taking drug B (9/1/2017). If the 2nd prescription (drug A) started on 9/5/2017 (not as it is right now on 9/3/2017), which is the 4th day after the last day of taking drug B, then it would NOT be considered as an overlapping use based on the definition 2.
Definition 1:
If interval<=0 day then overlapping use=Yes; if interval>=1 day then overlapping use=0. (Interval is the day difference in start_taking_date and last_taking_date between two different drugs)
Definition 2:
If interval<=3 day then overlapping use=Yes; if interval>=4 day then overlapping use=0
I need two programs, one program for the definition 1 and a separate program for the definition 2.
Best regards,
C
Hi, @CynthiaWei,
OK, good. I think I understand now.
So here are my results, in two sets. On the left is definition one, with no extra days. On the right, is definition two, with three extra days. Please see what you think.
By the way, these results are both from the same program. I just created a macro variable at the beginning of the program called "extra days." If you set extra days to 0, then you are using definition one. If you set extra days to 3, then you are using definition two. The program is a bit long, so I will post it separately, but if these results look good, you should create additional test data and try running the program.
There are some additional macro variables:
Db - Debug option. Use an '*' (asterisk) for normal runs or blank to include debugging statements in the SAS log.
Start_Of_Period - Start producing results as the month of this date.
End_Of_Period - Stop producing results as the month of this date.
Jim
And here is the SAS code. Please ask questions as needed.
Note: In my autoexec.sas, I have some definitions:
%LET Err1 = ERROR:;
%LET Nte1 = NOTE:;
%LET Nte2 = NOTE-;
Jim
** Debug option. Asterisk is no debugging. Blank for debugging. **;
%LET Db = ;
** Start producing results as of this month **;
%LET Start_Of_Period = '01JAN2017'd;
** Stop producing results as of this month **;
%LET End_Of_Period = '30APR2018'd;
** Extend the overlap period by +/- this number of days. **;
%LET Extra_Days = 3;
LIBNAME DrugDate XLSX "&Path\SampleData\test_CW_20200921_M.xlsx";
DATA Rx_Dates;
SET DrugDate.'#LN00067'n;
RUN;
**------------------------------------------------------------------------------**;
DATA Rx_Period_Evaluated;
KEEP ID Drug_Date Both;
LENGTH ID 8;
LENGTH Drug_Date 8.;
FORMAT Drug_Date YYMMD7.;
LENGTH End_of_Taking 8.;
FORMAT End_of_Taking MMDDYYS10.;
LENGTH Start_of_Period 8.;
FORMAT Start_of_Period MMDDYYS10.;
LENGTH End_of_Period 8.;
FORMAT End_of_Period MMDDYYS10.;
LENGTH Save_Start_Date 8.;
FORMAT Save_Start_Date MMDDYYS10.;
LENGTH Save_End_Date 8.;
FORMAT Save_End_Date MMDDYYS10.;
LENGTH Temp_Start_Date 8.;
FORMAT Temp_Start_Date MMDDYYS10.;
LENGTH Temp_End_Date 8.;
FORMAT Temp_End_Date MMDDYYS10.;
RETAIN Max_ID 0;
IF _N_ = 1 THEN
DO;
CALL MISSING (Drug_Date);
DECLARE HASH Hsh_A_Dates ();
Hsh_A_Dates.DefineKey ('ID', 'Drug_Date');
Hsh_A_Dates.DefineData ('Start_Taking_Date', 'Last_Taking_Date');
Hsh_A_Dates.DefineDone ();
DECLARE HASH Hsh_B_Dates ();
Hsh_B_Dates.DefineKey ('ID', 'Drug_Date');
Hsh_B_Dates.DefineData ('Start_Taking_Date', 'Last_Taking_Date');
Hsh_B_Dates.DefineDone ();
END;
** Load drug A hash table. **;
&Db PUTLOG "&Nte2 ";
&Db PUTLOG "&Nte1 Loading Drug A";
DO WHILE(NOT End_Of_A);
SET Rx_Dates
END = End_Of_A;
WHERE Drug = 'A';
Save_Start_Date = Start_Taking_Date;
Save_End_Date = Last_Taking_Date;
&Db PUTLOG "&Nte2 Start processing " ID= Save_Start_Date= Save_End_Date= _RC_A=;
IF ID > Max_ID THEN
DO;
&Db PUTLOG "&Nte2 Changing Max_ID from to " Max_ID= ID=;
Max_ID = ID;
END;
Drug_Date = INTNX('MONTH', Start_Taking_Date, 0);
End_of_Taking = INTNX('MONTH', Last_Taking_Date, 1) - 1;
&Db PUTLOG "&Nte2 " ID= Drug_Date= End_of_Taking=;
DO WHILE(Drug_Date <= End_of_Taking);
&Db PUTLOG "&Nte2 " ID= "Iterate DO WHILE " Drug_Date= Start_Taking_Date= Last_Taking_Date= End_of_Taking= _RC_A=;
LINK Chk_Dates;
_RC_A = Hsh_A_Dates.ADD();
IF _RC_A > 0 THEN
DO;
&Db PUTLOG "&Nte2 " ID= " already exists " Drug_Date= Start_Taking_Date= Last_Taking_Date= End_of_Taking= _RC_A=;
Needs_Update = 0;
LINK Replace_A;
END;
ELSE
DO;
&Db PUTLOG "&Nte2 " ID= " Add Successful " Drug_Date= Start_Taking_Date= Last_Taking_Date= End_of_Taking= _RC_A=;
END;
&Db PUTLOG "&Nte2 ";
Drug_Date = INTNX('MONTH', Drug_Date, 1);
Start_Taking_Date = Save_Start_Date;
Last_Taking_Date = Save_End_Date;
END;
END;
** Load drug B hash table. **;
&Db PUTLOG "&Nte2 ";
&Db PUTLOG "&Nte1 Loading Drug B";
DO WHILE(NOT End_Of_B);
SET Rx_Dates
END = End_Of_B;
WHERE Drug = 'B';
Save_Start_Date = Start_Taking_Date;
Save_End_Date = Last_Taking_Date;
&Db PUTLOG "&Nte2 Start processing " ID= Save_Start_Date= Save_End_Date= _RC_B=;
IF ID > Max_ID THEN
DO;
&Db PUTLOG "&Nte2 Changing Max_ID from to " Max_ID= ID=;
Max_ID = ID;
END;
Drug_Date = INTNX('MONTH', Start_Taking_Date, 0);
End_of_Taking = INTNX('MONTH', Last_Taking_Date, 1) - 1;
&Db PUTLOG "&Nte2 " ID= Drug_Date= End_of_Taking=;
DO WHILE(Drug_Date <= End_of_Taking);
&Db PUTLOG "&Nte2 " ID= "Iterate DO WHILE " Drug_Date= Start_Taking_Date= Last_Taking_Date= End_of_Taking= _RC_B=;
LINK Chk_Dates;
_RC_B = Hsh_B_Dates.ADD();
IF _RC_B > 0 THEN
DO;
&Db PUTLOG "&Nte2 " ID= " already exists " Drug_Date= Start_Taking_Date= Last_Taking_Date= End_of_Taking= _RC_B=;
Needs_Update = 0;
LINK Replace_B;
END;
ELSE
DO;
&Db PUTLOG "&Nte2 " ID= " Add Successful " Drug_Date= Start_Taking_Date= Last_Taking_Date= End_of_Taking= _RC_B=;
END;
&Db PUTLOG "&Nte2 ";
Drug_Date = INTNX('MONTH', Drug_Date, 1);
Start_Taking_Date = Save_Start_Date;
Last_Taking_Date = Save_End_Date;
END;
END;
Start_Of_Period = INTNX('MONTH', &Start_Of_Period, 0);
End_Of_Period = INTNX('MONTH', &End_Of_Period, 1) - 1;
** Evaluate Drug A and Drug B tables month by month by ID. **;
&Db PUTLOG "&Nte2 ";
&Db PUTLOG "&Nte1 Evaluating Drug A and Drug B tables";
DO ID = 1 TO Max_ID;
Drug_Date = Start_Of_Period;
DO WHILE(Drug_Date <= End_Of_Period);
_RC_A = Hsh_A_Dates.FIND();
Temp_Start_Date = Start_Taking_Date;
Temp_End_Date = Last_Taking_Date;
_RC_B = Hsh_B_Dates.FIND();
&Db PUTLOG "&Nte1 " ID= Drug_Date= End_Of_Period= _RC_A= _RC_B=;
IF _RC_A = 0 AND
_RC_B = 0 THEN
DO;
IF (Start_Taking_Date - &Extra_Days <= Temp_Start_Date <= Last_Taking_Date + &Extra_Days) OR
(Temp_Start_Date - &Extra_Days <= Start_Taking_Date <= Temp_End_Date + &Extra_Days) OR
(Start_Taking_Date - &Extra_Days <= Temp_End_Date <= Last_Taking_Date + &Extra_Days) OR
(Temp_Start_Date - &Extra_Days <= Last_Taking_Date <= Temp_End_Date + &Extra_Days) THEN
DO;
Both = 1;
&Db PUTLOG "&Nte2 Dates within range " ID= Temp_Start_Date= Temp_End_Date= Start_Taking_Date= Last_Taking_Date=;
END;
ELSE
DO;
Both = 0;
&Db PUTLOG "&Nte2 No match on dates " ID= Temp_Start_Date= Temp_End_Date= Start_Taking_Date= Last_Taking_Date=;
END;
END;
ELSE
DO;
Both = 0;
&Db PUTLOG "&Nte2 Both months not populated " ID= Drug_Date= End_Of_Period= Both= _RC_A= _RC_B=;
END;
OUTPUT;
Drug_Date = INTNX('MONTH', Drug_Date, 1);
END;
END;
STOP;
*********;
Replace_A:
*********;
_RC_A = Hsh_A_Dates.FIND();
IF _RC_A > 0 THEN
DO;
&Db PUTLOG "&Nte2 " ID= " Error on FIND " Drug_Date= Start_Taking_Date= Last_Taking_Date= End_of_Taking= _RC_A=;
END;
LINK Updt_Dates;
IF Needs_Update THEN
DO;
_RC_A = Hsh_A_Dates.REPLACE();
IF _RC_A > 0 THEN
DO;
&Db PUTLOG "&Nte2 " ID= " Error on replace " Drug_Date= Start_Taking_Date= Last_Taking_Date= End_of_Taking= _RC_A=;
END;
ELSE
DO;
&Db PUTLOG "&Nte2 " ID= " Replace successful " Drug_Date= Start_Taking_Date= Last_Taking_Date= End_of_Taking= _RC_A=;
END;
END;
ELSE
DO;
&Db PUTLOG "&Nte2 " ID= " No update needed " Drug_Date= Start_Taking_Date= Last_Taking_Date= End_of_Taking= _RC_A=;
END;
******;
RETURN;
******;
*********;
Replace_B:
*********;
_RC_B = Hsh_B_Dates.FIND();
IF _RC_B > 0 THEN
DO;
PUTLOG "&Err1 " ID= " Error on FIND " Drug_Date= Start_Taking_Date= Last_Taking_Date= End_of_Taking= _RC_B=;
END;
LINK Updt_Dates;
IF Needs_Update THEN
DO;
_RC_B = Hsh_B_Dates.REPLACE();
IF _RC_B > 0 THEN
DO;
PUTLOG "&Err1 " ID= " Error on replace " Drug_Date= Start_Taking_Date= Last_Taking_Date= End_of_Taking= _RC_B=;
END;
ELSE
DO;
&Db PUTLOG "&Nte2 " ID= " Replace successful " Drug_Date= Start_Taking_Date= Last_Taking_Date= End_of_Taking= _RC_B=;
END;
END;
ELSE
DO;
&Db PUTLOG "&Nte2 " ID= " No update needed " Drug_Date= Start_Taking_Date= Last_Taking_Date= End_of_Taking= _RC_B=;
END;
******;
RETURN;
******;
*********;
Chk_Dates:
*********;
&Db PUTLOG "&Nte2 Chk_Dates Begin " ID= Start_Taking_Date= Last_Taking_Date= Save_Start_Date= Save_End_Date=;
IF Start_Taking_Date < Drug_Date THEN
Start_Taking_Date = Drug_Date;
IF Last_Taking_Date > INTNX('MONTH', Drug_Date, 1) - 1 THEN
Last_Taking_Date = INTNX('MONTH', Drug_Date, 1) - 1;
&Db PUTLOG "&Nte2 Chk_Dates End " ID= Start_Taking_Date= Last_Taking_Date= Save_Start_Date= Save_End_Date=;
******;
RETURN;
******;
**********;
Updt_Dates:
**********;
Temp_Start_Date = Start_Taking_Date;
Temp_End_Date = Last_Taking_Date;
IF Save_Start_Date < Start_Taking_Date THEN
DO;
&Db PUTLOG "&Nte2 Update start before " ID= Start_Taking_Date= Last_Taking_Date= Save_Start_Date= Save_End_Date=;
Start_Taking_Date = Save_Start_Date;
LINK Chk_Dates;
IF Start_Taking_Date < Temp_Start_Date THEN
Needs_Update = 1;
&Db PUTLOG "&Nte2 Update start after " ID= Start_Taking_Date= Last_Taking_Date= Save_Start_Date= Save_End_Date= Needs_Update=;
END;
IF Save_End_Date > Last_Taking_Date THEN
DO;
&Db PUTLOG "&Nte2 Update end before " ID= Start_Taking_Date= Last_Taking_Date= Save_Start_Date= Save_End_Date=;
Last_Taking_Date = Save_End_Date;
LINK Chk_Dates;
IF Last_Taking_Date > Temp_End_Date THEN
Needs_Update = 1;
&Db PUTLOG "&Nte2 Update end after " ID= Start_Taking_Date= Last_Taking_Date= Save_Start_Date= Save_End_Date= Needs_Update=;
END;
******;
RETURN;
******;
RUN;
Hi Jim,
You don't know how much I appreciate your help with my project!!! It really makes my work moving forward!!! I tried your code for both definition 1 and definition 2 based on my dataset, and they all worked very well!
I took your advice to test the code with a new dataset. I am attaching the new dataset here. I found the code for definition 1 worked for the new data very correctly, however, it cannot provide the precise result when applying to definition 2 for ID=4 (for ID=1-3, it worked very well). For example, the overlapping uses for ID=4 should be 2/4/2017 to 3/3/2017 and 5/3/2017, that is the variable Both=1 for 2017-02, 03, and 05. But SAS only returned as 2017-02. I changed the %LET Extra_days=3 when testing the code for definition 2. I resolved this issue by creating a new variable called last_taking_date_Ext=last_taking_date + 3, that is although the actual last taking date is on 2/28/2017 or 4/30/2017, the effect of the drug lasts until 3/3/2017 and 5/3/2017. Is there any way to fix the code to let it more appropriate for my new dataset?
I know you have generously spent your time on my project for very long time, I sincerely appreciate your great advice and help!!!
Best regards,
C
I've got my program mostly working, but one month for ID 4 disagrees with what I think it should be.
When you have a moment, would you look over the following table that I put together by hand. This is what I think my program should output (once I get it debugged). Would you verify that the Definition 1 and Definition 2 values (0 or 1) are correct? If they're not correct, would you please maybe make a note of why a particular value is incorrect? That would be super helpful.
Thank you,
Jim
Hi, @CynthiaWei,
I think I've got everything working now. Sorry this took a while. Busy weekend and busy at work right now. I have to fit SAS community coding in where I can.
Here are my results which I believe match the requirements. Definition 1 is on the left. Definition 2 is on the right. I'll post the code below that. Please let me know if this works for you.
Jim
** Debug option. Asterisk is no debugging. Blank for debugging. **;
%LET Db = ;
** Start producing results as of this month **;
%LET Start_Of_Period = '01JAN2017'd;
** Stop producing results as of this month **;
%LET End_Of_Period = '30APR2018'd;
** Extend the overlap period by +/- this number of days. **;
%LET Extra_Days = 3;
%LET Extended_Data = NO;
*%LET Extended_Data = YES;
**------------------------------------------------------------------------------**;
*LIBNAME DrugDate XLSX "&Path\SampleData\test_CW_20200921_M.xlsx";
LIBNAME DrugDate XLSX "&Path\SampleData\test_CW_20200926_M.xlsx";
**------------------------------------------------------------------------------**;
DATA Rx_Dates;
SET DrugDate.'#LN00067'n;
RUN;
**------------------------------------------------------------------------------**;
DATA Rx_Period_Evaluated (KEEP=ID
%IF %UPCASE(&Extended_Data) = YES %THEN
%DO;
A_Start_Date A_End_Date B_Start_Date B_End_Date
%END;
Drug_Date Both)
A_Dates (KEEP=ID Drug_Date Start_Taking_Date Last_Taking_Date)
B_Dates (KEEP=ID Drug_Date Start_Taking_Date Last_Taking_Date)
;
LENGTH ID 8;
LENGTH Drug_Date 8.;
FORMAT Drug_Date YYMMDDS10.;
FORMAT Save_Drug_Date YYMMDDS10.;
LENGTH End_of_Taking 8.;
FORMAT End_of_Taking MMDDYYS10.;
LENGTH Start_of_Period 8.;
FORMAT Start_of_Period MMDDYYS10.;
LENGTH End_of_Period 8.;
FORMAT End_of_Period MMDDYYS10.;
LENGTH Save_Start_Date 8.;
FORMAT Save_Start_Date MMDDYYS10.;
LENGTH Save_End_Date 8.;
FORMAT Save_End_Date MMDDYYS10.;
LENGTH Temp_Start_Date 8.;
FORMAT Temp_Start_Date MMDDYYS10.;
LENGTH Temp_End_Date 8.;
FORMAT Temp_End_Date MMDDYYS10.;
FORMAT A_Start_Date MMDDYYS10.;
FORMAT A_End_Date MMDDYYS10.;
FORMAT B_Start_Date MMDDYYS10.;
FORMAT B_End_Date MMDDYYS10.;
RETAIN Max_ID 0;
RETAIN Extra_Days 0;
**------------------------------**;
** Define hash tables. **;
**------------------------------**;
IF _N_ = 1 THEN
DO;
Extra_Days = &Extra_Days;
CALL MISSING (Drug_Date);
DECLARE HASH Hsh_A_Dates (ORDERED: 'A', MULTIDATA: 'N');
Hsh_A_Dates.DefineKey ('ID', 'Drug_Date');
Hsh_A_Dates.DefineData ('ID', 'Drug_Date', 'Start_Taking_Date', 'Last_Taking_Date');
Hsh_A_Dates.DefineDone ();
DECLARE HITER Hit_A_Dates('Hsh_A_Dates');
DECLARE HASH Hsh_B_Dates (ORDERED: 'A', MULTIDATA: 'N');
Hsh_B_Dates.DefineKey ('ID', 'Drug_Date');
Hsh_B_Dates.DefineData ('ID', 'Drug_Date', 'Start_Taking_Date', 'Last_Taking_Date');
Hsh_B_Dates.DefineDone ();
DECLARE HITER Hit_B_Dates('Hsh_B_Dates');
END;
**------------------------------**;
** Load drug A hash table. **;
**------------------------------**;
&Db PUTLOG "&Nte2 ";
&Db PUTLOG "&Nte1 Loading Drug A";
DO WHILE(NOT End_Of_A);
SET Rx_Dates
END = End_Of_A;
WHERE Drug = 'A';
Save_Start_Date = Start_Taking_Date;
Save_End_Date = Last_Taking_Date;
&Db PUTLOG "&Nte2 Start processing " ID= Save_Start_Date= Save_End_Date= _RC_A=;
IF ID > Max_ID THEN
DO;
&Db PUTLOG "&Nte2 Changing Max_ID from to " Max_ID= ID=;
Max_ID = ID;
END;
Drug_Date = INTNX('MONTH', Start_Taking_Date, 0);
End_of_Taking = INTNX('MONTH', Last_Taking_Date, 1) - 1;
&Db PUTLOG "&Nte2 " ID= Drug_Date= End_of_Taking=;
DO WHILE(Drug_Date <= End_of_Taking);
&Db PUTLOG "&Nte2 " ID= "Iterate DO WHILE " Drug_Date= Start_Taking_Date= Last_Taking_Date= End_of_Taking= _RC_A=;
LINK Chk_Dates;
_RC_A = Hsh_A_Dates.ADD();
IF _RC_A > 0 THEN
DO;
&Db PUTLOG "&Nte2 " ID= " already exists " Drug_Date= Start_Taking_Date= Last_Taking_Date= End_of_Taking= _RC_A=;
Needs_Update = 0;
LINK Replace_A;
END;
ELSE
DO;
&Db PUTLOG "&Nte2 " ID= " Add Successful " Drug_Date= Start_Taking_Date= Last_Taking_Date= End_of_Taking= _RC_A=;
END;
&Db PUTLOG "&Nte2 ";
Drug_Date = INTNX('MONTH', Drug_Date, 1);
Start_Taking_Date = Save_Start_Date;
Last_Taking_Date = Save_End_Date;
END;
END;
**------------------------------**;
** Load drug B hash table. **;
**------------------------------**;
&Db PUTLOG "&Nte2 ";
&Db PUTLOG "&Nte1 Loading Drug B";
DO WHILE(NOT End_Of_B);
SET Rx_Dates
END = End_Of_B;
WHERE Drug = 'B';
Save_Start_Date = Start_Taking_Date;
Save_End_Date = Last_Taking_Date;
&Db PUTLOG "&Nte2 Start processing " ID= Save_Start_Date= Save_End_Date= _RC_B=;
IF ID > Max_ID THEN
DO;
&Db PUTLOG "&Nte2 Changing Max_ID from to " Max_ID= ID=;
Max_ID = ID;
END;
Drug_Date = INTNX('MONTH', Start_Taking_Date, 0);
End_of_Taking = INTNX('MONTH', Last_Taking_Date, 1) - 1;
&Db PUTLOG "&Nte2 " ID= Drug_Date= End_of_Taking=;
DO WHILE(Drug_Date <= End_of_Taking);
&Db PUTLOG "&Nte2 " ID= "Iterate DO WHILE " Drug_Date= Start_Taking_Date= Last_Taking_Date= End_of_Taking= _RC_B=;
LINK Chk_Dates;
_RC_B = Hsh_B_Dates.ADD();
IF _RC_B > 0 THEN
DO;
&Db PUTLOG "&Nte2 " ID= " already exists " Drug_Date= Start_Taking_Date= Last_Taking_Date= End_of_Taking= _RC_B=;
Needs_Update = 0;
LINK Replace_B;
END;
ELSE
DO;
&Db PUTLOG "&Nte2 " ID= " Add Successful " Drug_Date= Start_Taking_Date= Last_Taking_Date= End_of_Taking= _RC_B=;
END;
&Db PUTLOG "&Nte2 ";
Drug_Date = INTNX('MONTH', Drug_Date, 1);
Start_Taking_Date = Save_Start_Date;
Last_Taking_Date = Save_End_Date;
END;
END;
**--------------------------------------------------------------**;
** Initialize the start and end of the period of interest. **;
**--------------------------------------------------------------**;
Start_Of_Period = INTNX('MONTH', &Start_Of_Period, 0);
End_Of_Period = INTNX('MONTH', &End_Of_Period, 1) - 1;
**--------------------------------------------------------------**;
** Create SAS data sets from drug tables for QC purposes. **;
**--------------------------------------------------------------**;
_RC_A = 0;
ID = 0;
Drug_Date = Start_Of_Period;
_RC_A = Hit_A_Dates.FIRST();
DO WHILE (_RC_A = 0);
OUTPUT A_Dates;
_RC_A = Hit_A_Dates.NEXT();
END;
_RC_B = 0;
ID = 0;
Drug_Date = Start_Of_Period;
_RC_B = Hit_B_Dates.FIRST();
DO WHILE (_RC_B = 0);
OUTPUT B_Dates;
_RC_B = Hit_B_Dates.NEXT();
END;
**--------------------------------------------------------------**;
** Evaluate Drug A and Drug B tables month by month by ID. **;
**--------------------------------------------------------------**;
&Db PUTLOG "&Nte2 ";
&Db PUTLOG "&Nte1 Evaluating Drug A and Drug B tables";
DO ID = 1 TO Max_ID;
Drug_Date = Start_Of_Period;
DO WHILE(Drug_Date <= End_Of_Period);
_RC_A = Hsh_A_Dates.FIND();
IF _RC_A = 0 THEN
DO;
A_Start_Date = Start_Taking_Date;
A_End_Date = Last_Taking_Date;
END;
ELSE
DO;
CALL MISSING(A_Start_Date, A_End_Date);
END;
_RC_B = Hsh_B_Dates.FIND();
IF _RC_B = 0 THEN
DO;
B_Start_Date = Start_Taking_Date;
B_End_Date = Last_Taking_Date;
END;
ELSE
DO;
CALL MISSING(B_Start_Date, B_End_Date);
END;
&Db PUTLOG "&Nte1 " ID= Drug_Date= End_Of_Period= _RC_A= _RC_B=;
IF _RC_A = 0 OR
_RC_B = 0 THEN
DO;
IF _RC_A > 0 AND
Extra_Days > 0 THEN
DO;
LINK A_Not_Found;
END;
ELSE
IF _RC_B > 0 AND
Extra_Days > 0 THEN
DO;
LINK B_Not_Found;
END;
IF NOT NMISS(B_Start_Date, A_Start_Date, B_End_Date, Extra_Days) AND
(B_Start_Date - Extra_Days <= A_Start_Date <= B_End_Date + Extra_Days) THEN
DO;
Both = 1;
&Db PUTLOG "&Nte2 Both. A_Start between B_Start and B_Stop " ID= A_Start_Date= A_End_Date= B_Start_Date= B_End_Date= Extra_Days=;
END;
ELSE
IF NOT NMISS(A_Start_Date, Extra_Days, B_Start_Date, A_End_Date) AND
(A_Start_Date - Extra_Days <= B_Start_Date <= A_End_Date + Extra_Days) THEN
DO;
Both = 1;
&Db PUTLOG "&Nte2 Both. B_Start between A_Start and A_End " ID= A_Start_Date= A_End_Date= B_Start_Date= B_End_Date= Extra_Days=;
END;
ELSE
IF NOT NMISS(B_Start_Date, Extra_Days, A_End_Date, B_End_Date, Extra_Days) AND
(B_Start_Date - Extra_Days <= A_End_Date <= B_End_Date + Extra_Days) THEN
DO;
Both = 1;
&Db PUTLOG "&Nte2 Both. A_End between B_start and B_stop " ID= A_Start_Date= A_End_Date= B_Start_Date= B_End_Date= Extra_Days=;
END;
ELSE
IF NOT NMISS(A_Start_Date, Extra_Days, B_End_Date, A_End_Date, Extra_Days) AND
(A_Start_Date - Extra_Days <= B_End_Date <= A_End_Date + Extra_Days) THEN
DO;
Both = 1;
&Db PUTLOG "&Nte2 Both. B_End between A_Start and A_End " ID= A_Start_Date= A_End_Date= B_Start_Date= B_End_Date= Extra_Days=;
END;
ELSE
DO;
Both = 0;
&Db PUTLOG "&Nte2 No match on dates " ID= A_Start_Date= A_End_Date= B_Start_Date= B_End_Date=;
END;
END;
ELSE
DO;
Both = 0;
&Db PUTLOG "&Nte2 Both months not populated " ID= Drug_Date= End_Of_Period= Both= _RC_A= _RC_B=;
END;
OUTPUT Rx_Period_Evaluated;
Drug_Date = INTNX('MONTH', Drug_Date, 1);
END;
END;
STOP;
*********;
Replace_A:
*********;
_RC_A = Hsh_A_Dates.FIND();
IF _RC_A > 0 THEN
DO;
&Db PUTLOG "&Nte2 " ID= " Error on FIND " Drug_Date= Start_Taking_Date= Last_Taking_Date= End_of_Taking= _RC_A=;
END;
LINK Updt_Dates;
IF Needs_Update THEN
DO;
_RC_A = Hsh_A_Dates.REPLACE();
IF _RC_A > 0 THEN
DO;
&Db PUTLOG "&Nte2 " ID= " Error on replace " Drug_Date= Start_Taking_Date= Last_Taking_Date= End_of_Taking= _RC_A=;
END;
ELSE
DO;
&Db PUTLOG "&Nte2 " ID= " Replace successful " Drug_Date= Start_Taking_Date= Last_Taking_Date= End_of_Taking= _RC_A=;
END;
END;
ELSE
DO;
&Db PUTLOG "&Nte2 " ID= " No update needed " Drug_Date= Start_Taking_Date= Last_Taking_Date= End_of_Taking= _RC_A=;
END;
******;
RETURN;
******;
*********;
Replace_B:
*********;
_RC_B = Hsh_B_Dates.FIND();
IF _RC_B > 0 THEN
DO;
PUTLOG "&Err1 " ID= " Error on FIND " Drug_Date= Start_Taking_Date= Last_Taking_Date= End_of_Taking= _RC_B=;
END;
LINK Updt_Dates;
IF Needs_Update THEN
DO;
_RC_B = Hsh_B_Dates.REPLACE();
IF _RC_B > 0 THEN
DO;
PUTLOG "&Err1 " ID= " Error on replace " Drug_Date= Start_Taking_Date= Last_Taking_Date= End_of_Taking= _RC_B=;
END;
ELSE
DO;
&Db PUTLOG "&Nte2 " ID= " Replace successful " Drug_Date= Start_Taking_Date= Last_Taking_Date= End_of_Taking= _RC_B=;
END;
END;
ELSE
DO;
&Db PUTLOG "&Nte2 " ID= " No update needed " Drug_Date= Start_Taking_Date= Last_Taking_Date= End_of_Taking= _RC_B=;
END;
******;
RETURN;
******;
*********;
Chk_Dates:
*********;
&Db PUTLOG "&Nte2 Chk_Dates Begin " ID= Start_Taking_Date= Last_Taking_Date= Save_Start_Date= Save_End_Date=;
IF Start_Taking_Date < Drug_Date THEN
Start_Taking_Date = Drug_Date;
IF Last_Taking_Date > INTNX('MONTH', Drug_Date, 1) - 1 THEN
Last_Taking_Date = INTNX('MONTH', Drug_Date, 1) - 1;
&Db PUTLOG "&Nte2 Chk_Dates End " ID= Start_Taking_Date= Last_Taking_Date= Save_Start_Date= Save_End_Date=;
******;
RETURN;
******;
**********;
Updt_Dates:
**********;
Temp_Start_Date = Start_Taking_Date;
Temp_End_Date = Last_Taking_Date;
IF Save_Start_Date < Start_Taking_Date THEN
DO;
&Db PUTLOG "&Nte2 Update start before " ID= Start_Taking_Date= Last_Taking_Date= Save_Start_Date= Save_End_Date=;
Start_Taking_Date = Save_Start_Date;
LINK Chk_Dates;
IF Start_Taking_Date < Temp_Start_Date THEN
Needs_Update = 1;
&Db PUTLOG "&Nte2 Update start after " ID= Start_Taking_Date= Last_Taking_Date= Save_Start_Date= Save_End_Date= Needs_Update=;
END;
IF Save_End_Date > Last_Taking_Date THEN
DO;
&Db PUTLOG "&Nte2 Update end before " ID= Start_Taking_Date= Last_Taking_Date= Save_Start_Date= Save_End_Date=;
Last_Taking_Date = Save_End_Date;
LINK Chk_Dates;
IF Last_Taking_Date > Temp_End_Date THEN
Needs_Update = 1;
&Db PUTLOG "&Nte2 Update end after " ID= Start_Taking_Date= Last_Taking_Date= Save_Start_Date= Save_End_Date= Needs_Update=;
END;
******;
RETURN;
******;
***********;
A_Not_Found:
***********;
** Save keys. **;
Save_Drug_Date = Drug_Date;
** Get start date from prior end date. **;
Drug_Date = INTNX('MONTH', Drug_Date, -1);
_RC_A = Hsh_A_Dates.FIND();
IF _RC_A = 0 THEN
DO;
&Db PUTLOG "&Nte2 Prior A found " ID= Drug_Date= Start_Taking_Date= Last_Taking_Date= A_Start_Date= A_End_Date=;
A_Start_Date = Last_Taking_Date;
END;
ELSE
DO;
&Db PUTLOG "&Nte2 Prior A not found " ID= Drug_Date= Start_Taking_Date= Last_Taking_Date= A_Start_Date= A_End_Date=;
CALL MISSING(A_Start_Date);
END;
Drug_Date = Save_Drug_Date;
** Get end date from next start date. **;
Drug_Date = INTNX('MONTH', Drug_Date, 1);
_RC_A = Hsh_A_Dates.FIND();
IF _RC_A = 0 THEN
DO;
&Db PUTLOG "&Nte2 Next A found " ID= Drug_Date= Start_Taking_Date= Last_Taking_Date= A_Start_Date= A_End_Date=;
A_End_Date = Start_Taking_Date;
END;
ELSE
DO;
&Db PUTLOG "&Nte2 Next A not found " ID= Drug_Date= Start_Taking_Date= Last_Taking_Date= A_Start_Date= A_End_Date=;
CALL MISSING(A_End_Date);
END;
Drug_Date = Save_Drug_Date;
******;
RETURN;
******;
***********;
B_Not_Found:
***********;
** Save keys. **;
Save_Drug_Date = Drug_Date;
** Get start date from prior end date. **;
Drug_Date = INTNX('MONTH', Drug_Date, -1);
_RC_B = Hsh_B_Dates.FIND();
IF _RC_B = 0 THEN
DO;
&Db PUTLOG "&Nte2 Prior B found " ID= Drug_Date= Start_Taking_Date= Last_Taking_Date= B_Start_Date= B_End_Date=;
B_Start_Date = Last_Taking_Date;
END;
ELSE
DO;
&Db PUTLOG "&Nte2 Prior B not found " ID= Drug_Date= Start_Taking_Date= Last_Taking_Date= B_Start_Date= B_End_Date=;
CALL MISSING(B_Start_Date);
END;
Drug_Date = Save_Drug_Date;
** Get end date from next start date. **;
Drug_Date = INTNX('MONTH', Drug_Date, 1);
_RC_B = Hsh_B_Dates.FIND();
IF _RC_B = 0 THEN
DO;
&Db PUTLOG "&Nte2 Next B found " ID= Drug_Date= Start_Taking_Date= Last_Taking_Date= B_Start_Date= B_End_Date=;
B_End_Date = Start_Taking_Date;
END;
ELSE
DO;
&Db PUTLOG "&Nte2 Next B not found " ID= Drug_Date= Start_Taking_Date= Last_Taking_Date= B_Start_Date= B_End_Date=;
CALL MISSING(B_End_Date);
END;
Drug_Date = Save_Drug_Date;
******;
RETURN;
******;
RUN;
**------------------------------------------------------------------------------**;
Did that latest version of the code work? It should be able to accommodate all of the conditions in the data I've seen so far.
Jim
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.