Hi Jim,
I am so sorry for my late response.
I think your code that was posted on 9/26/2020 12:22pm matches best with my desired result. The only issue with this code is that the ID in my test data is from 1 and it is continuous, 1, 2, 3, and 4. However, the ID in my real data is not continuous and not start from 1. It is like 6, 10, 21, 23, 24, 29, and so on. How can I fix the code to accommodate it to my real data in terms of ID? Thank you so much!
Back to the table you put together by hand (I really appreciate that you did this for checking!!!). I think we have no problem with definition 1 since it is straightforward. I know definition 2 is very tricky and needs to have a better explanation. Let think in this way for definition 2.Drug A and B are not allowed to take simultaneously based on clinical guideline. Both drug A and B have a delayed effect (not an effect that could affect the previous drug), which means that blood concentration for each of them can still be tested even 3 days AFTER the last taking date of the other drug, whichever drug takes first. For ID=1, the actual effect for drug B is from 1/1/2017 to 7/2/2017 (3days after 6/29/2017). The actual effect of drug A of the 2nd prescription for ID=1 is from 2/1/2017 to 4/4/4/2017, the 3rd is from 3/1/2017 to 5/2/2017 (3 days after 4/29), and the 4th is from 4/1/2017 to 6/2/2017 (3 days after 5/20/2017). The actual period that drug B actually had effect on ID=1 covered the period that drug A was taking, and the overlapping use happened in 2017-02, 03, 04, 05, and 06, but not 2017-01. This is because ID=1 only took drug B but not drug A, so there was no overlapping use happened in 2017-01. So, same for ID=4, the 1st prescription should be considered from 1/4/2017 to 3/3/2017, and the 3rd prescription should be from 3/7/2017 to 5/3/2017, so the overlapping use for ID=1 happened in 2017-02, 03, and 05. For ID=2, I think the overlapping use happened in 2017-06 and 07. For ID=3, I think the overlapping use happened in 2017-09, 10, 11 and 12. Do you agree with me?
I think one solution is to create a new variable titled last_taking_date_Ext by simply adding 3 to last_taking_take (last_taking_date_Ext=last_taking_date + 3), and this last_taking_date_Ext that represents the actual last date for a specific drug is going to be replaced all the last_taking_date in the code when applying definition 2. Does it make sense?
Many many thanks for your effort and time again!!! I really this great learning opportunity!!!
Best regards,
C
Ah. I think I see. The drug's effect last three days after the individual physically stops taking the drug. So, really only the Last_Taking_Date is affected. I was adjusting the Start_Taking_Date by -3 as well as the Last_Taking_Date by +3. I should only be adjusting the Last_Taking_Date by +3. Let me revise the code and re-run.
Let me think about the idea of discontinuous ID variables. I think it shouldn't be too hard to accommodate them, but it will require an adjustment to the code. Can you post another set of test data with discontinuous ID variables? I can do it, but it's better if you do it since you understand the data and the requirements.
Jim
Hi Jim,
I think I know what is going on there. By using your code, SAS provides 0/1 for variable-Both from ID=1 until the max ID the dataset has, regardless whether any ID between 1 to max ID exists or not in the dataset. For example, if the dataset has ID=3, 6, 12, and 15 (only 4 people in the dataset), SAS will provide results for ID=1 to 15, even though 1, 2, 4, 5, 7, 8, 9, 10, 11, 13, and 14 are not in the dataset. To solve this issue, can I create a hash table including the IDs the dataset actually has and use that hash table to find the IDs that I am interested in from the result table?
I have another question. I am not very familiar with macro variable, so when I substitute your code with my actual variable name, SAS log showed apparent symbolic is not resolved. I think this issue happened when I was trying to replace drug 'A' and 'B' with my actual drug name. What else code should I adjust as well when I want to replace A with antidepressant and replace B with sedative? Should I replace the part inside of '&Db PUTLOG "&Nte1 Evaluating Drug A and Drug B tables"; '? I know I shouldn't have asked this simple question, I just want to make sure I understand your code.
I know I have raised so many questions on this project. I really don't want to lose this rare learning opportunity to extend my SAS skills! Again and Again, I really appreciate your help a lot!!!
Best regards,
C
Cynthia,
You are quite right about the ID issue. The program as it was would create data for ID's across a continuous range. If you only had ID's 1 and 13, the program would have created "dummy" data for non-existent ID's 2 through 12. Your idea of using a Hash table is exactly right -- and exactly how I solved the problem yesterday after last we spoke. See new code, below.
I have a new version of the program, and it will only process ID's that actually exist in the input data, using a hash table to control which ID's are processed. The new program is also adjusted to only consider three extra days after the Last_Taking_Date but not before the Start_Taking_Date. Please see the code below, and, below the code, the side by side Definition 1 and Definition 2 results.
I'm not sure what kind of macro variable problem you're having. Make sure you have the following in your code:
%LET Nte1 = NOTE:;
%LET Nte2 = NOTE-;
These are in my autoexec.sas and won't show up in the code I post below. If you're still having a problem after you make sure the above two %LET statements are in your code, then post the log messages. I would need to see exactly what SAS is complaining about.
The names of the drugs should not matter. The program will use whatever is in the data.
Jim
**------------------------------------------------------------------------------**;
** Debug option. Asterisk is no debugging. Blank for debugging. **;
%LET Db = ;
** Start producing results as of this month **;
%LET Start_Of_Period = '01JAN2017'd;
** Stop producing results as of this month **;
%LET End_Of_Period = '30APR2018'd;
** Extend the end of the overlap period by this number of days. **;
%LET Extra_Days = 3;
**------------------------------------------------------------------------------**;
*LIBNAME DrugDate XLSX "&Path\SampleData\test_CW_20200921_M.xlsx";
LIBNAME DrugDate XLSX "&Path\SampleData\test_CW_20200926_M.xlsx";
**------------------------------------------------------------------------------**;
DATA WORK.Rx_Dates;
SET DrugDate.'#LN00067'n;
RUN;
**------------------------------------------------------------------------------**;
DATA Rx_Period_Evaluated (KEEP=ID
Drug_Date Both)
ID_Table(KEEP=ID)
A_Dates (KEEP=ID Drug_Date Start_Taking_Date Last_Taking_Date)
B_Dates (KEEP=ID Drug_Date Start_Taking_Date Last_Taking_Date)
;
LENGTH ID 8;
LENGTH Drug_Date 8.;
FORMAT Drug_Date YYMMDDS10.;
FORMAT Save_Drug_Date YYMMDDS10.;
LENGTH End_of_Taking 8.;
FORMAT End_of_Taking MMDDYYS10.;
LENGTH Start_of_Period 8.;
FORMAT Start_of_Period MMDDYYS10.;
LENGTH End_of_Period 8.;
FORMAT End_of_Period MMDDYYS10.;
LENGTH Save_Start_Date 8.;
FORMAT Save_Start_Date MMDDYYS10.;
LENGTH Save_End_Date 8.;
FORMAT Save_End_Date MMDDYYS10.;
LENGTH Temp_Start_Date 8.;
FORMAT Temp_Start_Date MMDDYYS10.;
LENGTH Temp_End_Date 8.;
FORMAT Temp_End_Date MMDDYYS10.;
FORMAT A_Start_Date MMDDYYS10.;
FORMAT A_End_Date MMDDYYS10.;
FORMAT B_Start_Date MMDDYYS10.;
FORMAT B_End_Date MMDDYYS10.;
RETAIN Extra_Days 0;
**------------------------------**;
** Define hash tables. **;
**------------------------------**;
IF _N_ = 1 THEN
DO;
Extra_Days = &Extra_Days;
CALL MISSING (Drug_Date);
DECLARE HASH Hsh_ID_Table (ORDERED: 'A', MULTIDATA: 'N', DATASET: 'WORK.Rx_Dates');
Hsh_ID_Table.DefineKey ('ID');
Hsh_ID_Table.DefineData ('ID');
Hsh_ID_Table.DefineDone ();
DECLARE HITER Hit_ID_Table('Hsh_ID_Table');
DECLARE HASH Hsh_A_Dates (ORDERED: 'A', MULTIDATA: 'N');
Hsh_A_Dates.DefineKey ('ID', 'Drug_Date');
Hsh_A_Dates.DefineData ('ID', 'Drug_Date', 'Start_Taking_Date', 'Last_Taking_Date');
Hsh_A_Dates.DefineDone ();
DECLARE HITER Hit_A_Dates('Hsh_A_Dates');
DECLARE HASH Hsh_B_Dates (ORDERED: 'A', MULTIDATA: 'N');
Hsh_B_Dates.DefineKey ('ID', 'Drug_Date');
Hsh_B_Dates.DefineData ('ID', 'Drug_Date', 'Start_Taking_Date', 'Last_Taking_Date');
Hsh_B_Dates.DefineDone ();
DECLARE HITER Hit_B_Dates('Hsh_B_Dates');
ID_Cnt = Hsh_ID_Table.num_items;
&Db PUTLOG "&Nte1 The number of IDs loaded is " ID_Cnt=;
END;
**------------------------------**;
** Load drug A hash table. **;
**------------------------------**;
&Db PUTLOG "&Nte2 ";
&Db PUTLOG "&Nte1 Loading Drug A";
DO WHILE(NOT End_Of_A);
SET WORK.Rx_Dates
END = End_Of_A;
WHERE Drug = 'A';
Save_Start_Date = Start_Taking_Date;
Save_End_Date = Last_Taking_Date;
&Db PUTLOG "&Nte2 Start processing " ID= Save_Start_Date= Save_End_Date= _RC_A=;
Drug_Date = INTNX('MONTH', Start_Taking_Date, 0);
End_of_Taking = INTNX('MONTH', Last_Taking_Date, 1) - 1;
&Db PUTLOG "&Nte2 " ID= Drug_Date= End_of_Taking=;
DO WHILE(Drug_Date <= End_of_Taking);
&Db PUTLOG "&Nte2 " ID= "Iterate DO WHILE " Drug_Date= Start_Taking_Date= Last_Taking_Date= End_of_Taking= _RC_A=;
LINK Chk_Dates;
_RC_A = Hsh_A_Dates.ADD();
IF _RC_A > 0 THEN
DO;
&Db PUTLOG "&Nte2 " ID= " already exists " Drug_Date= Start_Taking_Date= Last_Taking_Date= End_of_Taking= _RC_A=;
Needs_Update = 0;
LINK Replace_A;
END;
ELSE
DO;
&Db PUTLOG "&Nte2 " ID= " Add Successful " Drug_Date= Start_Taking_Date= Last_Taking_Date= End_of_Taking= _RC_A=;
END;
&Db PUTLOG "&Nte2 ";
Drug_Date = INTNX('MONTH', Drug_Date, 1);
Start_Taking_Date = Save_Start_Date;
Last_Taking_Date = Save_End_Date;
END;
END;
**------------------------------**;
** Load drug B hash table. **;
**------------------------------**;
&Db PUTLOG "&Nte2 ";
&Db PUTLOG "&Nte1 Loading Drug B";
DO WHILE(NOT End_Of_B);
SET WORK.Rx_Dates
END = End_Of_B;
WHERE Drug = 'B';
Save_Start_Date = Start_Taking_Date;
Save_End_Date = Last_Taking_Date;
&Db PUTLOG "&Nte2 Start processing " ID= Save_Start_Date= Save_End_Date= _RC_B=;
Drug_Date = INTNX('MONTH', Start_Taking_Date, 0);
End_of_Taking = INTNX('MONTH', Last_Taking_Date, 1) - 1;
&Db PUTLOG "&Nte2 " ID= Drug_Date= End_of_Taking=;
DO WHILE(Drug_Date <= End_of_Taking);
&Db PUTLOG "&Nte2 " ID= "Iterate DO WHILE " Drug_Date= Start_Taking_Date= Last_Taking_Date= End_of_Taking= _RC_B=;
LINK Chk_Dates;
_RC_B = Hsh_B_Dates.ADD();
IF _RC_B > 0 THEN
DO;
&Db PUTLOG "&Nte2 " ID= " already exists " Drug_Date= Start_Taking_Date= Last_Taking_Date= End_of_Taking= _RC_B=;
Needs_Update = 0;
LINK Replace_B;
END;
ELSE
DO;
&Db PUTLOG "&Nte2 " ID= " Add Successful " Drug_Date= Start_Taking_Date= Last_Taking_Date= End_of_Taking= _RC_B=;
END;
&Db PUTLOG "&Nte2 ";
Drug_Date = INTNX('MONTH', Drug_Date, 1);
Start_Taking_Date = Save_Start_Date;
Last_Taking_Date = Save_End_Date;
END;
END;
**--------------------------------------------------------------**;
** Initialize the start and end of the period of interest. **;
**--------------------------------------------------------------**;
Start_Of_Period = INTNX('MONTH', &Start_Of_Period, 0);
End_Of_Period = INTNX('MONTH', &End_Of_Period, 1) - 1;
**--------------------------------------------------------------**;
** Create SAS data sets from the hash tables for QC purposes. **;
**--------------------------------------------------------------**;
_RC_ID = Hit_ID_Table.FIRST();
DO WHILE (_RC_ID = 0);
OUTPUT ID_Table;
_RC_ID = Hit_ID_Table.NEXT();
END;
_RC_A = Hit_A_Dates.FIRST();
DO WHILE (_RC_A = 0);
OUTPUT A_Dates;
_RC_A = Hit_A_Dates.NEXT();
END;
_RC_B = Hit_B_Dates.FIRST();
DO WHILE (_RC_B = 0);
OUTPUT B_Dates;
_RC_B = Hit_B_Dates.NEXT();
END;
**--------------------------------------------------------------**;
** Evaluate Drug A and Drug B tables month by month by ID. **;
**--------------------------------------------------------------**;
&Db PUTLOG "&Nte2 ";
&Db PUTLOG "&Nte1 Evaluating Drug A and Drug B tables";
_RC_ID = Hit_ID_Table.FIRST();
DO WHILE (_RC_ID = 0);
Drug_Date = Start_Of_Period;
DO WHILE(Drug_Date <= End_Of_Period);
Both = 0;
CALL MISSING(A_Start_Date, A_End_Date);
CALL MISSING(B_Start_Date, B_End_Date);
_RC_A = Hsh_A_Dates.FIND();
IF _RC_A = 0 THEN
DO;
A_Start_Date = Start_Taking_Date;
A_End_Date = Last_Taking_Date;
END;
ELSE
DO;
CALL MISSING(A_Start_Date, A_End_Date);
END;
_RC_B = Hsh_B_Dates.FIND();
IF _RC_B = 0 THEN
DO;
B_Start_Date = Start_Taking_Date;
B_End_Date = Last_Taking_Date;
END;
ELSE
DO;
CALL MISSING(B_Start_Date, B_End_Date);
END;
&Db PUTLOG "&Nte1 " ID= Drug_Date= End_Of_Period= _RC_A= _RC_B= A_Start_Date= A_End_Date= B_Start_Date= B_End_Date=;
LINK Chk_Date_Match;
IF Extra_Days > 0 AND
Both = 0 THEN
DO;
IF _RC_A > 0 THEN
DO;
LINK Chk_Prior_A;
LINK Chk_Date_Match;
END;
IF _RC_B > 0 AND
Both = 0 THEN
DO;
LINK Chk_Prior_B;
LINK Chk_Date_Match;
END;
IF Both = 0 THEN
DO;
LINK Chk_Prior_A;
LINK Chk_Date_Match;
END;
IF Both = 0 THEN
DO;
LINK Chk_Prior_B;
LINK Chk_Date_Match;
END;
END;
*&Db PUTLOG "&Nte2 Both months not populated " ID= Drug_Date= End_Of_Period= Both= _RC_A= _RC_B=;
OUTPUT Rx_Period_Evaluated;
Drug_Date = INTNX('MONTH', Drug_Date, 1);
END;
_RC_ID = Hit_ID_Table.NEXT();
END;
STOP;
*********;
Replace_A:
*********;
_RC_A = Hsh_A_Dates.FIND();
IF _RC_A > 0 THEN
DO;
&Db PUTLOG "&Nte2 " ID= " Error on FIND " Drug_Date= Start_Taking_Date= Last_Taking_Date= End_of_Taking= _RC_A=;
END;
LINK Updt_Dates;
IF Needs_Update THEN
DO;
_RC_A = Hsh_A_Dates.REPLACE();
IF _RC_A > 0 THEN
DO;
&Db PUTLOG "&Nte2 " ID= " Error on replace " Drug_Date= Start_Taking_Date= Last_Taking_Date= End_of_Taking= _RC_A=;
END;
ELSE
DO;
&Db PUTLOG "&Nte2 " ID= " Replace successful " Drug_Date= Start_Taking_Date= Last_Taking_Date= End_of_Taking= _RC_A=;
END;
END;
ELSE
DO;
&Db PUTLOG "&Nte2 " ID= " No update needed " Drug_Date= Start_Taking_Date= Last_Taking_Date= End_of_Taking= _RC_A=;
END;
******;
RETURN;
******;
*********;
Replace_B:
*********;
_RC_B = Hsh_B_Dates.FIND();
IF _RC_B > 0 THEN
DO;
PUTLOG "&Err1 " ID= " Error on FIND " Drug_Date= Start_Taking_Date= Last_Taking_Date= End_of_Taking= _RC_B=;
END;
LINK Updt_Dates;
IF Needs_Update THEN
DO;
_RC_B = Hsh_B_Dates.REPLACE();
IF _RC_B > 0 THEN
DO;
PUTLOG "&Err1 " ID= " Error on replace " Drug_Date= Start_Taking_Date= Last_Taking_Date= End_of_Taking= _RC_B=;
END;
ELSE
DO;
&Db PUTLOG "&Nte2 " ID= " Replace successful " Drug_Date= Start_Taking_Date= Last_Taking_Date= End_of_Taking= _RC_B=;
END;
END;
ELSE
DO;
&Db PUTLOG "&Nte2 " ID= " No update needed " Drug_Date= Start_Taking_Date= Last_Taking_Date= End_of_Taking= _RC_B=;
END;
******;
RETURN;
******;
*********;
Chk_Dates:
*********;
&Db PUTLOG "&Nte2 Chk_Dates Begin " ID= Start_Taking_Date= Last_Taking_Date= Save_Start_Date= Save_End_Date=;
IF Start_Taking_Date < Drug_Date THEN
Start_Taking_Date = Drug_Date;
IF Last_Taking_Date > INTNX('MONTH', Drug_Date, 1) - 1 THEN
Last_Taking_Date = INTNX('MONTH', Drug_Date, 1) - 1;
&Db PUTLOG "&Nte2 Chk_Dates End " ID= Start_Taking_Date= Last_Taking_Date= Save_Start_Date= Save_End_Date=;
******;
RETURN;
******;
**********;
Updt_Dates:
**********;
Temp_Start_Date = Start_Taking_Date;
Temp_End_Date = Last_Taking_Date;
IF Save_Start_Date < Start_Taking_Date THEN
DO;
&Db PUTLOG "&Nte2 Update start before " ID= Start_Taking_Date= Last_Taking_Date= Save_Start_Date= Save_End_Date=;
Start_Taking_Date = Save_Start_Date;
LINK Chk_Dates;
IF Start_Taking_Date < Temp_Start_Date THEN
Needs_Update = 1;
&Db PUTLOG "&Nte2 Update start after " ID= Start_Taking_Date= Last_Taking_Date= Save_Start_Date= Save_End_Date= Needs_Update=;
END;
IF Save_End_Date > Last_Taking_Date THEN
DO;
&Db PUTLOG "&Nte2 Update end before " ID= Start_Taking_Date= Last_Taking_Date= Save_Start_Date= Save_End_Date=;
Last_Taking_Date = Save_End_Date;
LINK Chk_Dates;
IF Last_Taking_Date > Temp_End_Date THEN
Needs_Update = 1;
&Db PUTLOG "&Nte2 Update end after " ID= Start_Taking_Date= Last_Taking_Date= Save_Start_Date= Save_End_Date= Needs_Update=;
END;
******;
RETURN;
******;
***********;
Chk_Prior_A:
***********;
** Save keys. **;
Save_Drug_Date = Drug_Date;
** Get start date from prior end date. **;
Drug_Date = INTNX('MONTH', Drug_Date, -1);
_RC_A = Hsh_A_Dates.FIND();
IF _RC_A = 0 THEN
DO;
&Db PUTLOG "&Nte2 Prior A found " ID= Drug_Date= Start_Taking_Date= Last_Taking_Date= A_Start_Date= A_End_Date= Extra_Days=;
A_Start_Date = Last_Taking_Date + Extra_Days;
END;
ELSE
DO;
&Db PUTLOG "&Nte2 Prior A not found " ID= Drug_Date= Start_Taking_Date= Last_Taking_Date= A_Start_Date= A_End_Date=;
CALL MISSING(A_Start_Date);
END;
Drug_Date = Save_Drug_Date;
******;
RETURN;
******;
***********;
Chk_Prior_B:
***********;
** Save keys. **;
Save_Drug_Date = Drug_Date;
** Get start date from prior end date. **;
Drug_Date = INTNX('MONTH', Drug_Date, -1);
_RC_B = Hsh_B_Dates.FIND();
IF _RC_B = 0 THEN
DO;
&Db PUTLOG "&Nte2 Prior B found " ID= Drug_Date= Start_Taking_Date= Last_Taking_Date= B_Start_Date= B_End_Date= Extra_Days=;
B_Start_Date = Last_Taking_Date + Extra_Days;
END;
ELSE
DO;
&Db PUTLOG "&Nte2 Prior B not found " ID= Drug_Date= Start_Taking_Date= Last_Taking_Date= B_Start_Date= B_End_Date=;
CALL MISSING(B_Start_Date);
END;
Drug_Date = Save_Drug_Date;
******;
RETURN;
******;
**************;
Chk_Date_Match:
**************;
IF NOT NMISS(B_Start_Date, A_Start_Date, B_End_Date, Extra_Days) AND
(B_Start_Date <= A_Start_Date <= B_End_Date + Extra_Days) THEN
DO;
Both = 1;
&Db PUTLOG "&Nte2 Both. A_Start between B_Start and B_Stop " ID= A_Start_Date= A_End_Date= B_Start_Date= B_End_Date= Extra_Days=;
END;
ELSE
IF NOT NMISS(A_Start_Date, B_Start_Date, A_End_Date, Extra_Days) AND
(A_Start_Date <= B_Start_Date <= A_End_Date + Extra_Days) THEN
DO;
Both = 1;
&Db PUTLOG "&Nte2 Both. B_Start between A_Start and A_End " ID= A_Start_Date= A_End_Date= B_Start_Date= B_End_Date= Extra_Days=;
END;
ELSE
IF NOT NMISS(B_Start_Date, A_End_Date, B_End_Date, Extra_Days) AND
(B_Start_Date <= A_End_Date <= B_End_Date + Extra_Days) THEN
DO;
Both = 1;
&Db PUTLOG "&Nte2 Both. A_End between B_start and B_stop " ID= A_Start_Date= A_End_Date= B_Start_Date= B_End_Date= Extra_Days=;
END;
ELSE
IF NOT NMISS(A_Start_Date, B_End_Date, A_End_Date, Extra_Days) AND
(A_Start_Date <= B_End_Date <= A_End_Date + Extra_Days) THEN
DO;
Both = 1;
&Db PUTLOG "&Nte2 Both. B_End between A_Start and A_End " ID= A_Start_Date= A_End_Date= B_Start_Date= B_End_Date= Extra_Days=;
END;
ELSE
DO;
&Db PUTLOG "&Nte2 No match on dates " ID= A_Start_Date= A_End_Date= B_Start_Date= B_End_Date=;
END;
******;
RETURN;
******;
RUN;
**------------------------------------------------------------------------------**;
Hi Jim,
I am so so so sorry for my delayed response. Our medical school's remote server had an issue in a change they made to improve internet browsing, which prevented users from accessing outside websites. The new platform has just be fixed. Now I am able to login my SAS community account and have a chance to try your new code.
Yeah!!! The code in your last response works very well!!! Hats off to you! You are a REAL SAS MASTER!!! I appreciate it a LOT!!! Without your help, my project couldn't be completed. Also, I really appreciate this learning experience. Again, I am very grateful for your GREAT instruction and advice!!!
Best regards,
Cynthia
Hi, @CynthiaWei,
It's good to hear from you. I'm sorry to hear that you've had server problems.
That's great! 🙂 🙂 🙂 I'm excited to hear that the program is doing what you needed it to do. I know we both spent a lot of time discussing what it needed to do. It was a little bit tricky to program, so I'm glad that in the end it was successful.
My best for your success in school,
Jim
You have not show the desired output format;
I have run next code:
data have;
infile cards dlm='09'x;
informat Start_taking_Date last_taking_date mmddyy. ;
input ID
No_of_Prescription
Start_taking_Date
Day_Supply
last_taking_date
drug $;
cards;
1 1 1/2/2017 3 1/4/2017 A
1 2 1/4/2017 30 2/2/2017 B
1 3 1/4/2017 30 2/2/2017 A
1 4 1/4/2017 30 2/2/2017 A
1 5 4/7/2017 10 4/16/2017 B
1 6 4/7/2017 10 4/16/2017 A
1 7 4/14/2017 30 5/13/2017 A
1 8 4/17/2017 30 5/16/2017 A
1 9 5/9/2017 30 6/7/2017 B
1 10 5/16/2017 30 6/14/2017 A
1 11 1/17/2018 60 3/17/2018 B
1 12 1/23/2018 30 2/21/2018 A
1 13 2/13/2018 30 3/14/2018 A
1 14 2/15/2018 30 3/16/2018 B
1 15 3/14/2018 30 4/12/2018 A
1 16 3/14/2018 30 4/12/2018 A
1 17 4/1/2018 30 4/30/2018 B
2 3 2/9/2017 30 3/10/2017 B
2 4 2/9/2017 30 3/10/2017 B
2 5 2/16/2017 30 3/17/2017 A
2 6 3/13/2017 30 4/11/2017 B
2 7 3/13/2017 30 4/11/2017 B
2 8 3/17/2017 30 4/15/2017 A
2 9 4/10/2017 30 5/9/2017 B
2 10 4/15/2017 30 5/14/2017 A
2 11 5/14/2017 352 4/30/2018 B
2 12 5/14/2017 30 6/12/2017 A
; run;
data have_A have_B;
drop Start_taking_Date last_taking_date;
set have;
do day = Start_taking_Date to last_taking_date; end;
if drug = 'A' then output have_A;
else output have_B;
run;
proc sort data=have_A; by id day; run;
proc sort data=have_B; by id day; run;
data want;
merge have_A (in=inA rename=(day_supply=daysupA drug=drugA))
have_B (in=inB rename=(day_supply=daysupB drug=drugB));
by id day;
if inA and inB;
drugs = catx('+',drugA,drugB);
drop drugA drugB;
format day date9.;
run;
and got next results:
Hi,
Thank you so much for your code!
What my desired output is just exactly like Jim's result table.
I tried your code and didn't get your result table. What I got is that I can create have_A and have_B but failed to create dataset-want. It seems SAS has the same issue when I tried Jim's code. This is what SAS log provided (I am using a simpler dataset attached below):
NOTE: There were 6 observations read from the data set WORK.HAVE_A.
NOTE: There were 7 observations read from the data set WORK.HAVE_B.
NOTE: The data set WORK.WANT has 0 observations and 6 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
Do you have any sense what is going wrong here?
Thank you very much!
Best regards,
C
I think you can do it something like this. First, create a table of overlaps:
proc sql;
create table overlap as select distinct
A.ID,
max(A.Start_taking_Date,B.Start_taking_Date) as start format=date9.,
min(A.last_taking_date,B.last_taking_date) as end format=date9.
from
have A join
have B on A.ID=B.ID
where A.drug='A'
and B.drug='B'
having start<=end
order by 1,2,3
;
quit;
Then use a data step to get the wanted values (the LAST_DATE macro variable is because you do not want anything after 2018-04):
%let last_date='01may2018'd;
data want;
set overlap;
by ID;
if first.ID or month<intnx('month',start,0) then month=intnx('month',start,0);
do while(month<=min(intnx('month',end,0),&last_date));
output;
month=intnx('month',month,1);
end;
retain month;
keep id month;
format month yymmd7.;
run;
If you need the second definition of overlaps, it can be done like this. First, change the HAVING clause in the SQL:
proc sql;
create table overlap as select distinct
A.ID,
max(A.Start_taking_Date,B.Start_taking_Date) as start format=date9.,
min(A.last_taking_date,B.last_taking_date) as end format=date9.
from
have A join
have B on A.ID=B.ID
where A.drug='A'
and B.drug='B'
having start<=end+3
order by 1,2,3
;
quit;
And then the data step, correct if START is after END:
%let last_date='01may2018'd;
data want;
set overlap;
by ID;
if start>end then
start=start-3;
if first.ID or month<intnx('month',start,0) then month=intnx('month',start,0);
do while(month<=min(intnx('month',end,0),&last_date));
output;
month=intnx('month',month,1);
end;
retain month;
keep id month;
format month yymmd7.;
run;
Using an array :
data have;
input ID No_of_Prescription Start_Date :mmddyy10. Day_Supply last_date :mmddyy10. drug $;
datalines;
1 1 1/2/2017 3 1/4/2017 A
1 2 1/4/2017 30 2/2/2017 B
1 3 1/4/2017 30 2/2/2017 A
1 4 1/4/2017 30 2/2/2017 A
1 5 4/7/2017 10 4/16/2017 B
1 6 4/7/2017 10 4/16/2017 A
1 7 4/14/2017 30 5/13/2017 A
1 8 4/17/2017 30 5/16/2017 A
1 9 5/9/2017 30 6/7/2017 B
1 10 5/16/2017 30 6/14/2017 A
1 11 1/17/2018 60 3/17/2018 B
1 12 1/23/2018 30 2/21/2018 A
1 13 2/13/2018 30 3/14/2018 A
1 14 2/15/2018 30 3/16/2018 B
1 15 3/14/2018 30 4/12/2018 A
1 16 3/14/2018 30 4/12/2018 A
1 17 4/1/2018 30 4/30/2018 B
2 3 2/9/2017 30 3/10/2017 B
2 4 2/9/2017 30 3/10/2017 B
2 5 2/16/2017 30 3/17/2017 A
2 6 3/13/2017 30 4/11/2017 B
2 7 3/13/2017 30 4/11/2017 B
2 8 3/17/2017 30 4/15/2017 A
2 9 4/10/2017 30 5/9/2017 B
2 10 4/15/2017 30 5/14/2017 A
2 11 5/14/2017 352 4/30/2018 B
2 12 5/14/2017 30 6/12/2017 A
;
proc sort data=have; by id drug start_date; run;
proc sql;
select intnx("day", min(start_date), -3), max(last_date)
into :arrayStart, :arrayLast
from have;
quit;
data overlaps;
array d {&arrayStart:&arrayLast} $1 _temporary_;
do until(last.id);
set have; by id;
do dt = intnx("day", start_date, -3) to last_date;
if missing(d{dt}) then d{dt} = drug;
else if d{dt} ne drug then output;
end;
end;
format dt yymmdd10.;
keep id dt;
run;
proc sql;
select unique
id,
intnx("month", dt, 0) as overlapMonth format=yymmd7.
from overlaps;
quit;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.