DATA Step, Macro, Functions and more

Concomitant medication use

Accepted Solution Solved
Reply
Contributor m_o
Contributor
Posts: 30
Accepted Solution

Concomitant medication use

Hello,

I am trying to define concomitant use of drugs. 

I have a great solution to my problem with a SAS data step at the link below:

https://communities.sas.com/t5/Base-SAS-Programming/Concomitant-drug-medication-use/m-p/351710#M8188...

However, the only issue is that I have a very large dataset (~600K rows) and those codes require a ton of memory and my computer crashes. I wonder if anybody could suggest an alternative, perhaps a proc sql solution, that would not require as much memory and computational capacity. 

Thank you very much! Any help would be greatly appreciated!

 

Katie

 


Accepted Solutions
Solution
‎05-01-2017 11:01 AM
Frequent Contributor
Posts: 93

Re: Concomitant medication use

This is not exactly elegant but it works:

 

data have;  
input ID DRUG $ START_DT MMDDYY10. DAYS_SUPP  END_DT MMDDYY10. ;
FORMAT START_DT END_DT MMDDYY10.;
cards; 
1	A	2/17/10	30	3/19/10
1	B	5/6/09	30	6/5/09
1	C	7/9/11	60	9/7/11
1	E	3/1/10	90	5/30/10
1	B	1/1/09	90	4/1/09
1	D	2/1/09	30	3/3/09
1	C	5/6/12	90	8/4/12
2	B	4/1/12	60	5/31/12
2	A	7/1/10	30	7/31/10
2	C	8/3/10	90	11/1/10
2	D	11/1/13	90	1/30/14
2	E	12/5/13	90	3/5/14
2	A	2/1/11	90	5/2/11
;
RUN;

PROC SQL;
CREATE TABLE HAVE AS
SELECT MONOTONIC() AS ROWID, * FROM HAVE;
QUIT;

%let fromDate = 01Jan2000;
%let toDate = 31DEC2016;
data alldates;
  length key 8; 
  do date = "&fromDate"d to "&toDate"d; 
    key + 1; 
    RSA_WorkdayInd = ( 2 <= weekday(date) <= 6);
    WeekendInd = (not RSA_WorkdayInd);
    Calendar_Week_Number = week(date, "V");
    DayOfWeek = put(date, downame3.);
    Fin_Year = year(intnx("year.7", date, 0));
    output; 
  end; 

  format 
    date date9.
  ;
run; 

PROC SQL;
CREATE TABLE OVERLAP AS
SELECT HAVE.*, HAVE2.ROWID AS ROWID2, HAVE2.ID AS ID2, HAVE2.START_DT AS START2, HAVE2.DAYS_SUPP AS DAYS2, HAVE2.END_DT AS END2 
, (SELECT COUNT(*) FROM alldates WHERE DATE >= HAVE.START_DT AND DATE >= HAVE2.START_DT AND DATE <= HAVE.END_DT AND DATE <= HAVE2.END_DT) AS OVERLAP
FROM HAVE, HAVE AS HAVE2
WHERE HAVE.ROWID NE HAVE2.ROWID
AND HAVE.ID = HAVE2.ID
AND (HAVE.START_DT <= HAVE2.END_DT) and (HAVE2.START_DT <= HAVE.END_DT);
QUIT;

View solution in original post


All Replies
Super User
Posts: 19,770

Re: Concomitant medication use

600K rows should not be causing issues. 

Can you identify which step causes the issues?

Contributor m_o
Contributor
Posts: 30

Re: Concomitant medication use

The first step. Thank you.
Respected Advisor
Posts: 4,919

Re: Concomitant medication use

The first step can be slightly optimized as:

 

data days(sortedby=id);
length drug $1;
set have;
do day = start_dt to end_dt;
    output;
    end;
format day yymmdd10.;
keep id drug day;
run;

The sortedby= dataset option might improve the sort performance of the following step.

Droping the useless variable dum saves space. (It was a remnant of a previous version of the code)

Defining the length of drug as 1 character will also save space.

 

See if you can get the first step and sort to work. If so, we will try to optimize the rest.

PG
Contributor m_o
Contributor
Posts: 30

Re: Concomitant medication use

Thank you very much for your response. I tried as you suggested but my SAS stopped working after processing about a billion rows. I might go the proc sql route instead. Thanks again!
Super User
Posts: 5,497

Re: Concomitant medication use

Regarding your link ...

 

The first step is a good idea at least in theory.  But it does expand the data set to perhaps 40M rows.  Is that where the process breaks down?

 

The second step is also a good idea (sorting the 40M rows).  Again, is this the source of the problem?

 

I'm hoping the problems lie in the DATA step that follows sorting.  Questions for that:

 

  • Are the drug names really A, B, C, D, E?  (If not, how many characters are required for each name?)
  • Do you know the maximum number of drugs per patient?  (It might be 5, but I just wanted to check.)
  • If there is a combination of AB for 10 days, then CD for the next 20 days, does that still count as a block of 30 days with 2+ drugs?  (If not, a little more explanation would be in order.)

 

 

Contributor m_o
Contributor
Posts: 30

Re: Concomitant medication use

[ Edited ]
Posted in reply to Astounding

Thank you for your reply. Yes, the process breaks down at the first step due to the large number of rows it tries to generate. Regarding your second point, if a patient has been on combination AB for 10 days and combination CD for 20 days, then I would disregard those combinations because in both cases, the patient has been on a unique drug combination for <30 days. However, if a patient has been on AB 10 days but ABC for >=30 days, then I will count ABC.

I am not sure about the maximum number of overlaps. Probably about 5-6. I will shorten the names of the drugs to single letters or single digits in place of the full drug name. Thank you very much!

Super User
Posts: 19,770

Re: Concomitant medication use

I still don't understand how it would crash. 40 million rows shouldn't be too much, it's annoying and large - takes about <20 mins to process analytically our sites on a standard desktop. If its failing at the sort, you could change the sortsize option to help with that issue.

Frequent Contributor
Posts: 93

Re: Concomitant medication use

For your consideration, proc sql code that displays number of days overlap among single-subject drug periods that overlap:

 

data have;  
input ID DRUG $ START_DT MMDDYY10. DAYS_SUPP  END_DT MMDDYY10. ;
FORMAT START_DT END_DT MMDDYY10.;
cards; 
1	A	2/17/10	30	3/19/10
1	B	5/6/09	30	6/5/09
1	C	7/9/11	60	9/7/11
1	E	3/1/10	90	5/30/10
1	B	1/1/09	90	4/1/09
1	D	2/1/09	30	3/3/09
1	C	5/6/12	90	8/4/12
2	B	4/1/12	60	5/31/12
2	A	7/1/10	30	7/31/10
2	C	8/3/10	90	11/1/10
2	D	11/1/13	90	1/30/14
2	E	12/5/13	90	3/5/14
2	A	2/1/11	90	5/2/11
;
RUN;

PROC SQL;
CREATE TABLE HAVE AS
SELECT MONOTONIC() AS ROWID, * FROM HAVE;
QUIT;

PROC SQL;
CREATE TABLE OVERLAP AS
SELECT HAVE.*, HAVE2.ROWID AS ROWID2, HAVE2.ID AS ID2, HAVE2.START_DT AS START2, HAVE2.DAYS_SUPP AS DAYS2, HAVE2.END_DT AS END2 
, CASE WHEN HAVE.START_DT > HAVE2.START_DT THEN HAVE2.END_DT - HAVE.START_DT  ELSE  HAVE.END_DT -  HAVE2.START_DT END AS OVERLAP
FROM HAVE, HAVE AS HAVE2
WHERE HAVE.ROWID NE HAVE2.ROWID
AND HAVE.ID = HAVE2.ID
AND (HAVE.START_DT <= HAVE2.END_DT) and (HAVE2.START_DT <= HAVE.END_DT);
QUIT;
Contributor m_o
Contributor
Posts: 30

Re: Concomitant medication use

[ Edited ]
Posted in reply to thomp7050

Thank you for your response. This is very helpful. I noticed an error in your output in that, in rows 3 and 4 of the output, the actual overlap duration between the 2 drugs are between 2/1/2009 and 3/3/2009, which is 30 days, not 59 days. I was wondering how I could use CASE WHEN to take care of the other two scenarios where duration of drugA can be completely within duration of drugB and vice versa.

Solution
‎05-01-2017 11:01 AM
Frequent Contributor
Posts: 93

Re: Concomitant medication use

This is not exactly elegant but it works:

 

data have;  
input ID DRUG $ START_DT MMDDYY10. DAYS_SUPP  END_DT MMDDYY10. ;
FORMAT START_DT END_DT MMDDYY10.;
cards; 
1	A	2/17/10	30	3/19/10
1	B	5/6/09	30	6/5/09
1	C	7/9/11	60	9/7/11
1	E	3/1/10	90	5/30/10
1	B	1/1/09	90	4/1/09
1	D	2/1/09	30	3/3/09
1	C	5/6/12	90	8/4/12
2	B	4/1/12	60	5/31/12
2	A	7/1/10	30	7/31/10
2	C	8/3/10	90	11/1/10
2	D	11/1/13	90	1/30/14
2	E	12/5/13	90	3/5/14
2	A	2/1/11	90	5/2/11
;
RUN;

PROC SQL;
CREATE TABLE HAVE AS
SELECT MONOTONIC() AS ROWID, * FROM HAVE;
QUIT;

%let fromDate = 01Jan2000;
%let toDate = 31DEC2016;
data alldates;
  length key 8; 
  do date = "&fromDate"d to "&toDate"d; 
    key + 1; 
    RSA_WorkdayInd = ( 2 <= weekday(date) <= 6);
    WeekendInd = (not RSA_WorkdayInd);
    Calendar_Week_Number = week(date, "V");
    DayOfWeek = put(date, downame3.);
    Fin_Year = year(intnx("year.7", date, 0));
    output; 
  end; 

  format 
    date date9.
  ;
run; 

PROC SQL;
CREATE TABLE OVERLAP AS
SELECT HAVE.*, HAVE2.ROWID AS ROWID2, HAVE2.ID AS ID2, HAVE2.START_DT AS START2, HAVE2.DAYS_SUPP AS DAYS2, HAVE2.END_DT AS END2 
, (SELECT COUNT(*) FROM alldates WHERE DATE >= HAVE.START_DT AND DATE >= HAVE2.START_DT AND DATE <= HAVE.END_DT AND DATE <= HAVE2.END_DT) AS OVERLAP
FROM HAVE, HAVE AS HAVE2
WHERE HAVE.ROWID NE HAVE2.ROWID
AND HAVE.ID = HAVE2.ID
AND (HAVE.START_DT <= HAVE2.END_DT) and (HAVE2.START_DT <= HAVE.END_DT);
QUIT;
Contributor m_o
Contributor
Posts: 30

Re: Concomitant medication use

Posted in reply to thomp7050
Thank you! This works! I will try to modify the sql code to also identify three or more overlaps, but this is wonderful. Thank you very much!! Smiley Happy
Contributor m_o
Contributor
Posts: 30

Re: Concomitant medication use

[ Edited ]
Posted in reply to thomp7050

Thank you very much for your help once again. I was wondering if there is a way to modify the sql code so that it outputs 2 additional columns containing the 2 dates that were used to compute that difference/overlapping days. I can do it for two combinations in a simple separate data step, but I found that it gets quite hairy when considering 3-5 combinations. Thank you very much!

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 12 replies
  • 287 views
  • 2 likes
  • 5 in conversation