DATA Step, Macro, Functions and more

Concomitant medication use

Accepted Solution Solved
Reply
Contributor m_o
Contributor
Posts: 20
Accepted Solution

Concomitant medication use

Hello,

I am trying to define concomitant use of drugs. 

I have a great solution to my problem with a SAS data step at the link below:

https://communities.sas.com/t5/Base-SAS-Programming/Concomitant-drug-medication-use/m-p/351710#M8188...

However, the only issue is that I have a very large dataset (~600K rows) and those codes require a ton of memory and my computer crashes. I wonder if anybody could suggest an alternative, perhaps a proc sql solution, that would not require as much memory and computational capacity. 

Thank you very much! Any help would be greatly appreciated!

 

Katie

 


Accepted Solutions
Solution
‎05-01-2017 11:01 AM
Frequent Contributor
Posts: 93

Re: Concomitant medication use

This is not exactly elegant but it works:

 

data have;  
input ID DRUG $ START_DT MMDDYY10. DAYS_SUPP  END_DT MMDDYY10. ;
FORMAT START_DT END_DT MMDDYY10.;
cards; 
1	A	2/17/10	30	3/19/10
1	B	5/6/09	30	6/5/09
1	C	7/9/11	60	9/7/11
1	E	3/1/10	90	5/30/10
1	B	1/1/09	90	4/1/09
1	D	2/1/09	30	3/3/09
1	C	5/6/12	90	8/4/12
2	B	4/1/12	60	5/31/12
2	A	7/1/10	30	7/31/10
2	C	8/3/10	90	11/1/10
2	D	11/1/13	90	1/30/14
2	E	12/5/13	90	3/5/14
2	A	2/1/11	90	5/2/11
;
RUN;

PROC SQL;
CREATE TABLE HAVE AS
SELECT MONOTONIC() AS ROWID, * FROM HAVE;
QUIT;

%let fromDate = 01Jan2000;
%let toDate = 31DEC2016;
data alldates;
  length key 8; 
  do date = "&fromDate"d to "&toDate"d; 
    key + 1; 
    RSA_WorkdayInd = ( 2 <= weekday(date) <= 6);
    WeekendInd = (not RSA_WorkdayInd);
    Calendar_Week_Number = week(date, "V");
    DayOfWeek = put(date, downame3.);
    Fin_Year = year(intnx("year.7", date, 0));
    output; 
  end; 

  format 
    date date9.
  ;
run; 

PROC SQL;
CREATE TABLE OVERLAP AS
SELECT HAVE.*, HAVE2.ROWID AS ROWID2, HAVE2.ID AS ID2, HAVE2.START_DT AS START2, HAVE2.DAYS_SUPP AS DAYS2, HAVE2.END_DT AS END2 
, (SELECT COUNT(*) FROM alldates WHERE DATE >= HAVE.START_DT AND DATE >= HAVE2.START_DT AND DATE <= HAVE.END_DT AND DATE <= HAVE2.END_DT) AS OVERLAP
FROM HAVE, HAVE AS HAVE2
WHERE HAVE.ROWID NE HAVE2.ROWID
AND HAVE.ID = HAVE2.ID
AND (HAVE.START_DT <= HAVE2.END_DT) and (HAVE2.START_DT <= HAVE.END_DT);
QUIT;

View solution in original post


All Replies
Super User
Posts: 17,963

Re: Concomitant medication use

600K rows should not be causing issues. 

Can you identify which step causes the issues?

Contributor m_o
Contributor
Posts: 20

Re: Concomitant medication use

The first step. Thank you.
Respected Advisor
Posts: 4,663

Re: Concomitant medication use

The first step can be slightly optimized as:

 

data days(sortedby=id);
length drug $1;
set have;
do day = start_dt to end_dt;
    output;
    end;
format day yymmdd10.;
keep id drug day;
run;

The sortedby= dataset option might improve the sort performance of the following step.

Droping the useless variable dum saves space. (It was a remnant of a previous version of the code)

Defining the length of drug as 1 character will also save space.

 

See if you can get the first step and sort to work. If so, we will try to optimize the rest.

PG
Contributor m_o
Contributor
Posts: 20

Re: Concomitant medication use

Thank you very much for your response. I tried as you suggested but my SAS stopped working after processing about a billion rows. I might go the proc sql route instead. Thanks again!
Super User
Posts: 5,099

Re: Concomitant medication use

Regarding your link ...

 

The first step is a good idea at least in theory.  But it does expand the data set to perhaps 40M rows.  Is that where the process breaks down?

 

The second step is also a good idea (sorting the 40M rows).  Again, is this the source of the problem?

 

I'm hoping the problems lie in the DATA step that follows sorting.  Questions for that:

 

  • Are the drug names really A, B, C, D, E?  (If not, how many characters are required for each name?)
  • Do you know the maximum number of drugs per patient?  (It might be 5, but I just wanted to check.)
  • If there is a combination of AB for 10 days, then CD for the next 20 days, does that still count as a block of 30 days with 2+ drugs?  (If not, a little more explanation would be in order.)

 

 

Contributor m_o
Contributor
Posts: 20

Re: Concomitant medication use

[ Edited ]

Thank you for your reply. Yes, the process breaks down at the first step due to the large number of rows it tries to generate. Regarding your second point, if a patient has been on combination AB for 10 days and combination CD for 20 days, then I would disregard those combinations because in both cases, the patient has been on a unique drug combination for <30 days. However, if a patient has been on AB 10 days but ABC for >=30 days, then I will count ABC.

I am not sure about the maximum number of overlaps. Probably about 5-6. I will shorten the names of the drugs to single letters or single digits in place of the full drug name. Thank you very much!

Super User
Posts: 17,963

Re: Concomitant medication use

I still don't understand how it would crash. 40 million rows shouldn't be too much, it's annoying and large - takes about <20 mins to process analytically our sites on a standard desktop. If its failing at the sort, you could change the sortsize option to help with that issue.

Frequent Contributor
Posts: 93

Re: Concomitant medication use

For your consideration, proc sql code that displays number of days overlap among single-subject drug periods that overlap:

 

data have;  
input ID DRUG $ START_DT MMDDYY10. DAYS_SUPP  END_DT MMDDYY10. ;
FORMAT START_DT END_DT MMDDYY10.;
cards; 
1	A	2/17/10	30	3/19/10
1	B	5/6/09	30	6/5/09
1	C	7/9/11	60	9/7/11
1	E	3/1/10	90	5/30/10
1	B	1/1/09	90	4/1/09
1	D	2/1/09	30	3/3/09
1	C	5/6/12	90	8/4/12
2	B	4/1/12	60	5/31/12
2	A	7/1/10	30	7/31/10
2	C	8/3/10	90	11/1/10
2	D	11/1/13	90	1/30/14
2	E	12/5/13	90	3/5/14
2	A	2/1/11	90	5/2/11
;
RUN;

PROC SQL;
CREATE TABLE HAVE AS
SELECT MONOTONIC() AS ROWID, * FROM HAVE;
QUIT;

PROC SQL;
CREATE TABLE OVERLAP AS
SELECT HAVE.*, HAVE2.ROWID AS ROWID2, HAVE2.ID AS ID2, HAVE2.START_DT AS START2, HAVE2.DAYS_SUPP AS DAYS2, HAVE2.END_DT AS END2 
, CASE WHEN HAVE.START_DT > HAVE2.START_DT THEN HAVE2.END_DT - HAVE.START_DT  ELSE  HAVE.END_DT -  HAVE2.START_DT END AS OVERLAP
FROM HAVE, HAVE AS HAVE2
WHERE HAVE.ROWID NE HAVE2.ROWID
AND HAVE.ID = HAVE2.ID
AND (HAVE.START_DT <= HAVE2.END_DT) and (HAVE2.START_DT <= HAVE.END_DT);
QUIT;
Contributor m_o
Contributor
Posts: 20

Re: Concomitant medication use

[ Edited ]

Thank you for your response. This is very helpful. I noticed an error in your output in that, in rows 3 and 4 of the output, the actual overlap duration between the 2 drugs are between 2/1/2009 and 3/3/2009, which is 30 days, not 59 days. I was wondering how I could use CASE WHEN to take care of the other two scenarios where duration of drugA can be completely within duration of drugB and vice versa.

Solution
‎05-01-2017 11:01 AM
Frequent Contributor
Posts: 93

Re: Concomitant medication use

This is not exactly elegant but it works:

 

data have;  
input ID DRUG $ START_DT MMDDYY10. DAYS_SUPP  END_DT MMDDYY10. ;
FORMAT START_DT END_DT MMDDYY10.;
cards; 
1	A	2/17/10	30	3/19/10
1	B	5/6/09	30	6/5/09
1	C	7/9/11	60	9/7/11
1	E	3/1/10	90	5/30/10
1	B	1/1/09	90	4/1/09
1	D	2/1/09	30	3/3/09
1	C	5/6/12	90	8/4/12
2	B	4/1/12	60	5/31/12
2	A	7/1/10	30	7/31/10
2	C	8/3/10	90	11/1/10
2	D	11/1/13	90	1/30/14
2	E	12/5/13	90	3/5/14
2	A	2/1/11	90	5/2/11
;
RUN;

PROC SQL;
CREATE TABLE HAVE AS
SELECT MONOTONIC() AS ROWID, * FROM HAVE;
QUIT;

%let fromDate = 01Jan2000;
%let toDate = 31DEC2016;
data alldates;
  length key 8; 
  do date = "&fromDate"d to "&toDate"d; 
    key + 1; 
    RSA_WorkdayInd = ( 2 <= weekday(date) <= 6);
    WeekendInd = (not RSA_WorkdayInd);
    Calendar_Week_Number = week(date, "V");
    DayOfWeek = put(date, downame3.);
    Fin_Year = year(intnx("year.7", date, 0));
    output; 
  end; 

  format 
    date date9.
  ;
run; 

PROC SQL;
CREATE TABLE OVERLAP AS
SELECT HAVE.*, HAVE2.ROWID AS ROWID2, HAVE2.ID AS ID2, HAVE2.START_DT AS START2, HAVE2.DAYS_SUPP AS DAYS2, HAVE2.END_DT AS END2 
, (SELECT COUNT(*) FROM alldates WHERE DATE >= HAVE.START_DT AND DATE >= HAVE2.START_DT AND DATE <= HAVE.END_DT AND DATE <= HAVE2.END_DT) AS OVERLAP
FROM HAVE, HAVE AS HAVE2
WHERE HAVE.ROWID NE HAVE2.ROWID
AND HAVE.ID = HAVE2.ID
AND (HAVE.START_DT <= HAVE2.END_DT) and (HAVE2.START_DT <= HAVE.END_DT);
QUIT;
Contributor m_o
Contributor
Posts: 20

Re: Concomitant medication use

Thank you! This works! I will try to modify the sql code to also identify three or more overlaps, but this is wonderful. Thank you very much!! Smiley Happy
Contributor m_o
Contributor
Posts: 20

Re: Concomitant medication use

[ Edited ]

Thank you very much for your help once again. I was wondering if there is a way to modify the sql code so that it outputs 2 additional columns containing the 2 dates that were used to compute that difference/overlapping days. I can do it for two combinations in a simple separate data step, but I found that it gets quite hairy when considering 3-5 combinations. Thank you very much!

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 12 replies
  • 254 views
  • 2 likes
  • 5 in conversation