Solved: Flagging if condition is met

abhi309 · Posted 04-14-2023 11:11 AM

Hello All,

First of all I want to thank this forum and its members for helping out people like me who are new to SAS. So here I am again asking for help as I am stuck.

I have a claims dataset where I want to flag patients if they are taking 5 or more medications at any point of time.

I have:

PTID DRUG_CLASS DATE_FILLED DAYS_SUPPLY

1 A 14JUN2018 30

1 B 14JUN2018 45

1 C 30JUN2018 30

1 D 30JUN2018 30

1 E 30AUG2018 10

1 B 30AUG2018 30

1 C 07SEP2018 30

2 B 02JUN2018 45

2 F 02JUN2018 30

2 H 10JUN2018 15

2 C 10JUN2018 30

2 A 25JUN2018 15

2 H 25JUN2018 15

2 B 25JUN2018 30

2 A 10JUL2018 7

2 G 10JUL2018 30

2 H 10JUL2018 30

2 B 10JUL2018 30

2 H 10JUL2018 15

3 J 04JUN2018 10

3 B 04JUN2018 30

3 C 04JUN2018 30

3 J 10JUN2018 30

3 B 10JUN2018 10

3 C 10JUN2018 10

3 R 13JUL2018 15

3 K 13JUL2018 7

4 T 09JUL2018 90

4 K 09JUL2018 30

4 N 09JUL2018 30

5 B 22JUN2018 15

5 H 09JUL2018 30

5 J 09JUL2018 30

5 A 09JUL2018 45

5 B 22JUL2018 15

5 C 22JUL2018 30

5 H 08AUG2018 30

5 B 08AUG2018 15

5 N 10AUG2018 7

5 H 10SEP2018 30

5 B 10SEP2018 30

Also if any patient has filled the second prescription of the same class before his pills in hand are over we have to consider that fact as well. Like in this dataset patient id 3 filled drugs J, B and C for the second time before his in hands meds are over. So in JUNE 2018 he filled the prescription twice and therefore his J, B and C class of drugs will be over by 14th July 2018 and since he filled another two class of drugs on 13th of July 2018 therefore he should get flagged.

AhmedAl_Attar · Posted 04-15-2023 01:10 AM

Hi @abhi309 ,

This is how I approached this problem

DATA have;
	LENGTH PTID 3
		DRUG_CLASS $1
		DATE_FILLED 8
		DAYS_SUPPLY 3;
		
	FORMAT DATE_FILLED date9.;
	INPUT PTID :1. DRUG_CLASS :$1. DATE_FILLED :date9. DAYS_SUPPLY ;
datalines;	
1 A 14JUN2018 30
1 B 14JUN2018 45
1 C 30JUN2018 30
1 D 30JUN2018 30
1 E 30AUG2018 10
1 B 30AUG2018 30
1 C 07SEP2018 30
2 B 02JUN2018 45
2 F 02JUN2018 30
2 H 10JUN2018 15
2 C 10JUN2018 30
2 A 25JUN2018 15
2 H 25JUN2018 15
2 B 25JUN2018 30
2 A 10JUL2018 7
2 G 10JUL2018 30
2 H 10JUL2018 30
2 B 10JUL2018 30
2 H 10JUL2018 15
3 J 04JUN2018 10
3 B 04JUN2018 30
3 C 04JUN2018 30
3 J 10JUN2018 30
3 B 10JUN2018 10
3 C 10JUN2018 10
3 R 13JUL2018 15
3 K 13JUL2018 7
4 T 09JUL2018 90
4 K 09JUL2018 30
4 N 09JUL2018 30
5 B 22JUN2018 15
5 H 09JUL2018 30
5 J 09JUL2018 30
5 A 09JUL2018 45
5 B 22JUL2018 15
5 C 22JUL2018 30
5 H 08AUG2018 30
5 B 08AUG2018 15
5 N 10AUG2018 7
5 H 10SEP2018 30
5 B 10SEP2018 30
;
run;

PROC SORT Data=work.have;
	BY PTID DRUG_CLASS DATE_FILLED;
run;

proc sql noprint;
	select max (cnt)
	into :g_maxCnt trimmed
	from
		(select count(*) as cnt
		from work.have
		group by PTID,DRUG_CLASS)
	;
quit;

DATA want(KEEP=PTID DRUG_CLASS dt: sp: df: flag);

	if (0) then SET work.have;
	ARRAY dates {&g_maxCnt} 8 dt1 - dt&g_maxCnt;
	ARRAY supplies {&g_maxCnt} 3 sp1 - sp&g_maxCnt;
	ARRAY diffs {%eval(&g_maxCnt-1)} 4 df1 - df%eval(&g_maxCnt-1);
	LENGTH flag 3;
	
	FORMAT dt: date9.;

	flag=0;
	do _n_=1 by 1 until (last.drug_class);
		SET work.have;
		BY PTID DRUG_CLASS DATE_FILLED;
		dates[_n_] = DATE_FILLED;
		supplies[_n_] = DAYS_SUPPLY;
		if(_n_ GT 1) then
		do;
			diffs[(_n_-1)] = dates[_n_] - dates[(_n_-1)];
			if (flag NE 1) then
				flag = ifn (0 < diffs[(_n_-1)] < supplies[(_n_-1)],1,0) ; 
		end;
	end;
	OUTPUT;
RUN;

PROC SORT Data=work.want;
	BY PTID DRUG_CLASS dt1;
run;

proc sql;
	select ptid, sum(flag)
	from want
	where flag = 1
	group by ptid;
quit;

Hope this helps

View solution in original post

ballardw · Posted 04-14-2023 11:23 AM

How many records are you dealing with? One of the easiest to code and follow is to expand each prescription to one record per day based on the fill date and the duration then count by day the number of prescriptions. But if the data set is already large it will get pretty big as each of those is going to average multiplying the data set by about 30 times.

Also, I am not sure that I follow exactly how you are handling the overlap of the same med. Several of my medications are refilled 15 to 20 days before the prior fill is exhausted to make sure that I don't get a lapse in medication. So an "overlap" of those is actually more of an extension of the previous fill, at least in my sight.

abhi309 · Posted 04-14-2023 11:28 AM

Thank you @ballardw for your quick reply. The dataset is very big, so if I expand the dataset it will be very big.

You are right about the overlap. It does mean extension of the previous fill.

abhi309 · Posted 04-14-2023 07:05 PM

@ballardw I want to try and expand the dataset as you suggested. Can you please give me an idea how to do that? Thank you

SASKiwi · Posted 04-14-2023 08:50 PM

Something like this will expand your data. I assumed that the Date_Filled will also be the first day of supply, tracked using Date_Now.

data want;
  set have;
  by PTID DRUG_CLASS;
  format Date_Now date9.;
  if first.DRUG_CLASS then do day = 1 to DAYS_SUPPLY;
    Date_Now = Date_Filled + day -1;
    output;
  end;
run;

abhi309 · Posted 04-19-2023 09:45 PM

Thank you @SASKiwi. I expanded the dataset and transposed it. How can I flag the patients who are taking 5 or more med class for 180 days or more.

SASKiwi · Posted 04-19-2023 10:22 PM

I wouldn't transpose it. I would do something like this to get the drug classes per day by patient:

proc sql;
create table want2 as
  select  PTID
         ,Date_Now
         ,count(distinct DRUG_CLASS) as DRUG_CLASS_Count 
from want
group by  PTID
         ,Date_Now
;
quit;

AhmedAl_Attar · Posted 04-15-2023 01:10 AM

Hi @abhi309 ,

This is how I approached this problem

DATA have;
	LENGTH PTID 3
		DRUG_CLASS $1
		DATE_FILLED 8
		DAYS_SUPPLY 3;
		
	FORMAT DATE_FILLED date9.;
	INPUT PTID :1. DRUG_CLASS :$1. DATE_FILLED :date9. DAYS_SUPPLY ;
datalines;	
1 A 14JUN2018 30
1 B 14JUN2018 45
1 C 30JUN2018 30
1 D 30JUN2018 30
1 E 30AUG2018 10
1 B 30AUG2018 30
1 C 07SEP2018 30
2 B 02JUN2018 45
2 F 02JUN2018 30
2 H 10JUN2018 15
2 C 10JUN2018 30
2 A 25JUN2018 15
2 H 25JUN2018 15
2 B 25JUN2018 30
2 A 10JUL2018 7
2 G 10JUL2018 30
2 H 10JUL2018 30
2 B 10JUL2018 30
2 H 10JUL2018 15
3 J 04JUN2018 10
3 B 04JUN2018 30
3 C 04JUN2018 30
3 J 10JUN2018 30
3 B 10JUN2018 10
3 C 10JUN2018 10
3 R 13JUL2018 15
3 K 13JUL2018 7
4 T 09JUL2018 90
4 K 09JUL2018 30
4 N 09JUL2018 30
5 B 22JUN2018 15
5 H 09JUL2018 30
5 J 09JUL2018 30
5 A 09JUL2018 45
5 B 22JUL2018 15
5 C 22JUL2018 30
5 H 08AUG2018 30
5 B 08AUG2018 15
5 N 10AUG2018 7
5 H 10SEP2018 30
5 B 10SEP2018 30
;
run;

PROC SORT Data=work.have;
	BY PTID DRUG_CLASS DATE_FILLED;
run;

proc sql noprint;
	select max (cnt)
	into :g_maxCnt trimmed
	from
		(select count(*) as cnt
		from work.have
		group by PTID,DRUG_CLASS)
	;
quit;

DATA want(KEEP=PTID DRUG_CLASS dt: sp: df: flag);

	if (0) then SET work.have;
	ARRAY dates {&g_maxCnt} 8 dt1 - dt&g_maxCnt;
	ARRAY supplies {&g_maxCnt} 3 sp1 - sp&g_maxCnt;
	ARRAY diffs {%eval(&g_maxCnt-1)} 4 df1 - df%eval(&g_maxCnt-1);
	LENGTH flag 3;
	
	FORMAT dt: date9.;

	flag=0;
	do _n_=1 by 1 until (last.drug_class);
		SET work.have;
		BY PTID DRUG_CLASS DATE_FILLED;
		dates[_n_] = DATE_FILLED;
		supplies[_n_] = DAYS_SUPPLY;
		if(_n_ GT 1) then
		do;
			diffs[(_n_-1)] = dates[_n_] - dates[(_n_-1)];
			if (flag NE 1) then
				flag = ifn (0 < diffs[(_n_-1)] < supplies[(_n_-1)],1,0) ; 
		end;
	end;
	OUTPUT;
RUN;

PROC SORT Data=work.want;
	BY PTID DRUG_CLASS dt1;
run;

proc sql;
	select ptid, sum(flag)
	from want
	where flag = 1
	group by ptid;
quit;

Hope this helps

abhi309 · Posted 04-19-2023 09:46 PM

Thank you @AhmedAl_Attar for your reply. I tried what you suggested but I didn't the desired result.

ballardw · Posted 04-19-2023 10:11 PM

@abhi309 wrote:

Thank you @AhmedAl_Attar for your reply. I tried what you suggested but I didn't the desired result.

Then show the desired result for the given example data.

If you can't determine that, then maybe use a smaller example data set that you can show the result.

abhi309 · Posted 04-19-2023 11:02 PM

@ballardw I would have posted the results but the data I wanted was too complicated to post. Although I have made slight change in the method section of analysis, now what I want with the same data is, to flag the patient if he is taking 5 or more med class for 90 days or more.

So the data I got so far is;

PTID DRUG_CLASS TOTAL_DAYSUPPLY

1 A 30

1 B 75

1 C 60

1 D 30

1 E 10

2 A 22

2 B 105

2 C 30

2 F 30

2 G 30

2 H 75

3 B 40

3 C 40

3 J 40

3 K 7

3 R 15

4 K 30

4 N 30

4 T 90

5 A 45

5 B 75

5 C 30

5 H 90

5 J 30

5 N 7

6 A 45

6 B 90

6 C 90

6 D 120

6 E 90

6 F 200

I have added one more patient (PT ID: 6). I have calculated total number of supply days for each medication class. Now I want to flag the patients who are taking 5 or more drugs for 90 or more days. Like in the above data only patient 6 will be flagged since he has filled B, C and E for 90 days and drug D and F for 120 and 200 days respectively. Other patients don't get flagged.

Hope this clarifies what I want.

Thank you again everyone

ballardw · Posted 04-20-2023 12:27 AM

With that data, if I understand what you want which is questionable, this may help.

data have;
input PTID $ DRUG_CLASS $ TOTAL_DAYSUPPLY;
datalines;
1 A 30
1 B 75
1 C 60
1 D 30
1 E 10
2 A 22
2 B 105
2 C 30
2 F 30
2 G 30 
2 H 75 
3 B 40 
3 C 40
3 J 40
3 K 7 
3 R 15 
4 K 30 
4 N 30
4 T 90 
5 A 45 
5 B 75
5 C 30
5 H 90 
5 J 30
5 N 7
6 A 45
6 B 90
6 C 90
6 D 120
6 E 90
6 F 200
;

proc summary data=have nway;
  class ptid  DRUG_CLASS;
  var  TOTAL_DAYSUPPLY;
  output out=summed (drop=_: where=(TOTAL_DAYSUPPLY ge 90)) sum=;
run;

proc sql;
   create table want as
   select distinct ptid
   from  summed 
   group by ptid
   having count(DRUG_CLASS) ge 5
   ;
quit;

You probably should look at the Summed set created by proc summary. It is getting the total of the days supply for each ptid and drug_class. The Where is a data set option on the out= bit that selects only the drugs with 90+ days total.

From that point there are several ways to see if the count of Drug_class is 5 or greater.

If you can't create a data step please at least paste the data as text into a text box created by using the </> icon and pasting text.

abhi309 · Posted 04-20-2023 10:33 AM

Thank you @ballardw this works.

Sorry I accepted the other reply as solution by mistake. I don't know how to change that.

Thanks again

AhmedAl_Attar · Posted 04-20-2023 05:51 AM

Hi @abhi309

This will give you what you are looking for

data have;
input PTID $ DRUG_CLASS $ TOTAL_DAYSUPPLY;
datalines;
1 A 30
1 B 75
1 C 60
1 D 30
1 E 10
2 A 22
2 B 105
2 C 30
2 F 30
2 G 30 
2 H 75 
3 B 40 
3 C 40
3 J 40
3 K 7 
3 R 15 
4 K 30 
4 N 30
4 T 90 
5 A 45 
5 B 75
5 C 30
5 H 90 
5 J 30
5 N 7
6 A 45
6 B 90
6 C 90
6 D 120
6 E 90
6 F 200
;
run;

proc sql;
	create table want as
	select agg.ptid
		, agg.drug_cnt
		, agg._90Plus_cnt
	from
         /* in-line aggregation */
		(select ptid
		, count(*) as drug_cnt
		, sum(CASE WHEN TOTAL_DAYSUPPLY >= 90 then 1 else 0 end) as _90Plus_cnt
		from have
		group by ptid) agg
    /* Desired filtering condition */
	where agg.drug_cnt GE 5 
    and agg._90Plus_cnt GE 5
	order by ptid;
quit;

abhi309 · Posted 04-20-2023 10:51 AM

Thank you @AhmedAl_Attar this works. I really appreciate your time and help.

Flagging if condition is met

Re: Flagging if condition is met

Re: Flagging if condition is met

Re: Flagging if condition is met

Re: Flagging if condition is met

Re: Flagging if condition is met

Re: Flagging if condition is met

Re: Flagging if condition is met

Re: Flagging if condition is met

Re: Flagging if condition is met

Re: Flagging if condition is met

Re: Flagging if condition is met

Re: Flagging if condition is met

Re: Flagging if condition is met

Re: Flagging if condition is met

Re: Flagging if condition is met

SAS Innovate 2025: Call for Content

Classroom Training Available!