Hi,
I have following data. need to identify the date when sum of daily med reached 200 or above.
ID | Fill_DT | Days_Supply | END_DT | MED |
1 | 1/12/2017 | 30 | 2/10/2017 | 80 |
1 | 1/15/2017 | 30 | 2/13/2017 | 120 |
2 | 1/1/2017 | 15 | 1/15/2017 | 100 |
2 | 1/13/2017 | 20 | 2/1/2017 | 80 |
3 | 1/1/2017 | 30 | 1/30/2017 | 60 |
3 | 1/15/2017 | 30 | 2/13/2017 | 60 |
3 | 1/25/2017 | 30 | 2/23/2017 | 100 |
i need data as follows
ID | Date_reached_200 | MED | ||
1 | 1/15/2017 | 200 | ||
3 | 1/25/2017 | 220 |
Thanks
You might need to make sure you output just one observation per ID. Along those lines:
data want;
set have;
by ID;
if first.id then do;
tot_med=0;
flag=0;
end;
tot_med + med;
if tot_med >= 200 then flag + 1;
if flag=1 then output;
keep id fill_dt tot_med;
rename fill_dt = Date_reached_200;
run;
That gives you the first date when the ID hit 200 total.
data have;
input ID Fill_DT : mmddyy10. Days_Supply END_DT : mmddyy10. MED;
datalines;
1 1/12/2017 30 2/10/2017 80
1 1/15/2017 30 2/13/2017 120
2 1/1/2017 15 1/15/2017 100
2 1/13/2017 20 2/1/2017 80
3 1/1/2017 30 1/30/2017 60
3 1/15/2017 30 2/13/2017 60
3 1/25/2017 30 2/23/2017 100
;
data want;
set have;
by id;
if first.id then sum=0;
sum+MED;
if sum>=200 then output;
format Fill_DT END_DT mmddyy10.;
run;
You might need to make sure you output just one observation per ID. Along those lines:
data want;
set have;
by ID;
if first.id then do;
tot_med=0;
flag=0;
end;
tot_med + med;
if tot_med >= 200 then flag + 1;
if flag=1 then output;
keep id fill_dt tot_med;
rename fill_dt = Date_reached_200;
run;
That gives you the first date when the ID hit 200 total.
We can only test with the sample you give us.
Here is my log and output
5361 data _null_;
5362 if _n_=1 then do;
5363 if 0 then set have;
5364 declare hash h(dataset:'have(obs=0)',multidata:'y',ordered:'y');
5365 rc = h.defineKey('id');
5366 rc = h.defineData(all:'y');
5367 rc = h.defineDone();
5368 end;
5369 sum=0;
5370 do until(last.id);
5371 set have;
5372 by id;
5373 sum+med;
5374 if h.check() ne 0 and sum>=200 then h.add();
5375 end;
5376 h.output(dataset:'want');
5377 format Fill_DT END_DT mmddyy10.;
5378 run;
NOTE: There were 0 observations read from the data set WORK.HAVE.
NOTE: The data set WORK.WANT has 1 observations and 5 variables.
NOTE: The data set WORK.WANT has 1 observations and 5 variables.
NOTE: The data set WORK.WANT has 2 observations and 5 variables.
NOTE: There were 7 observations read from the data set WORK.HAVE.
NOTE: DATA statement used (Total process time):
real time 0.02 seconds
cpu time 0.01 seconds
ID | Fill_DT | Days_Supply | END_DT | MED |
1 | 1/15/2017 | 30 | 2/13/2017 | 120 |
3 | 1/25/2017 | 30 | 2/23/2017 | 100 |
If I understand the question, the programming would be fairly similar:
data want;
set have;
by ID;
if first.id then do;
tot_med=0;
flag=0;
end;
if flag=0;
start_dt = end_dt - days_supply + 1;
do Date_reached_200 = start_dt to End_dt;
tot_med + med;
if tot_med >= 200 then flag + 1;
if flag=1 then output;
end;
keep id Date_reached_200 tot_med;
run;
The reason for computing Start_Dt is that the date ranges might overlap. In your sample data, for example, you can't use the fill date as the starting point. Usually a person refills the prescription a few days before running out. If you are not careful, you will count the person as taking both the original plus the refill on the same date. Here, I'm assuming the patient uses up the first prescription before starting on the second prescription.
The end_date shown here is also a computed from fill_dt. In general i only have fill_dt. So,l am always assuming start as fill_date and end_date as fill_dt+days_supply-1.
Revised one last time. You really need to explain what is in your data instead of adding bits and pieces along the way.
data want;
set have;
by ID;
if first.id then do;
tot_med=0;
flag=0;
Date_reached_200 = Fill_dt - 1;
end;
if flag=0;
do day = 1 to days_supply;
tot_med + med;
Date_reached_200 + 1;
if tot_med >= 200 then flag + 1;
if flag=1 then do;
output;
delete;
end;
end;
keep id Date_reached_200 tot_med;
run;
sorry for bits and pieces.
i am trying to look if at any point of time tot_MED for that day has reached above 200 or not with in reporting period (7/1/17 - 12/31/17). if the fill date is before reporting period and days supply extending into reporting period i am considering bg_date as reporting period begin date instead of actual fill_Dt and same for that extending out side reporting period.
ID | Days_supply | MED | Fill_DT | end_Date | bg_Date |
1 | 30 | 22.5 | 6/30/2017 | 7/29/2017 | 7/1/2017 |
1 | 30 | 180 | 6/30/2017 | 7/29/2017 | 7/1/2017 |
1 | 30 | 22.5 | 7/31/2017 | 8/29/2017 | 7/31/2017 |
1 | 30 | 22.5 | 9/1/2017 | 9/30/2017 | 9/1/2017 |
1 | 30 | 22.5 | 9/29/2017 | 10/28/2017 | 9/29/2017 |
1 | 30 | 180 | 9/29/2017 | 10/28/2017 | 9/29/2017 |
2 | 30 | 37.5 | 6/30/2017 | 7/29/2017 | 7/1/2017 |
2 | 30 | 37.5 | 7/31/2017 | 8/29/2017 | 7/31/2017 |
2 | 30 | 37.5 | 8/31/2017 | 9/29/2017 | 8/31/2017 |
2 | 30 | 37.5 | 10/2/2017 | 10/31/2017 | 10/2/2017 |
2 | 30 | 37.5 | 10/31/2017 | 11/29/2017 | 10/31/2017 |
2 | 30 | 37.5 | 11/29/2017 | 12/28/2017 | 11/29/2017 |
2 | 30 | 37.5 | 12/27/2017 | 12/31/2017 | 12/27/2017 |
3 | 3 | 18 | 7/18/2017 | 7/20/2017 | 7/18/2017 |
3 | 2 | 50 | 8/12/2017 | 8/13/2017 | 8/12/2017 |
4 | 5 | 15 | 10/4/2017 | 10/8/2017 | 10/4/2017 |
5 | 90 | 7.5 | 6/21/2017 | 9/18/2017 | 7/1/2017 |
5 | 90 | 15 | 9/22/2017 | 12/20/2017 | 9/22/2017 |
6 | 30 | 60 | 6/13/2017 | 7/12/2017 | 7/1/2017 |
6 | 30 | 60 | 7/14/2017 | 8/12/2017 | 7/14/2017 |
6 | 30 | 60 | 8/10/2017 | 9/8/2017 | 8/10/2017 |
6 | 30 | 60 | 9/11/2017 | 10/10/2017 | 9/11/2017 |
7 | 30 | 160 | 6/6/2017 | 7/5/2017 | 7/1/2017 |
7 | 30 | 160 | 7/6/2017 | 8/4/2017 | 7/6/2017 |
7 | 30 | 160 | 8/4/2017 | 9/2/2017 | 8/4/2017 |
7 | 30 | 45 | 9/2/2017 | 10/1/2017 | 9/2/2017 |
7 | 30 | 45 | 10/1/2017 | 10/30/2017 | 10/1/2017 |
7 | 30 | 45 | 10/30/2017 | 11/28/2017 | 10/30/2017 |
7 | 30 | 45 | 11/28/2017 | 12/27/2017 | 11/28/2017 |
data _null_;
if _n_=1 then do;
if 0 then set have;
declare hash h(dataset:'have(obs=0)',multidata:'y',ordered:'y');
rc = h.defineKey('id');
rc = h.defineData(all:'y');
rc = h.defineDone();
end;
sum=0;
do until(last.id);
set have;
by id;
sum+med;
if h.check() ne 0 and sum>=200 then h.add();
end;
h.output(dataset:'want');
format Fill_DT END_DT mmddyy10.;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.