I'm not sure if what I'm trying to do is possible but, any help would be greatly appreciated.
I have several rows of data. What I would like to do is calculate the number of days between each date by ID
and create the following flags:
le 180 days = 6 months
181 days to 360 days = 12 months
361 days to 540 days = 18 months
541 days to 720 days = 24 months
ge 721 days = gt two years
For example, for ID 2 12/13/2016 - 03/14/2017 = 91 days, 03/14/2017-07/11/2017 = 120 so 120+90=210 therefore I should get
a flag then continue on. The sample data is below.
data mydata;
input ID DATE:mmddyy10. ;
format DATE mmddyy10.;
datalines;
1 07/31/2017
1 08/14/2017
1 08/15/2017
1 08/21/2017
1 09/08/2017
1 10/27/2017
1 01/03/2019
1 02/28/2019
1 03/01/2019
1 03/18/2019
2 12/13/2016
2 03/14/2017
2 07/11/2017
2 07/12/2017
2 08/22/2017
3 12/15/2017
4 10/25/2016
5 11/30/2016
5 11/30/2016
5 12/12/2016
5 12/21/2016
5 02/17/2017
5 03/24/2017
6 12/22/2020
7 03/26/2019
7 04/08/2019
8 01/27/2021
8 02/04/2021
9 07/12/2016
10 05/04/2018
10 05/25/2018
10 09/24/2019
You can get the intervals this way:
data want; set mydata; by id; retain firstdate; if first.id then firstdate=date; interval = date-firstdate; run;
Personally I would use a custom format to assign a "flag" value based on the interval. That way when you decide to consider a different set of "flags" then you can use a different format.
The format would look something like:
proc format; value flag 0- 180 = ' 6 months' 181 - 360 = '12 months' 361 - 540 = '18 months' 541 - 720 = '24 months' 721 - high = 'GT two years' ; run;
Assign the format to the interval as needed.
Groups created by formats are honored by reporting procedures like Report and Tabulate, most analysis procedures and most places in graphs.
You can get the intervals this way:
data want; set mydata; by id; retain firstdate; if first.id then firstdate=date; interval = date-firstdate; run;
Personally I would use a custom format to assign a "flag" value based on the interval. That way when you decide to consider a different set of "flags" then you can use a different format.
The format would look something like:
proc format; value flag 0- 180 = ' 6 months' 181 - 360 = '12 months' 361 - 540 = '18 months' 541 - 720 = '24 months' 721 - high = 'GT two years' ; run;
Assign the format to the interval as needed.
Groups created by formats are honored by reporting procedures like Report and Tabulate, most analysis procedures and most places in graphs.
More to Rezza's point:
This is a great example to use INTCK() with the CONTINUOUS option. INTCK() should have the advantage that there will not be "discontinuities" in counting days/months based on leap years or for which month a count starts in.
proc format;
value flag
0 = ' 6 months'
1 = '12 months'
2 = '18 months'
3 = '24 months'
4 = 'GT two years'
;
run;
data want;
if 0 then set have(keep=ID);
retain firstdate;
format firstdate mmddyy10.;
do until (last.ID);
set have;
by ID;
interval_months = intck('month6',firstdate,date,"continuous");
format interval_months flag.;
label interval_months="number of 6-month periods";
if first.ID
then firstdate=date;
else output;
end;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.