I am trying to group dates by every ~5 years (basically creating dummy variables), but my code is not working correctly. Please help. Code is below:
Data practice; SET merge_temp6; IF NOT MISSING(DateOfInitDx1) THEN DO; IF DateOfInitDx1 <= '31DEC2005'D then date_up_to_2005 = 1; ELSE date_up_to_2005 = 0; IF DateOfInitDx1 >= '01JAN2006'D AND DateOfInitDx1 <= '31DEC2010'D then date_up_to_2010 = 1; ELSE date_up_to_2010 = 0; IF DateOfInitDx1 >= '01JAN2011'D then date_up_to_2016 = 1; ELSE date_up_to_2016 = 0; END; KEEP DateOfInitDx1 date_up_to_2005 date_up_to_2010 date_up_to_2016; RUN;
Here is the output I get, which is incorrect:
14JUN2012:00:00:00 | 0 | 0 | 1 |
24APR2008:00:00:00 | 0 | 0 | 1 |
07JAN2011:00:00:00 | 0 | 0 | 1 |
25JUN2001:00:00:00 | 0 | 0 | 1 |
19JUL2012:00:00:00 | 0 | 0 | 1 |
02AUG2011:00:00:00 | 0 | 0 | 1 |
14JUN2016:00:00:00 | 0 | 0 | 1 |
16SEP2010:00:00:00 | 0 | 0 | 1 |
21JUN2004:00:00:00 | 0 | 0 | 1 |
Here is what I want:
SAS Output
14JUN2012:00:00:00 | 0 | 1 | 0 |
24APR2008:00:00:00 | 0 | 1 | 0 |
07JAN2011:00:00:00 | 0 | 0 | 1 |
25JUN2001:00:00:00 | 1 | 0 | 0 |
19JUL2012:00:00:00 | 0 | 0 | 1 |
02AUG2011:00:00:00 | 0 | 0 | 1 |
14JUN2016:00:00:00 | 0 | 0 | 1 |
16SEP2010:00:00:00 | 0 | 1 | 1 |
21JUN2004:00:00:00 | 1 | 0 | 0 |
You're running into trouble because you are treating DateOfInitDx1 as if it's a date. But it's not ... it's a date-time.
Inside the DO group, add this statement first:
just_the_date = datepart(DateOfInitDx1);
Then use that new variable instead of DateOfInitDx1 in all of your comparisons.
You have DATETIME variables but you're treating them as DATES with the comparison.
Use DATEPART() function to get just the dates and then use YEAR() to get the years. It makes your comparison a lot simpler:
data practice;
set merge_temp6;
Year_Init_Diag = year(datepart(dateofInitDX1));
date_2005=0; date_2010=0; date_2016=0;
if year <= 2005 the date_2005 =1;
else if year <= 2010 then date_2010 =1;
else if year <=2016 then date_2016=1;
run;
@lady8506 wrote:
I am trying to group dates by every ~5 years (basically creating dummy variables), but my code is not working correctly. Please help. Code is below:
Data practice; SET merge_temp6; IF NOT MISSING(DateOfInitDx1) THEN DO; IF DateOfInitDx1 <= '31DEC2005'D then date_up_to_2005 = 1; ELSE date_up_to_2005 = 0; IF DateOfInitDx1 >= '01JAN2006'D AND DateOfInitDx1 <= '31DEC2010'D then date_up_to_2010 = 1; ELSE date_up_to_2010 = 0; IF DateOfInitDx1 >= '01JAN2011'D then date_up_to_2016 = 1; ELSE date_up_to_2016 = 0; END; KEEP DateOfInitDx1 date_up_to_2005 date_up_to_2010 date_up_to_2016; RUN;Here is the output I get, which is incorrect:
DateOfInitDx1 date_up_to_2005 date_up_to_2010 date_up_to_2016
14JUN2012:00:00:00 0 0 1 24APR2008:00:00:00 0 0 1 07JAN2011:00:00:00 0 0 1 25JUN2001:00:00:00 0 0 1 19JUL2012:00:00:00 0 0 1 02AUG2011:00:00:00 0 0 1 14JUN2016:00:00:00 0 0 1 16SEP2010:00:00:00 0 0 1 21JUN2004:00:00:00 0 0 1
Here is what I want:
SAS Output
DateOfInitDx1 date_up_to_2005 date_up_to_2010 date_up_to_2016
14JUN2012:00:00:00 0 1 0 24APR2008:00:00:00 0 1 0 07JAN2011:00:00:00 0 0 1 25JUN2001:00:00:00 1 0 0 19JUL2012:00:00:00 0 0 1 02AUG2011:00:00:00 0 0 1 14JUN2016:00:00:00 0 0 1 16SEP2010:00:00:00 0 1 1 21JUN2004:00:00:00 1 0 0
You're running into trouble because you are treating DateOfInitDx1 as if it's a date. But it's not ... it's a date-time.
Inside the DO group, add this statement first:
just_the_date = datepart(DateOfInitDx1);
Then use that new variable instead of DateOfInitDx1 in all of your comparisons.
Thank you. With your suggestion I was able to get it to work using the following code:
Data practice; SET merge_temp6; IF NOT MISSING(DateOfInitDx1) THEN DO; just_the_date = year(datepart(DateOfInitDx1)); IF just_the_date <= '2005' then date_up_to_2005 = 1; ELSE date_up_to_2005 = 0; IF just_the_date >= '2006' AND just_the_date <= '2010' then date_up_to_2010 = 1; ELSE date_up_to_2010 = 0; IF just_the_date >= '2011' then date_up_to_2016 = 1; ELSE date_up_to_2016 = 0; END; RUN;
Good job. Also note, you can get rid of those annoying messages about conversion. Just remove the quotes around the years:
2005 not '2005'
2006 not '2006'
2010 not '2010'
Use formats
proc format;
value datef low-'31DEC2005'd = 'Up to 2005'
'01JAN2006'd-'31DEC2010'd = '2006-2010'
... /* You do the rest of the typing */
;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.