Help with grouping by date range.
I have the following code in a Data Step and PROC SQL; however, my dates are only assigned to the "Other" category. I cannot figure out the problem. Any help is greatly appreciated.
Data grouping;
set lalic;
length Group $15;
if cover_start_dt < '01JAN2009'd then Group= "Less Than";
else if cover_start_dt >='01JAN2009'D and cover_start_dt <='31DEC2009'D then Group = "2009";
else if cover_start_dt >='01JAN2010'D and cover_start_dt <='31DEC2010'D then Group = "2010";
else if cover_start_dt >='01JAN2011'D and cover_start_dt <='31DEC2011'D then Group = "2011";
else if cover_start_dt >='01JAN2012'D and cover_start_dt <='31DEC2012'D then Group = "2012";
else if cover_start_dt >='01JAN2013'D and cover_start_dt <='31DEC2013'D then Group = "2013";
else if cover_start_dt >='01JAN2014'D and cover_start_dt <='31DEC2014'D then Group = "2014";
else Group = "Other";
keep lalic_ID Admit_dt Cover_start_dt Cover_end_dt zip_cd importFile_ID Group;
run;
proc sql;
create table lalic2 as
select
t2.encounter_id,
t2.medrec_no,
datepart (t2.admit_dt)as Admission_dt format=mmddyy10.,
datepart (t2.cover_start_dt) as Start_dt format=mmddyy10.,
datepart (t2.cover_end_dt) as End_dt format=mmddyy10.,
t2.gender,
t2.age,
t2.race
case
when (t2.cover_start_dt < '01JAN2009'D) then "Less Than"
when (t2.cover_start_dt >='01JAN2009'D and t2.cover_start_dt <='31DEC2009'D) then "2009"
when (t2.cover_start_dt >='01JAN2010'D and t2.cover_start_dt <='31DEC2010'D) then "2010"
when (t2.cover_start_dt >='01JAN2011'D and t2.cover_start_dt <='31DEC2011'D) then "2011"
when (t2.cover_start_dt >='01JAN2012'D and t2.cover_start_dt <='31DEC2012'D) then "2012"
when (t2.cover_start_dt >='01JAN2013'D and t2.cover_start_dt <='31DEC2013'D) then "2013"
when (t2.cover_start_dt >='01JAN2014'D and t2.cover_start_dt <='31DEC2014'D) then "2014"
else "Other"
end
as Dategroup
from sqldb.dbo_lalic t2;
quit;
You are correct. Below is the exact copy of the code I used with the Group= included. I was able to run a proc freq to give me what I needed. I tried to use the year but I needed a specific date range. When I used "year" only, it did not assign the data to the groups correctly.
Data grouping;
set lalic;
length Group $15;
Date=datepart(cover_start_dt);
if date >= '01JAN2009'd and date <= '31DEC2009'D then
Group="2009";
else if date >='01JAN2010'D and date <='31DEC2010'D then
Group= "2010";
else if date >='01JAN2011'D and date <='31DEC2011'D then
Group= "2011";
else if date >='01JAN2012'D and date <='31DEC2012'D then
Group= "2012";
else if date >='01JAN2013'D and date <='31DEC2013'D then
Group= "2013";
else if date >='01JAN2014'D and date <='31DEC2014'D then
Group= "2014";
else Group='Other';
keep Lalic_ID Admit_dt Cover_start_dt Cover_end_dt zip_cd importFile_ID Group;
proc freq data= grouping nlevels;
table group /out=Groupdate outcum nopercent;
run;
I have a feeling that your 'cover_start_dt" is a datetime variable instead. I have no idea on the impact of efficiency, but I will just do the following to avoid typing much:
Data grouping;
set lalic;
length Group $15;
_n_=year(datepart(cover_start_dt));
if _n_ <2009, then Group= "Less Than";
else if _n_ >2014 then Group='Others';
else group=put(_n_,4.);
run;
Message was edited by: haikuo bian
You are correct. Cover_start_dt is a datetime variable. I will definitely give your code a try and let you know how it works.
Hi Hai.Kuo,
The code worked to group 2009 but not for 2104 because I do not have any dates >2014. I need date ranges between 1/1/2014 to 12/31/2014.
Thanks,
Kolynda
Thank you very much for your help! The following code worked:
Data grouping;
set lalic;
length Group $15;
if a < '01JAN2009'd then "Less Than";
else if a >='01JAN2010'D and a <='31DEC2010'D then "2010";
else if a >='01JAN2011'D and a <='31DEC2011'D then "2011";
else if a >='01JAN2012'D and a <='31DEC2012'D then "2012";
else if a >='01JAN2013'D and a <='31DEC2013'D then "2013";
else if a >='01JAN2014'D and a <='31DEC2014'D then "2014";
else Group='Others';
run;
That isn't valid SAS code, you're missing Group= from each of the if statements.
You can also use a smaller set of code that is a bit more dynamic:
year=year(datepart(a));
if year<2009 then group="Less Than";
else if year>2014 then group="Others";
else group=put(year, 4. -l);
If your ranges change in the future you can easily change the boundaries.
You are correct. Below is the exact copy of the code I used with the Group= included. I was able to run a proc freq to give me what I needed. I tried to use the year but I needed a specific date range. When I used "year" only, it did not assign the data to the groups correctly.
Data grouping;
set lalic;
length Group $15;
Date=datepart(cover_start_dt);
if date >= '01JAN2009'd and date <= '31DEC2009'D then
Group="2009";
else if date >='01JAN2010'D and date <='31DEC2010'D then
Group= "2010";
else if date >='01JAN2011'D and date <='31DEC2011'D then
Group= "2011";
else if date >='01JAN2012'D and date <='31DEC2012'D then
Group= "2012";
else if date >='01JAN2013'D and date <='31DEC2013'D then
Group= "2013";
else if date >='01JAN2014'D and date <='31DEC2014'D then
Group= "2014";
else Group='Other';
keep Lalic_ID Admit_dt Cover_start_dt Cover_end_dt zip_cd importFile_ID Group;
proc freq data= grouping nlevels;
table group /out=Groupdate outcum nopercent;
run;
Also, when I am copying and pasting my code into the Editor it is truncating my code or putting in paragraph form. I am now mindful to ensure my code is correct before replying. Thanks again.
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.