Help with grouping by date range.
I have the following code in a Data Step and PROC SQL; however, my dates are only assigned to the "Other" category. I cannot figure out the problem. Any help is greatly appreciated.
Data grouping;
set lalic;
length Group $15;
if cover_start_dt < '01JAN2009'd then Group= "Less Than";
else if cover_start_dt >='01JAN2009'D and cover_start_dt <='31DEC2009'D then Group = "2009";
else if cover_start_dt >='01JAN2010'D and cover_start_dt <='31DEC2010'D then Group = "2010";
else if cover_start_dt >='01JAN2011'D and cover_start_dt <='31DEC2011'D then Group = "2011";
else if cover_start_dt >='01JAN2012'D and cover_start_dt <='31DEC2012'D then Group = "2012";
else if cover_start_dt >='01JAN2013'D and cover_start_dt <='31DEC2013'D then Group = "2013";
else if cover_start_dt >='01JAN2014'D and cover_start_dt <='31DEC2014'D then Group = "2014";
else Group = "Other";
keep lalic_ID Admit_dt Cover_start_dt Cover_end_dt zip_cd importFile_ID Group;
run;
proc sql;
create table lalic2 as
select
t2.encounter_id,
t2.medrec_no,
datepart (t2.admit_dt)as Admission_dt format=mmddyy10.,
datepart (t2.cover_start_dt) as Start_dt format=mmddyy10.,
datepart (t2.cover_end_dt) as End_dt format=mmddyy10.,
t2.gender,
t2.age,
t2.race
case
when (t2.cover_start_dt < '01JAN2009'D) then "Less Than"
when (t2.cover_start_dt >='01JAN2009'D and t2.cover_start_dt <='31DEC2009'D) then "2009"
when (t2.cover_start_dt >='01JAN2010'D and t2.cover_start_dt <='31DEC2010'D) then "2010"
when (t2.cover_start_dt >='01JAN2011'D and t2.cover_start_dt <='31DEC2011'D) then "2011"
when (t2.cover_start_dt >='01JAN2012'D and t2.cover_start_dt <='31DEC2012'D) then "2012"
when (t2.cover_start_dt >='01JAN2013'D and t2.cover_start_dt <='31DEC2013'D) then "2013"
when (t2.cover_start_dt >='01JAN2014'D and t2.cover_start_dt <='31DEC2014'D) then "2014"
else "Other"
end
as Dategroup
from sqldb.dbo_lalic t2;
quit;
You are correct. Below is the exact copy of the code I used with the Group= included. I was able to run a proc freq to give me what I needed. I tried to use the year but I needed a specific date range. When I used "year" only, it did not assign the data to the groups correctly.
Data grouping;
set lalic;
length Group $15;
Date=datepart(cover_start_dt);
if date >= '01JAN2009'd and date <= '31DEC2009'D then
Group="2009";
else if date >='01JAN2010'D and date <='31DEC2010'D then
Group= "2010";
else if date >='01JAN2011'D and date <='31DEC2011'D then
Group= "2011";
else if date >='01JAN2012'D and date <='31DEC2012'D then
Group= "2012";
else if date >='01JAN2013'D and date <='31DEC2013'D then
Group= "2013";
else if date >='01JAN2014'D and date <='31DEC2014'D then
Group= "2014";
else Group='Other';
keep Lalic_ID Admit_dt Cover_start_dt Cover_end_dt zip_cd importFile_ID Group;
proc freq data= grouping nlevels;
table group /out=Groupdate outcum nopercent;
run;
I have a feeling that your 'cover_start_dt" is a datetime variable instead. I have no idea on the impact of efficiency, but I will just do the following to avoid typing much:
Data grouping;
set lalic;
length Group $15;
_n_=year(datepart(cover_start_dt));
if _n_ <2009, then Group= "Less Than";
else if _n_ >2014 then Group='Others';
else group=put(_n_,4.);
run;
Message was edited by: haikuo bian
You are correct. Cover_start_dt is a datetime variable. I will definitely give your code a try and let you know how it works.
Hi Hai.Kuo,
The code worked to group 2009 but not for 2104 because I do not have any dates >2014. I need date ranges between 1/1/2014 to 12/31/2014.
Thanks,
Kolynda
Thank you very much for your help! The following code worked:
Data grouping;
set lalic;
length Group $15;
if a < '01JAN2009'd then "Less Than";
else if a >='01JAN2010'D and a <='31DEC2010'D then "2010";
else if a >='01JAN2011'D and a <='31DEC2011'D then "2011";
else if a >='01JAN2012'D and a <='31DEC2012'D then "2012";
else if a >='01JAN2013'D and a <='31DEC2013'D then "2013";
else if a >='01JAN2014'D and a <='31DEC2014'D then "2014";
else Group='Others';
run;
That isn't valid SAS code, you're missing Group= from each of the if statements.
You can also use a smaller set of code that is a bit more dynamic:
year=year(datepart(a));
if year<2009 then group="Less Than";
else if year>2014 then group="Others";
else group=put(year, 4. -l);
If your ranges change in the future you can easily change the boundaries.
You are correct. Below is the exact copy of the code I used with the Group= included. I was able to run a proc freq to give me what I needed. I tried to use the year but I needed a specific date range. When I used "year" only, it did not assign the data to the groups correctly.
Data grouping;
set lalic;
length Group $15;
Date=datepart(cover_start_dt);
if date >= '01JAN2009'd and date <= '31DEC2009'D then
Group="2009";
else if date >='01JAN2010'D and date <='31DEC2010'D then
Group= "2010";
else if date >='01JAN2011'D and date <='31DEC2011'D then
Group= "2011";
else if date >='01JAN2012'D and date <='31DEC2012'D then
Group= "2012";
else if date >='01JAN2013'D and date <='31DEC2013'D then
Group= "2013";
else if date >='01JAN2014'D and date <='31DEC2014'D then
Group= "2014";
else Group='Other';
keep Lalic_ID Admit_dt Cover_start_dt Cover_end_dt zip_cd importFile_ID Group;
proc freq data= grouping nlevels;
table group /out=Groupdate outcum nopercent;
run;
Also, when I am copying and pasting my code into the Editor it is truncating my code or putting in paragraph form. I am now mindful to ensure my code is correct before replying. Thanks again.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.