Hi,
I have a dataset in which there is a variable (diagnosis_dt) for which there are values in the DDMONYYYY format. I want to create a categorical variable out of this data, and I want to make 3 categories: <= 1 year, 2-7 years, and >= 8 years.
My original set is EA1_1.
Here is what I have tried. I admittedly have no idea what I am doing. When I input this, all of my outputs end up being 2-7 years.
data EA1_2;
set EA1_1;
length TimeSinceDiagnosis $3.;
IF diagnosis_dt > = '31DEC2012'd THEN TimeSinceDiagnosis ='<1yr';
ELSE IF diagnosis_dt < = '31DEC2012'd or diagnosis_dt > = '31DEC2005'd THEN TimeSinceDiagnosis = '2-7yrs';
ELSE IF diagnosis_dt < = '01JAN2005'd THEN TimeSinceDiagnosis = '> 8yrs';
run;
1) Specify a greater length to TimeSinceDiagnosis.
2) All the diagnosis dates I can see are before 31DEC2012. So the first else-if statement is always fulfilled, since you specify an OR condition and not an AND condition. That way, you never get to the >8Years part. Try the code below.
data EA1_2;
set EA1_1;
length TimeSinceDiagnosis $20;
IF diagnosis_dt > = '31DEC2012'd THEN
TimeSinceDiagnosis ='<1yr';
ELSE IF diagnosis_dt < = '31DEC2012'd and diagnosis_dt > = '31DEC2005'd THEN
TimeSinceDiagnosis = '2-7yrs';
ELSE IF diagnosis_dt < = '01JAN2005'd THEN
TimeSinceDiagnosis = '> 8yrs';
run;
Can you show us what your data looks like?
Here is a screenshot of what I am looking at!
1) Specify a greater length to TimeSinceDiagnosis.
2) All the diagnosis dates I can see are before 31DEC2012. So the first else-if statement is always fulfilled, since you specify an OR condition and not an AND condition. That way, you never get to the >8Years part. Try the code below.
data EA1_2;
set EA1_1;
length TimeSinceDiagnosis $20;
IF diagnosis_dt > = '31DEC2012'd THEN
TimeSinceDiagnosis ='<1yr';
ELSE IF diagnosis_dt < = '31DEC2012'd and diagnosis_dt > = '31DEC2005'd THEN
TimeSinceDiagnosis = '2-7yrs';
ELSE IF diagnosis_dt < = '01JAN2005'd THEN
TimeSinceDiagnosis = '> 8yrs';
run;
Anytime 🙂
Your mistake is here
ELSE IF diagnosis_dt < = '31DEC2012'd or diagnosis_dt > = '31DEC2005'd
it should be
ELSE IF diagnosis_dt < = '31DEC2012'd and diagnosis_dt > = '31DEC2005'd
You can streamline your code to make it easier to use in the future:
%let now='01jul2013'd; /* just one date in 2013 */
/* this is all you have to set in the future */
/* the following calculates the cutoff dates from the initial value */
%let cut1 = %sysfunc(intnx(year,&now.,-1,e));
%let cut7 = %sysfunc(intnx(year,&now.,-8,e));
/* the following is for informational purposes only, you can remove it later */
data _null_;
cut1 = &cut1;
cut7 = &cut7;
put cut1= yymmdd10.;
put cut7= yymmdd10.;
run;
/* this is your streamlined code */
data EA1_2;
set EA1_1;
length TimeSinceDiagnosis $7.;
if diagnosis_dt > &cut1. then TimeSinceDiagnosis = '<1yr';
else if diagnosis_dt > &cut7. then TimeSinceDiagnosis = '2-7yrs';
else TimeSinceDiagnosis = '>=8yrs';
run;
/* note that the else-if makes some conditions redundant */
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9.
Lock in the best rate now before the price increases on April 1.
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.