I have a dataset that just has DOB, which is formatted 01JAN2022. I am trying to find a way to ultimately create 8 different age groups:
1. 0-4 years
2. 5-11 years
3. 12-17 years
4. 18-29 years
5. 30-49 years
6. 50-64 years
7. 65-79 years
8. 80+ years
I was playing around with proc format but I believe I need to convert DOB to age before I format into the age groups. Below is what I have so far:
data b;
set a.linked_pairsxx;
run;
proc format;
value agegroup
0-4 = '<5'
5-11 = '5-11'
12-17 = '12-17'
18-29 = '18-29'
30-49 = '30-49'
50-64 = '50-64'
65-79 = '65-79'
80-high = '80+';
run;
Age to today is calculated via the YRDIF function.
age = floor(yrdif(dob,today()));
The floor function turns age into an integer.
Then age group would be a custom format.
proc format;
value agef 0-4='0-4 Years'
5-11 = '5-11 years'
12-17 = '12-17 years'
/* I'm lazy, you type the rest */
;
and then you apply the custom format agef. to your calculated age.
The whole thing would look like this
proc format;
value agef 0-4='0-4 Years'
5-11 = '5-11 years'
12-17 = '12-17 years'
/* I'm lazy, you type the rest */
;
data want;
set have;
age = floor(yrdif(dob,today()));
format age agef.;
run;
To calculate any AGE you need the DOB and the date that you want the age on. Age can change on any day of the year so is may not be a trivial concept depending on your data. If you have something like a "date of service/activity/purchase" then that would be a likely date to calculate the age. If not you have have to decide what to use.
In a data step one way, IF your date values are actually SAS date values would be
Age = floor( yrdif(DOB, otherdatevariable,'AGE') ) ;
If you do not have another date variable and want to calculate age at a specific date place that in the second position such as '01JAN2022'd the quotes and D tell SAS to treat that as the date value. DO NOT USE any other date appearance, it has to be in the SAS DATEw. format appearance.
The YRDIF function will return a number of years from the first date to the second date but will have a fractional year component, so FLOOR would be used to truncate the decimal portion if you want an integer (normal appearing) age.
Why is using floor(yrdif(...)) recommended, i always used age = intck('year', DOB, Ref_Date, 'C') ?
Age to today is calculated via the YRDIF function.
age = floor(yrdif(dob,today()));
The floor function turns age into an integer.
Then age group would be a custom format.
proc format;
value agef 0-4='0-4 Years'
5-11 = '5-11 years'
12-17 = '12-17 years'
/* I'm lazy, you type the rest */
;
and then you apply the custom format agef. to your calculated age.
The whole thing would look like this
proc format;
value agef 0-4='0-4 Years'
5-11 = '5-11 years'
12-17 = '12-17 years'
/* I'm lazy, you type the rest */
;
data want;
set have;
age = floor(yrdif(dob,today()));
format age agef.;
run;
As well as deciding what reference date you want age calculated from, do you want age as an integer or decimal? If you are also going to do statistics on age like averages, minimums, maximums, standard deviations and so on you would normally use decimal ages.
you want age as of when? today? first diagnosis date? I usually do age=int(date-dob/365.25);
Save $250 on SAS Innovate and get a free advance copy of the new SAS For Dummies book! Use the code "SASforDummies" to register. Don't miss out, May 6-9, in Orlando, Florida.
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.