Hi,
I am trying to calculate the age of individuals in my data set using their date of birth and the date their clinical treatment started. After that, I want to create age groups based on their age at admission.
Please note that the data covers a period of time, and an individual may have been admitted on multiple occasions.
An example of the data:
clientid dob clstartdate
123 10435 18309
122 3982 19452
156 -13125 20065
189 11497 17875
192 10977 19912
201 3133 20391
How would I go about calculating their age?
Use the intck() function:
data have;
input clientid dob clstartdate;
cards;
123 10435 18309
122 3982 19452
156 -13125 20065
189 11497 17875
192 10977 19912
201 3133 20391
;
run;
data want;
set have;
/* with the default method, intck counts the year boundaries (1 January) between the two dates */
age_at_clt = intck('year',dob,clstartdate);
run;
Or you could use a basic calculation:
age_at_clt = (clstartdate - dob) / 365.25;
Or, for special types of "age":
age_at_clt = year(clstartdate) - year(dob);
So it's up to you to define what "age" is supposed to be in your context.
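To see how these definitions differ on the sample data, they can be computed side by side. This is only a sketch: the data set name compare and the new variable names are made up for illustration, and it assumes the have data set created above. One thing worth knowing is that, with its default (discrete) method, intck('year', ...) counts the 1 January boundaries between the two dates, so it returns the same value as the year() difference. The optional 'c' (continuous) method counts completed years measured from the date of birth instead.

```sas
/* Sketch: compare several "age" definitions on the HAVE data set from above. */
data compare;
  set have;
  age_intck   = intck('year', dob, clstartdate);       /* default method: year boundaries crossed */
  age_intck_c = intck('year', dob, clstartdate, 'c');  /* continuous method: completed years since dob */
  age_days    = (clstartdate - dob) / 365.25;          /* fractional years, average year length */
  age_years   = year(clstartdate) - year(dob);         /* difference of calendar years */
  format dob clstartdate date9.;                       /* show the raw day counts as dates */
run;

proc print data=compare noobs;
run;
```

Printing the result with the dates formatted makes it easier to judge which definition matches the "age" you actually need.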
Kurt, are the calculations you gave accurate? I know the 365.25 is just an approximation, but how about the others? For a long time we used:
int((intck('month',<birth>,<start>)-(day(<birth>)> day(<start>))) /12)
to get the most accurate age, but maybe functions like year() now cover this?
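For anyone who wants to try the month-based formula, here is a minimal sketch that runs it on two made-up boundary cases: the day before a 30th birthday and the birthday itself. The data set name check, the variable names, and the two test dates are assumptions for illustration, not part of the original data.

```sas
/* Sketch: month-based age formula checked against two hypothetical test dates. */
data check;
  input birth :date9. start :date9.;
  format birth start date9.;
  /* whole months between the dates, minus one if the day of month has not been reached yet */
  age_months = int((intck('month', birth, start) - (day(birth) > day(start))) / 12);
  /* plain calendar-year difference, for comparison */
  age_years  = year(start) - year(birth);
  datalines;
15MAR1990 14MAR2020
15MAR1990 15MAR2020
;
run;

proc print data=check noobs;
run;
/* age_months is 29 for the first row and 30 for the second; age_years is 30 for both */
```

The first row is where the definitions part ways: the calendar-year difference already reports 30 one day before the birthday.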
Those three calculations were just the proverbial "tip of the iceberg".
There are so many ways to go about it, and so many more ways to fine-tune each of them, that in the end it is up to the developer to decide which one to use.
It comes down to trying those methods and comparing their results with the desired outcome over a sufficiently large set of test cases. Just defining the test cases can be a major challenge (and is often the most important part of development; I'm a big supporter of test-driven development).
The 365.25 approximation should work reasonably well for timespans in the range of lifetimes. It depends on whether a "legal" age value is needed, or a value best suited for statistics.
I would like to use
age_at_clt = yrdif(dob,clstartdate,'age');
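As a rough sketch of how this could feed the age groups asked about in the original question: yrdif() with the 'age' basis returns a fractional age, so floor() is used here to get completed years, and a user-defined format maps those to groups. The format name agegrp and the age bands are hypothetical and should be replaced with whatever grouping the analysis requires. The have data set from the earlier reply is assumed.

```sas
/* Sketch: age at each admission plus an age group; the bands are hypothetical. */
proc format;
  value agegrp
    low -< 18 = 'Under 18'
    18  -< 40 = '18-39'
    40  -< 65 = '40-64'
    65 - high = '65+';
run;

data want;
  set have;
  age_at_clt = floor(yrdif(dob, clstartdate, 'age'));  /* completed years at treatment start */
  age_group  = put(age_at_clt, agegrp.);               /* character label from the format */
  format dob clstartdate date9.;
run;
```

Because the age is calculated per row, a client with several admissions gets a separate age and age group for each one.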