DATA Step, Macro, Functions and more

How to create age cohorts from a testdate and birth year

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 15
Accepted Solution

How to create age cohorts from a testdate and birth year

I am working with a data set where I only have the variables "BIRTHYEAR" in year formatted best32. and "TESTDATE" formatted yymmdd10.

 

I need to describe the data based on age and would like to create age cohorts. However, since the data captures 15 years of testing, the only way to accurately describe age would be the test date year minus the birth year. How would I code this?

 

Thanks!


Accepted Solutions
Solution
‎09-19-2016 04:58 PM
Respected Advisor
Posts: 4,920

Re: How to create age cohorts from a testdate and birth year

Posted in reply to mphqueens

Minimize the age error with:

 

age = yrdif(mdy(7, 1, birthyear), testdate, "AGE");

 

it should be exact about half the time and never be off by more than one year.

PG

View solution in original post


All Replies
Occasional Contributor
Posts: 15

How to create age cohorts from a testdate and birth year

Posted in reply to mphqueens

Hey all,

 

I am working with a data set where I only have the variables "BIRTHYEAR" in year formatted best32. and "TESTDATE" formatted yymmdd10.

 

I need to describe the data based on age and would like to create age cohorts. However, since the data captures 15 years of testing, the only way to accurately describe age would be the test date year minus the birth year. How would I code this?

 

Thanks!

Super User
Posts: 11,343

Re: How to create age cohorts from a testdate and birth year

Posted in reply to mphqueens

Not saying this is the best but would give an upper bound on calendar age

 

data want;
   set have;
   age = intck('year',mdy(1,1,Birthyear),testdate);
run;
Super User
Posts: 19,772

Re: How to create age cohorts from a testdate and birth year

Posted in reply to mphqueens

 

I would simply recommend using the year as the variable and groupoing according to that.

 

Also, you may want to look at some demographic methods, there are corrections for when you approximate age as a continuous variable to be an interval variable, depending on what type of statistics you're trying to measure.

 

 

Solution
‎09-19-2016 04:58 PM
Respected Advisor
Posts: 4,920

Re: How to create age cohorts from a testdate and birth year

Posted in reply to mphqueens

Minimize the age error with:

 

age = yrdif(mdy(7, 1, birthyear), testdate, "AGE");

 

it should be exact about half the time and never be off by more than one year.

PG
Occasional Contributor
Posts: 15

Re: How to create age cohorts from a testdate and birth year

Thank you! When I run these, I get some values such as age = -4. I believe that some of the lines do not have test date or birth year values inputted. How would I deal with this?

Respected Advisor
Posts: 4,920

Re: How to create age cohorts from a testdate and birth year

Posted in reply to mphqueens

missing birth years or test dates should generate missing ages. Can't say more without seeing data.

PG
Super User
Posts: 11,343

Re: How to create age cohorts from a testdate and birth year

[ Edited ]
Posted in reply to mphqueens

mphqueens wrote:

Thank you! When I run these, I get some values such as age = -4. I believe that some of the lines do not have test date or birth year values inputted. How would I deal with this?


 

Likely to be a data entry issue. If the birth year is after the data of a test there is something fishy.

 

Data problem;

   set have;

   where birthyear > year(testdate);

run;

 

You may want to drop non-identification variables above just to look at problem records.

 

I have some inherited data from 1994 that still has "test dates" in the future. There are 6 records that the tests will occur in 2021 or 2020. We also people getting tested as early as the 11th century (only 979 years before birth) and several hundred where the test was before birth by varying amounts.

 

So check your raw data, see if some of the values can be corrected and prepare to remove some from your analysis set.

One of the first things I do with any dataset with dates is to see if they may sense. Some additional checks besides the relational birth should be before test are: Birth year in expected range? Test date in explected range (should not be greater than today unless it is "scheduled to be tested" date. You may check to see if any of your test dates are missing due to conversion problems. If you read a text file with an improbable date: day 31 in month with only 30 days, leap day not in leap year, Feb 30 or 31, year prior to 1581 or post 19999 (largest my data sources have thrown to me was 19983) and such. If truly paranoid you might verify that the day of the week for the test date makes sense or did not occur on a holiday if that should not happen.

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 7 replies
  • 412 views
  • 1 like
  • 4 in conversation