BookmarkSubscribeRSS Feed
Jeff_DOC
Pyrite | Level 9

Hello there.

 

I am trying to determine age over many, many records. I'm using the function: int(yrdif(dob, date_01, 'AGE')). 

 

Both DOB and Date_01 are formatted MMDDYY10. However, I get no results for some records. The only pattern I can see is that the no result records are for people born prior to January 1, 1960. I'm assuming this is because prior to that the result is negative but that's only an assumption. I've tried the option yearcutoff=1910 but that doesn't seem to help. Can anyone tell me if there's a better way to determine age in the situation where some records are pre and some are post?  

 

Thank you very much.

11 REPLIES 11
PaigeMiller
Diamond | Level 26

I don't see a problem in this very small test:

 

data abc;
    input (dob date_01) (yymmdd10.);
    age=int(yrdif(dob,date_01,'AGE'));
datalines;
1955-01-01 2025-08-15
;

 

Format of the variables is irrelevant. Negative dates are irrelevant. If the dates are truly dates, then YEARCUTOFF= should be irrelevant.

 

Please provide the full data step or full SQL code, not a fraction of a line of code as in your original message. Also, if there are WARNINGS or ERRORs in the log, please show us the log for this data step or SQL (and not just the lines with the WARNINGs or ERRORs, we need to see the full log for this data step or SQL.)

 

Please provide a small sample of data that includes dates before and after 01JAN1960. Best would be if the dates are UNformatted (yes UNFORMATTED). Please provide the data as DATA step code, according to these examples and instructions as I have done above;  or I think this short program will work as well in this situation on a small subset of the data with dates before and after 01JAN1960. Do not provide the data via COPY/PASTE from Excel, do not provide the data as an Excel file.

 

proc print data=abc; /* Use your real data set name here */
    var dob date_01 age;
    format dob date_01; /* Unformatted */
run;

 

--
Paige Miller
Cynthia_sas
Diamond | Level 26

Hi:

  I'm not sure what your date_01 variable is. However, testing on age based on today's date, I get these results:

Cynthia_sas_0-1757007966676.jpeg

As you can see/subtract yourself, the ages are all correct given the dates both before and after Jan 1 1960.

I used this code:

Cynthia_sas_1-1757008126888.jpeg

 

I suspect that yearcutoff is not your issue because with a 4 digit year, you should not have any issues. And your use of mmddyy10 for the format  implies that you have a 4 digit year.

Cynthia

Quentin
Super User

Check if your log says "Note: missing values were calculated at... "  That would suggest that either DOB or Date_01 are missing values for some records.  Many people treat that missing values note as an error message.

Jeff_DOC
Pyrite | Level 9

Thanks to all who are contributing.

 

You can see by the attached data example why I'm thinking this is a 1/1/1960 issue. I know the log states a divide by zero option but both dates do exist for those records without an age showing even in the log warning. I believe the warning most likely coming from a unrelated function in the same data step since the step has other included work. The missing data begins prior to 1/1/1960 (date zero) and picks up the first date after 1/1/1960 even though the two dates clearly exist.

 

 

dob=12/31/1959 
date_01=1/11/2024
age_on_date=.

 

Data temp;
	Set temp_01;
    age_on_date = int(yrdif(dob,  date_01,  'AGE'));
run;

NOTE: Division by zero detected at line 229 column 32.
dob date_01 age_on_date
12/31/1959 1/11/2024 .
12/31/1959 1/11/2024 .
1/1/1960 4/28/1987 .
1/1/1960 4/28/1987 .
1/1/1960 4/28/1987 .
1/1/1960 4/28/1987 .
1/1/1960 4/28/1987 .
1/1/1960 4/28/1987 .
1/1/1960 4/28/1987 .
1/1/1960 4/28/1987 .
1/1/1960 4/28/1987 .
1/1/1960 4/28/1987 .
1/1/1960 4/28/1987 .
1/1/1960 4/28/1987 .
1/1/1960 4/28/1987 .
1/1/1960 9/28/2021 .
1/1/1960 9/28/2021 .
1/1/1960 9/28/2021 .
1/1/1960 9/28/2021 .
1/1/1960 9/28/2021 .
1/1/1960 9/28/2021 .
1/9/1960 6/5/2007 47
1/9/1960 6/5/2007 47
1/9/1960 6/5/2007 47
1/9/1960 6/5/2007 47
1/9/1960 6/5/2007 47
1/9/1960 6/5/2007 47
1/9/1960 6/5/2007 47
1/9/1960 6/5/2007 47
1/9/1960 6/5/2007 47
1/9/1960 6/5/2007 47
1/9/1960 6/5/2007 47
1/9/1960 6/5/2007 47
1/9/1960 6/5/2007 47
1/9/1960 6/5/2007 47
1/9/1960 6/5/2007 47
1/9/1960 6/5/2007 47
1/9/1960 6/5/2007 47
1/9/1960 6/5/2007 47
1/9/1960 6/5/2007 47
1/9/1960 6/5/2007 47
1/9/1960 6/5/2007 47
1/9/1960 6/5/2007 47
1/10/1960 12/18/2012 52
1/10/1960 12/18/2012 52
1/10/1960 12/18/2012 52
1/10/1960 12/18/2012 52
1/10/1960 12/18/2012 52
PaigeMiller
Diamond | Level 26

Can we see the unformatted values, as I suggested, by using the PROC PRINT code I provided?

 

Problems like this with dates not doing what is expected are sometimes difficult to debug. This is why I am asking for specific information, we need you to provide the specific information I asked for, and not the information you decide is good enough.

 

Also, I cannot imagine where Division by 0 comes in here unless there's more to the code that you are not showing us. Can you show us the ENTIRE log for this data set (not just the warnings or errors) as I requested?

 

I do not get missings.

 

data abc;
    input (dob date_01) (yymmdd10.);
    age=int(yrdif(dob,date_01,'AGE'));
    format dob date_01 mmddyy10.;
datalines;
1959-12-31 2024-01-11
;

proc print data=abc;
    var dob date_01 age;
    format dob date_01;
run;

 

--
Paige Miller
Jeff_DOC
Pyrite | Level 9

Certainly.

dob date_01 age_on_date
-1 23386 .
-1 23386 .
0 22551 .
0 22551 .
0 22551 .
0 22551 .
0 22551 .
0 22551 .
0 22551 .
0 22551 .
0 22551 .
0 22551 .
0 22551 .
8 17322 47
8 17322 47
8 17322 47
8 17322 47
8 17322 47
8 17322 47
8 17322 47
8 17322 47
8 17322 47
8 17322 47
8 17322 47
8 17322 47
8 17322 47
8 17322 47
8 17322 47
8 17322 47
8 17322 47
8 17322 47
8 17322 47
8 17322 47
8 17322 47
8 17322 47
9 19345 52
9 19345 52
9 19345 52
9 19345 52
9 19345 52
9 19345 52

 

Tom
Super User Tom
Super User

Those values also work.

Spoiler
data have;
  input dob date_01 wrong ;
cards;
-1 23386 .
-1 23386 .
0 22551 .
0 22551 .
0 22551 .
0 22551 .
0 22551 .
0 22551 .
0 22551 .
0 22551 .
0 22551 .
0 22551 .
0 22551 .
8 17322 47
8 17322 47
8 17322 47
8 17322 47
8 17322 47
8 17322 47
8 17322 47
8 17322 47
8 17322 47
8 17322 47
8 17322 47
8 17322 47
8 17322 47
8 17322 47
8 17322 47
8 17322 47
8 17322 47
8 17322 47
8 17322 47
8 17322 47
8 17322 47
8 17322 47
9 19345 52
9 19345 52
9 19345 52
9 19345 52
9 19345 52
9 19345 52
;
proc sort nodupkey;
  by _all_;
run;

data want;
  set have;
  format dob date_01 mmddyy10.;
  right = int(yrdif(dob,  date_01,  'AGE'));
run;

proc print;
run;

Results

Obs           dob       date_01    wrong    right

 1     12/31/1959    01/11/2024       .       64
 2     01/01/1960    09/28/2021       .       61
 3     01/09/1960    06/05/2007      47       47
 4     01/10/1960    12/18/2012      52       52

Check all of the steps in your log.  Make sure they completed and updated the output dataset. Make sure you are actually looking at the updated dataset.

 

PaigeMiller
Diamond | Level 26

@Jeff_DOC 

I requested the unformatted data. Thank you. I also requested the ENTIRE log for this data step (not just the errors or warnings, but every line in the log for this data step), which I still need.

--
Paige Miller
Cynthia_sas
Diamond | Level 26

Hi:

I agree with @PaigeMiller . There's something else happening with your data or your code. Just using some of your data values, I still get the correct values calculated -- even with a date in 1884 thrown in just for additional proof:

Cynthia_sas_0-1757015912326.jpeg

  I wonder what is in your original data that is causing the problem. As demonstrated with multiple examples, the code should generate a correct results, assuming that the date values used by your functions are actual SAS date values.

Cynthia

Tom
Super User Tom
Super User

Something else is the problem. 

Those dates work fine.

data have; 
  input dob :mmddyy. date_01 :mmddyy. wrong ;
  right = int(yrdif(dob,  date_01,  'AGE'));
  format dob date_01 yymmdd10.;
cards;
1/1/1960   4/28/1987   .
1/1/1960   9/28/2021   .
1/10/1960  12/18/2012 52
1/9/1960   6/5/2007   47
12/31/1959 1/11/2024   .
;

Results

Obs           dob       date_01    wrong    right

 1     1960-01-01    1987-04-28       .       27
 2     1960-01-01    2021-09-28       .       61
 3     1960-01-10    2012-12-18      52       52
 4     1960-01-09    2007-06-05      47       47
 5     1959-12-31    2024-01-11       .       64

Most likely the values in the listing you posted are CHARACTER strings that you tried to CONVERT into dates.  Check that step for the source of the errors. Most likely it either read the strings using the wrong informat (or wrongly placed so that it read only some of the string or parts of the strings next to the date).  Or perhaps there were tabs or other invisible characters in the strings that cause the conversion to fail.

 

Note that if they were DATE values printed with the MMDDYY10. format they would look like this instead:

Obs           dob       date_01    wrong    right

 1     01/01/1960    04/28/1987       .       27
 2     01/01/1960    09/28/2021       .       61
 3     01/10/1960    12/18/2012      52       52
 4     01/09/1960    06/05/2007      47       47
 5     12/31/1959    01/11/2024       .       64

Notice how this listing is showing the leading zeros in single digit days and months that was missing in the text strings you posted.

 

Quentin
Super User

Can you replicate the problem with the little step you showed?   Something like:

 

data want ;
  set temp_01 ;
  age_on_date = int(yrdif(dob,  date_01,  'AGE')) ;

  if missing(age_on_date) then put dob= date_01= ;
  format dob date_01 ;
run;

If so, then please post the full log from running that step.  Well maybe, not the full log if it's 1000s of lines from the PUT statement....

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 11 replies
  • 194 views
  • 0 likes
  • 5 in conversation