BookmarkSubscribeRSS Feed
sasphd
Lapis Lazuli | Level 10

Hello,

I want to obtain date difference in year between two dates in YYMMDDN8. format

I write this code but it result in an empty Colum for age 

Age = yrdif(FIRST_OFFER_DT, caldt, 'AGEYear');

 

13 REPLIES 13
Quentin
Super User

If you run PROC CONTENTS on the dataset, are FIRST_OFFER_DT and CALDT numeric or character?

 

If they are character, you will need to use the INPUT function to convert them into SAS dates before doing any calculations.

 

For difference in years between two dates, you'll also think about how you want to define a year.  See the BASIS option to YRDIF, or look at INTCK().

BASUG is hosting free webinars Next up: Don Henderson presenting on using hash functions (not hash tables!) to segment data on June 12. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.
Quentin
Super User

Can you post the log from running your code?  'AGEyear' is not a valid value for the third parameter, perhaps you want 'AGE'.

 

As is, you should see bad messages in the log:

NOTE: Invalid argument to function YRDIF...

NOTE: Mathematical operations could not be performed at the following places. The results of the operations have been set to missing values.
BASUG is hosting free webinars Next up: Don Henderson presenting on using hash functions (not hash tables!) to segment data on June 12. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.
sasphd
Lapis Lazuli | Level 10

NOTE: Invalid argument to function YRDIF(9104,9190,'AGEYear') at ligne 78 colonne 7.
Rank_NO=1 CRSP_FUNDNO=000053 MRET=1.000624 CALDT=19850228 FIRST_OFFER_DT=19841204 Age=. _ERROR_=1 _N_=
Kurt_Bremser
Super User

If your "date" is actually the number 

19850228

(something close to 2 million), then this is way beyond the range of dates the SAS functions are meant to deal with. You need to first convert such values to real SAS dates:

caldt = input(put(caldt,z8.),yymmdd8.);
format caldt yymmdd10.;
sasphd
Lapis Lazuli | Level 10


53   data JOINT_FO1;
54   set JOINT_FO;
55   caldt = input(put(caldt,z8.),yymmdd8.);
56   format caldt yymmdd10.;
57   run;

NOTE: Invalid argument to function INPUT at ligne 55 colonne 9.
Rank_NO=1 CRSP_FUNDNO=000053 MRET=1.024799 CALDT=. FIRST_OFFER_DT=19841204 _ERROR_=1 _N_=31
Kurt_Bremser
Super User

Since it first happens in observation 31, it lets me suspect the previous dates were convertible. What does PROC CONTENTS say about the date variables you start with?

sasphd
Lapis Lazuli | Level 10

this is th output of proc contents

 

   
4 CALDT Num. 8 YYMMDD10. YYMMDD8. Date
2 CRSP_FUNDNO Num. 8 Z6. 6. Fund Identifier
5 FIRST_OFFER_DT Num. 8 YYMMDDN8.   Date the Fund Was First Offered (from header data)
3 MRET Num. 8 9.6 9.6 Total Return per Share as of Month End
1 Rank_NO Num. 8      
Kurt_Bremser
Super User

This means no conversion is needed; you should apply the YYMMDD10. format to all your date variables to avoid confusion.

Both the YRDIF and INTCK functions should work, as long as the parameters are valid. Missing input values will result in missing output values.

Quentin
Super User

It looks like you have valid SAS date values, so that's good news.  I think the only problem is that you used 'AGEyear' as the third argument rather than 'Age'.  When I run with your values:

data _null_ ;
  FIRST_OFFER_DT="04Dec1984"d ;
  CALDT="28Feb1985"d ;

  Age = yrdif(FIRST_OFFER_DT, CALDT, 'AGE');

  put (FIRST_OFFER_DT CALDT AGE)(=) ;
  format CALDT FIRST_OFFER_DT yymmddn8. ;
run ;

I get:

1    data _null_ ;
2      FIRST_OFFER_DT="04Dec1984"d ;
3      CALDT="28Feb1985"d ;
4
5      Age = yrdif(FIRST_OFFER_DT, CALDT, 'AGE');
6
7      put (FIRST_OFFER_DT CALDT AGE)(=) ;
8      format CALDT FIRST_OFFER_DT yymmddn8. ;
9    run ;

FIRST_OFFER_DT=19841204 CALDT=19850228 Age=0.2356164384

 

BASUG is hosting free webinars Next up: Don Henderson presenting on using hash functions (not hash tables!) to segment data on June 12. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.
sasphd
Lapis Lazuli | Level 10
the problem is that this program generate a lot of message in the log

FIRST_OFFER_DT=19830101 CALDT=. Age=.
FIRST_OFFER_DT=19830101 CALDT=. Age=.
FIRST_OFFER_DT=19830101 CALDT=. Age=.
FIRST_OFFER_DT=19830101 CALDT=. Age=.
FIRST_OFFER_DT=19830101 CALDT=. Age=.
FIRST_OFFER_DT=19830101 CALDT=. Age=.
FIRST_OFFER_DT=19830101 CALDT=. Age=.
Tom
Super User Tom
Super User

You cannot calculate an age from missing values.

Why not just only attempt it when you have both date values?

 

Quentin
Super User

You have records in your data where CALDT is missing.  If you expect this to happen, and want to have AGE be missing when this happens, you can make the calculation of AGE conditional on both FIRST_OFFER_DT and CALDT being non-missing, like:

if n(FIRST_OFFER_DT, CALDT)=2 then Age = yrdif(FIRST_OFFER_DT, CALDT, 'AGE');

Or you could impute a value for CALDT when it is missing.

But if you're surprised that CALDT is sometimes missing, then you should go back in your program  / data earlier to figure out why CALDT is sometimes missing.

BASUG is hosting free webinars Next up: Don Henderson presenting on using hash functions (not hash tables!) to segment data on June 12. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.
Tom
Super User Tom
Super User

It does not matter what format you have attached to the variable to control how it prints.  But it does need to have actual DATE values.

You need to use the AGE option, which will calculate age in years, since there is no AGEYEAR option.

845  data test;
846    first_offer_dt = '01JAN2019'd;
847    caldt = today();
848    Age = yrdif(FIRST_OFFER_DT, caldt, 'AGE');
849    put (_all_) (=/);
850    put (first_offer_dt caldt) (=date9.);
851    put (first_offer_dt caldt) (=yymmdd10.);
852  run;


first_offer_dt=21550
caldt=23240
Age=4.6273972603
first_offer_dt=01JAN2019 caldt=18AUG2023
first_offer_dt=2019-01-01 caldt=2023-08-18

If you want to eliminate the fractional part of the result you can use INT() to get normal age result. Or perhaps use ROUND() if you want something different.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 13 replies
  • 875 views
  • 0 likes
  • 4 in conversation