Hello,
I want to obtain date difference in year between two dates in YYMMDDN8. format
I write this code but it result in an empty Colum for age
Age = yrdif(FIRST_OFFER_DT, caldt, 'AGEYear');
If you run PROC CONTENTS on the dataset, are FIRST_OFFER_DT and CALDT numeric or character?
If they are character, you will need to use the INPUT function to convert them into SAS dates before doing any calculations.
For difference in years between two dates, you'll also think about how you want to define a year. See the BASIS option to YRDIF, or look at INTCK().
Can you post the log from running your code? 'AGEyear' is not a valid value for the third parameter, perhaps you want 'AGE'.
As is, you should see bad messages in the log:
NOTE: Invalid argument to function YRDIF... NOTE: Mathematical operations could not be performed at the following places. The results of the operations have been set to missing values.
NOTE: Invalid argument to function YRDIF(9104,9190,'AGEYear') at ligne 78 colonne 7. Rank_NO=1 CRSP_FUNDNO=000053 MRET=1.000624 CALDT=19850228 FIRST_OFFER_DT=19841204 Age=. _ERROR_=1 _N_=
If your "date" is actually the number
19850228
(something close to 2 million), then this is way beyond the range of dates the SAS functions are meant to deal with. You need to first convert such values to real SAS dates:
caldt = input(put(caldt,z8.),yymmdd8.);
format caldt yymmdd10.;
53 data JOINT_FO1;
54 set JOINT_FO;
55 caldt = input(put(caldt,z8.),yymmdd8.);
56 format caldt yymmdd10.;
57 run;
NOTE: Invalid argument to function INPUT at ligne 55 colonne 9.
Rank_NO=1 CRSP_FUNDNO=000053 MRET=1.024799 CALDT=. FIRST_OFFER_DT=19841204 _ERROR_=1 _N_=31
Since it first happens in observation 31, it lets me suspect the previous dates were convertible. What does PROC CONTENTS say about the date variables you start with?
this is th output of proc contents
4 | CALDT | Num. | 8 | YYMMDD10. | YYMMDD8. | Date |
---|---|---|---|---|---|---|
2 | CRSP_FUNDNO | Num. | 8 | Z6. | 6. | Fund Identifier |
5 | FIRST_OFFER_DT | Num. | 8 | YYMMDDN8. | Date the Fund Was First Offered (from header data) | |
3 | MRET | Num. | 8 | 9.6 | 9.6 | Total Return per Share as of Month End |
1 | Rank_NO | Num. | 8 |
This means no conversion is needed; you should apply the YYMMDD10. format to all your date variables to avoid confusion.
Both the YRDIF and INTCK functions should work, as long as the parameters are valid. Missing input values will result in missing output values.
It looks like you have valid SAS date values, so that's good news. I think the only problem is that you used 'AGEyear' as the third argument rather than 'Age'. When I run with your values:
data _null_ ;
FIRST_OFFER_DT="04Dec1984"d ;
CALDT="28Feb1985"d ;
Age = yrdif(FIRST_OFFER_DT, CALDT, 'AGE');
put (FIRST_OFFER_DT CALDT AGE)(=) ;
format CALDT FIRST_OFFER_DT yymmddn8. ;
run ;
I get:
1 data _null_ ; 2 FIRST_OFFER_DT="04Dec1984"d ; 3 CALDT="28Feb1985"d ; 4 5 Age = yrdif(FIRST_OFFER_DT, CALDT, 'AGE'); 6 7 put (FIRST_OFFER_DT CALDT AGE)(=) ; 8 format CALDT FIRST_OFFER_DT yymmddn8. ; 9 run ; FIRST_OFFER_DT=19841204 CALDT=19850228 Age=0.2356164384
You cannot calculate an age from missing values.
Why not just only attempt it when you have both date values?
You have records in your data where CALDT is missing. If you expect this to happen, and want to have AGE be missing when this happens, you can make the calculation of AGE conditional on both FIRST_OFFER_DT and CALDT being non-missing, like:
if n(FIRST_OFFER_DT, CALDT)=2 then Age = yrdif(FIRST_OFFER_DT, CALDT, 'AGE');
Or you could impute a value for CALDT when it is missing.
But if you're surprised that CALDT is sometimes missing, then you should go back in your program / data earlier to figure out why CALDT is sometimes missing.
It does not matter what format you have attached to the variable to control how it prints. But it does need to have actual DATE values.
You need to use the AGE option, which will calculate age in years, since there is no AGEYEAR option.
845 data test; 846 first_offer_dt = '01JAN2019'd; 847 caldt = today(); 848 Age = yrdif(FIRST_OFFER_DT, caldt, 'AGE'); 849 put (_all_) (=/); 850 put (first_offer_dt caldt) (=date9.); 851 put (first_offer_dt caldt) (=yymmdd10.); 852 run; first_offer_dt=21550 caldt=23240 Age=4.6273972603 first_offer_dt=01JAN2019 caldt=18AUG2023 first_offer_dt=2019-01-01 caldt=2023-08-18
If you want to eliminate the fractional part of the result you can use INT() to get normal age result. Or perhaps use ROUND() if you want something different.
Registration is open! SAS is returning to Vegas for an AI and analytics experience like no other! Whether you're an executive, manager, end user or SAS partner, SAS Innovate is designed for everyone on your team. Register for just $495 by 12/31/2023.
If you are interested in speaking, there is still time to submit a session idea. More details are posted on the website.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.