I am trying to identify all date fields in my data. It is a challenge because SAS stores them as number, so if I do something like _numeric_, I get both the numbers and the dates. If they were all the same format, contents would work to find them, but there are many acceptable date formats.
Here's an example. How could I catch both DATE which is in the sashelp data and FAKE which I'm creating in a different format?
data test;
set sashelp.stocks;
format xx mmddyy10.;
fake = today();
run;
Use the FMTINFO() function to find the variables that date category formats attached to them.
Use the output of PROC CONTENTS to get the FORMAT name without the width/decimal specication.
proc contents data=HAVE out=contents noprint; run;
proc sql noprint;
select name into :datevar separated by ' '
from contents
where 'date'=fmtinfo(format,'cat')
;
%let ndates=&sqlobs;
quit;
I can think of several ways to identify or guess which fields are dates:
1) As you noticed all dates are NUM type but can be string too (like "20210317" todays date);
2) For NUM type check FORMAT or INFORMAT compared to a long list of possibilities.
3) If no FORMAT nor INFORMAT assigned then guessing is the only possibility:
If you know your data then non missing INTEGERs may be dates if they are in logical range.
It needs checking all the variables' values in all observations of the data set.
Note: dtn = input(dtx,date9.);
4) Check variable names according to expected language, file was created by.
For English - most names will include 'date', 'dt', 'year', 'yy', 'month', 'mon', 'mm' etc..
Human eyes scanning of the variable names are faster then any programming.
5) Check variable labels if exist or any other Metadate or documentation you might have.
6) If you have already programs using this dataset - check how the program relates to the data:
which functions, formats or informats are used with each guessed variable.
If you do not attach a standard date informat or format to the variable, or name it with the word "date" somewhere in the name or have the word "date" in a label there is no way to "identify all".
And if you have character values that look like dates then even more problematic.
You can search variable metadata for the characteristics above in the SASHELP.Volumns view (or dictionary.columns if using Proc SQL).
If you want "guesses" you could look for numeric variables whose values are integer and in an expected range of values that might be dates but I would be very skeptical about the results in general.
Use the FMTINFO() function to find the variables that date category formats attached to them.
Use the output of PROC CONTENTS to get the FORMAT name without the width/decimal specication.
proc contents data=HAVE out=contents noprint; run;
proc sql noprint;
select name into :datevar separated by ' '
from contents
where 'date'=fmtinfo(format,'cat')
;
%let ndates=&sqlobs;
quit;
@Tom syntax error in the code - should be "PROC CONTENTS ..."
proc contents data=HAVE out=contents noprint; run;
proc sql noprint;
select name into :datevar separated by ' '
from contents
where 'date'=fmtinfo(format,'cat')
;
%let ndates=&sqlobs;
quit;
Yes. Thanks. Fixed it now.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.