I have a large dataset, many columns contain date and datetime data. I'm searching for a fancy solution where all of the date and datetime columns can be identified and then offset by a pre-determined number. I realize I can output a dataset using proc contents and then grab the date/datetime variables and tuck them into an array. But I'm hoping there's a less labor intensive way to accomplish offsetting dates and datetimes. I'm providing a tiny dataset where I'm offsetting by 100 days. The data I am working with will contain thousands of variables and hundreds of columns will contain dates/datetimes.
data have;
input id dttm number date;
format dttm datetime14. date mmddyy10.;
datalines;
77 1668138559 92345 22265
107 1678148579 78456 22337
923 1964135557 28940 22340
;
run;
data want;
set have;
dttm = dttm + (100*60*60*24);
date = date + 100;
run;
The function you want to "offset" dates, datetimes and time values is INTNX. You provide an interval, the base date, datetime, or time value, the number to "offset" and optionally an alignment parameter to return the Begining, End, Middle or "Same" alignment. "Same" would be such as incrementing by months but what the returned day of the month to be the "same" as the starting date. Common intervals for dates are :day, week, month, year. If you want to increment a Datetime value with those then interval starts with DT to let SAS know that you are using a datetime value: dtday, dtmonth, dtyear for example.
You want to read up as there are some options with multiples such as Year2 is biennial and shifts such as Year.3 (note the period) that treats March 1 (third common subperiod or month for year) as the start.
So to increment a datetime value : intnx('dtdays', variable,100). If you wanted the same time of day: Intnx('dtday', variable,100,'S'). 'E' would return a time portion of 23:59:59 (one second before midnight).
You can get the FORMAT or INFORMAT of a variable using the functions VFORMAT or VFORMATX and VINFORMAT and VINFORMAT
Something like this will get format, or informat information and variable names if that's what you're looking for instead of proc contents.
Most of the time when I hear "thousands of variables" my first though is why.
data something;
set somedataset (obs=1);
array v (*) _numeric_ ;
do i=1 to dim(v);
formatname = vformat( v[i]);
variablename= vname(v[i] );
end;
So you could use code in side that Do loop to increment the value of the variable v[i] based on the value of Formatname.
data have; input id dttm number date; format dttm datetime14. date mmddyy10.; datalines; 77 1668138559 92345 22265 107 1678148579 78456 22337 923 1964135557 28940 22340 ; run; data want; set have; array v (*) _numeric_ ; do i=1 to dim(v); formatname = vformat( v[i]); if formatname =: 'DATETIME' then v[i] = intnx('dtday',v[i],100,'S'); if formatname =: 'MMDDYY' then v[i] = intnx('day',v[i],100,'S'); end; drop i formatname; run;
'S' doesn't really mean much for days but sometimes having it is nice when you decide to change this to 3 months, for example. Then just change day to month and 100 to 3.
If you haven't used it before the =: is a "begins with" comparison. So you don't have to include the actual length of the format. You can use this with the IN operator in case you need to search for YYMMDD, DATE or DDMMYY formats (or informats)
The simplest way is to get lists of the variables. Then you can use ARRAYs in the step that does the actual conversion.
You can use the FMTINFO() function to detect if the variable has a DATE or DATETIME type format attached to it. For that to work you need to bare NAME of the FORMAT, not the format specification with the width and everything. So use PROC CONTENTS to get the variable list.
proc contents data=have noprint out=contents; run;
proc sql noprint;
select case when fmtinfo(format,'cat')='date' then nliteral(name) else ' ' end
, case when fmtinfo(format,'cat')='datetime' then nliteral(name) else ' ' end
into :datevars separated by ' '
, :dtvars separated by ' '
from contents
where type=1 and format ne ' '
;
quit;
Now that you have the variable lists just make two arrays and two DO loops. Include your DO loop index variable into the array definition and the code will run fine if there are no date or no datetime variables. I assume you don't want to increment missing values.
data want;
set have;
array _dttm index &dtvars;
array _date index &datevars;
do index=2 to dim(_dttm);
if not missing(_dttm[index]) then
_dttm[index] = intnx('dtday',_dttm[index],100,'s')
;
end;
do index=2 to dim(_date);
if not missing(_date[index]) then _date[index]=_date[index]+100;
end;
drop index;
run;
Let's run proc compare and see if it worked.
The COMPARE Procedure Comparison of WORK.HAVE with WORK.WANT (Method=EXACT) Value Comparison Results for Variables __________________________________________________________ || Base Compare Obs || dttm dttm Diff. % Diff ________ || _________ _________ _________ _________ || 1 || 10NOV2012 18FEB2013 8640000 0.5179 2 || 06MAR2013 14JUN2013 8640000 0.5149 3 || 29MAR2022 07JUL2022 8640000 0.4399 __________________________________________________________ __________________________________________________________ || Base Compare Obs || date date Diff. % Diff ________ || _________ _________ _________ _________ || 1 || 12/16/20 03/26/21 100.0000 0.4491 2 || 02/26/21 06/06/21 100.0000 0.4477 3 || 03/01/21 06/09/21 100.0000 0.4476 __________________________________________________________
You could also just generate a separate IF/THEN/Assignment statement for each variable. For example by writing them to a file and then %INCLUDING the file.
filename code temp;
data _null_;
file code;
set contents;
where type=1 and format ne ' ';
length nliteral $80;
nliteral=nliteral(name);
select (fmtinfo(format,'cat'));
when ('date') put 'if not missing(' nliteral ') then ' nliteral '=' nliteral '+100;';
when ('datetime') put 'if not missing(' nliteral ') then ' nliteral
'=intnx("dtday",' nliteral ',100,"s");' ;
otherwise;
end;
run;
data want;
set have;
%include code / source2;
run;
Result
212 data want; 213 set have; 214 %include code / source2; NOTE: %INCLUDE (level 1) file CODE is (system-specific pathname). 215 +if not missing(date ) then date =date +100; 216 +if not missing(dttm ) then dttm =intnx("dtday",dttm ,100,"s"); NOTE: %INCLUDE (level 1) ending. 217 run; NOTE: The data set WORK.WANT has 3 observations and 4 variables. NOTE: DATA statement used (Total process time): real time 0.00 seconds cpu time 0.00 seconds
Catch the best of SAS Innovate 2025 — anytime, anywhere. Stream powerful keynotes, real-world demos, and game-changing insights from the world’s leading data and AI minds.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.