BookmarkSubscribeRSS Feed
mmm7
Calcite | Level 5

I have a large dataset, many columns contain date and datetime data. I'm searching for a fancy solution where all of the date and datetime columns can be identified and then offset by a pre-determined number. I realize I can output a dataset using proc contents and then grab the date/datetime variables and tuck them into an array. But I'm hoping there's a less labor intensive way to accomplish offsetting dates and datetimes.  I'm providing a tiny dataset where I'm offsetting by 100 days. The data I am working with will contain thousands of variables and hundreds of columns will contain dates/datetimes.

 

data have;
	input id dttm number date; 
	format dttm datetime14. date mmddyy10.; 
	datalines;
77 1668138559 92345 22265 
107 1678148579 78456 22337 
923 1964135557 28940 22340  
;
run;
data want;
	set have;
	dttm = dttm + (100*60*60*24);
	date = date + 100;
run;

 

2 REPLIES 2
ballardw
Super User

The function you want to "offset" dates, datetimes and time values is INTNX. You provide an interval, the base date, datetime, or time value, the number to "offset" and optionally an alignment parameter to return the Begining, End, Middle or  "Same" alignment. "Same" would be such as incrementing by months but what the returned day of the month to be the "same" as the starting date. Common intervals for dates are :day, week, month, year. If you want to increment a Datetime value with those then interval starts with DT to let SAS know that you are using a datetime value: dtday, dtmonth, dtyear for example.

You want to read up as there are some options with multiples such as Year2 is biennial and shifts such as Year.3 (note the period) that treats March 1 (third common subperiod or month for year) as the start.

 

So to increment a datetime value : intnx('dtdays', variable,100). If you wanted the same time of day: Intnx('dtday', variable,100,'S'). 'E' would return a time portion of 23:59:59 (one second before midnight).

 

You can get the FORMAT or INFORMAT of a variable using the functions VFORMAT or VFORMATX and VINFORMAT and VINFORMAT

 

Something like this will get format, or informat information and variable names if that's what you're looking for instead of proc contents.

 

Most of the time when I hear "thousands of variables" my first though is why.

 

data something;

   set somedataset (obs=1);

   array v (*) _numeric_ ;

  do i=1 to dim(v);

     formatname = vformat( v[i]);

     variablename= vname(v[i] );

end;

So you could use code in side that Do loop to increment the value of the variable v[i] based on the value of Formatname.

 

data have;
	input id dttm number date; 
	format dttm datetime14. date mmddyy10.; 
	datalines;
77 1668138559 92345 22265 
107 1678148579 78456 22337 
923 1964135557 28940 22340  
;
run;
data want;
	set have;
   array v (*) _numeric_ ;
   do i=1 to dim(v);

     formatname = vformat( v[i]);
     if formatname =: 'DATETIME' then v[i] = intnx('dtday',v[i],100,'S');
     if formatname =: 'MMDDYY'   then v[i] = intnx('day',v[i],100,'S');
      
     end;
     drop i formatname;
run;

'S' doesn't really mean much for days but sometimes having it is nice when you decide to change this to 3 months, for example. Then just change day to month and 100 to 3.

 

If you haven't used it before the =: is a "begins with" comparison. So you don't have to include the actual length of the format. You can use this with the IN operator in case you need to search for YYMMDD, DATE or DDMMYY formats (or informats)

Tom
Super User Tom
Super User

The simplest way is to get lists of the variables.  Then you can use ARRAYs in the step that does the actual conversion.

 

You can use the FMTINFO() function to detect if the variable has a DATE or DATETIME type format attached to it.  For that to work you need to bare NAME of the FORMAT, not the format specification with the width and everything.  So use PROC CONTENTS to get the variable list.

proc contents data=have noprint out=contents; run;
proc sql noprint;
select case when fmtinfo(format,'cat')='date' then nliteral(name) else ' ' end
 , case when fmtinfo(format,'cat')='datetime' then nliteral(name) else ' ' end
  into :datevars separated by ' '
     , :dtvars separated by ' '
from contents
where type=1 and format ne ' '
;
quit;

Now that you have the variable lists just make two arrays and two DO loops.  Include your DO loop index variable into the array definition and the code will run fine if there are no date or no datetime variables.  I assume you don't want to increment missing values.

data want;
  set have;
  array _dttm index &dtvars;
  array _date index &datevars;
  do index=2 to dim(_dttm);
    if not missing(_dttm[index]) then 
       _dttm[index] = intnx('dtday',_dttm[index],100,'s')
    ;
  end;
  do index=2 to dim(_date);
    if not missing(_date[index]) then _date[index]=_date[index]+100;
  end;
  drop index;
run;

Let's run proc compare and see if it worked.

The COMPARE Procedure
Comparison of WORK.HAVE with WORK.WANT
(Method=EXACT)

Value Comparison Results for Variables

__________________________________________________________
           ||       Base    Compare
       Obs ||       dttm       dttm      Diff.     % Diff
 ________  ||  _________  _________  _________  _________
           ||
        1  ||  10NOV2012  18FEB2013    8640000     0.5179
        2  ||  06MAR2013  14JUN2013    8640000     0.5149
        3  ||  29MAR2022  07JUL2022    8640000     0.4399
__________________________________________________________


__________________________________________________________
           ||       Base    Compare
       Obs ||       date       date      Diff.     % Diff
 ________  ||  _________  _________  _________  _________
           ||
        1  ||   12/16/20   03/26/21   100.0000     0.4491
        2  ||   02/26/21   06/06/21   100.0000     0.4477
        3  ||   03/01/21   06/09/21   100.0000     0.4476
__________________________________________________________

You could also just generate a separate IF/THEN/Assignment statement for each variable.  For example by writing them to a file and then %INCLUDING the file.

filename code temp;
data _null_;
  file code;
  set contents;
  where type=1 and format ne ' ';
  length nliteral $80;
  nliteral=nliteral(name);
  select (fmtinfo(format,'cat'));
    when ('date') put 'if not missing(' nliteral ') then ' nliteral '=' nliteral '+100;';
    when ('datetime') put 'if not missing(' nliteral ') then ' nliteral
      '=intnx("dtday",' nliteral ',100,"s");' ;
    otherwise;
  end;
run;

data want;
  set have;
  %include code / source2;
run;

Result

212  data want;
213    set have;
214    %include code / source2;
NOTE: %INCLUDE (level 1) file CODE is (system-specific pathname).
215 +if not missing(date ) then date =date +100;
216 +if not missing(dttm ) then dttm =intnx("dtday",dttm ,100,"s");
NOTE: %INCLUDE (level 1) ending.
217  run;

NOTE: The data set WORK.WANT has 3 observations and 4 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds

sas-innovate-white.png

Missed SAS Innovate in Orlando?

Catch the best of SAS Innovate 2025 — anytime, anywhere. Stream powerful keynotes, real-world demos, and game-changing insights from the world’s leading data and AI minds.

 

Register now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 676 views
  • 2 likes
  • 3 in conversation