Hi folks, this is long, so please forgive, but I'd like to understand best practice around working with date calcs.
We do some intensive querying with SAS and I'm going through and automating some process flows. One of the inconveniences I'm immediatly trying to solve is needing to input numerous specific dates in order to retrieve a desired result when querying. My general appoach to fixing this is to have the user input one or two dates, and create the rest of the desired dates via date-calculation.
I approach date calculation two ways... if the user is unputing "dates" as character strings (e.g. "2015-01-01") Then I substring the string, into yr/mo/day, convert to numeric values, CATT, and date format. I then do some math off of the constituant parts of the character date they are providing (shown below)
DATA Tag_Test;
/* Method One */
/* Specified Date Rage as Character Strings */
Start_D = '01/04/2015';
End_D = '02/12/2015';
Start_Yr = INPUT(SUBSTR(Start_D,7,4),9.);
Start_Mo = INPUT(SUBSTR(Start_D,1,2),9.);
Start_Dy = INPUT(SUBSTR(Start_D,4,2),9.);
End_Yr = INPUT(SUBSTR(End_D,7,4),9.);
End_Mo = INPUT(SUBSTR(End_D,1,2),9.);
End_Dy = INPUT(SUBSTR(End_D,4,2),9.);
LY = Start_Yr - 1;
Start_Date = INPUT(CATT(Start_Mo,'/',Start_Dy,'/',Start_Yr),MMDDYY10.); FORMAT Start_Date Date9.;
End_Date = INPUT(CATT(End_Mo,'/',End_Dy,'/',End_Yr),MMDDYY10.); FORMAT End_Date Date9.;
LY_Start = INPUT(CATT('01','/','01','/',LY),MMDDYY10.); FORMAT LY_Start Date9.;
LY_End = INPUT(CATT('12','/','31','/',LY),MMDDYY10.); FORMAT LY_End Date9.;
TY_Start = INPUT(CATT('01','/','01','/',Start_Yr),MMDDYY10.); FORMAT TY_Start Date9.;
TY_Day2 = INPUT(CATT('01','/','02','/',Start_Yr),MMDDYY10.); FORMAT TY_Day2 Date9.;
TY_Start_Less_1 = INTNX('day',Start_Date,-1); FORMAT TY_Start_Less_1 Date9.;
RUN;
If the user is providing numeric dates within SAS I go with standard DATE functions & INTNX to arrive at the dates needed.
DATA Tag_Test;
/* Method Two */
/* Specified Date Rage from Date Values */
Start_Date = '01FEB2015'd;
FORMAT Start_Date Date9.;
End_Date = '03FEB2015'd;
FORMAT End_Date Date9.;
Start_Yr = Year(Start_Date);
Start_Mo = Month(Start_Date);
Start_Dy = Day(Start_Date);
End_Yr = Year(End_Date);
End_Mo = Month(End_Date);
End_Dy = Day(End_Date);
LY = Start_Yr -1;
LY_Start = INPUT(CATT(01,'/',01,'/',LY),MMDDYY10.); FORMAT LY_Start Date9.;
LY_End = INPUT(CATT(12,'/',31,'/',LY),MMDDYY10.); FORMAT LY_End Date9.;
TY_Start = INPUT(CATT(01,'/',01,'/',Start_Yr),MMDDYY10.); FORMAT TY_Start Date9.;
TY_Day2 = INPUT(CATT(01,'/',02,'/',Start_Yr),MMDDYY10.); FORMAT TY_Day2 Date9.;
TY_Start_Less_1 = INTNX('day',Start_Date,-1); FORMAT TY_Start_Less_1 Date9.;
RUN;
However, then I try to create global macro's to be used in several queries, I run into issues. Mainly, how to do date calculations to create macro variables when I'm not creating variables in a table. The code works fine when SAS understands I'm creating table output. But when I try to code without specifying DATA "table_name"; ......RUN; it does not recognize my date variables as anything but strings. If I were to run the below snippet, and %put LY, it sees LY = "Start_Yr -1", and not the date output.
%Let Start_Date = '01FEB2015'd;
%Let End_Date = '03FEB2015'd;
%Let Start_Yr = Year(Start_Date);
%Let Start_Mo = Month(Start_Date);
%Let Start_Dy = Day(Start_Date);
%Let End_Yr = Year(End_Date);
%Let End_Mo = Month(End_Date);
%Let End_Dy = Day(End_Date);
%Let LY = Start_Yr -1;
%Let LY_Start = INPUT(CATT(01,'/',01,'/',LY),MMDDYY10.);
%Let LY_End = INPUT(CATT(12,'/',31,'/',LY),MMDDYY10.);
%Let TY_Start = INPUT(CATT(01,'/',01,'/',Start_Yr),MMDDYY10.);
%Let TY_Day2 = INPUT(CATT(01,'/',02,'/',Start_Yr),MMDDYY10.);
%Let TY_Start_Less_1 = INTNX('day',Start_Date,-1);
%put &Start_Yr;
There's a couple of things you need to learn:
1. INTNX functions - the fourth parameter is key
2. DATA _NULL_ - allows you to use a data step but not create a data set
3. CALL SYMPUTX() - create macro variables and assign scope.
4. If you're using dates as macro variables you also need to be very explicit - are you creating sas dates, or are you creating dates with a date format that you'll resolve with quotes and the d literal
5. Dates are the number of days from Jan 1, 1960 - you can do math with them, ie add/subtract 1 for some of your calculations.
data _null_;
start_date = '01Feb2015'd;
end_date = '03feb2015'd;
/* 1/1/xx, January first but the year prior to start date*/
date1 = intnx('year', start_date, -1, 'b');
/* 12/31/xx, December 31st but the year prior to the start date*/
date2 = intnx('year', start_date, -1, 'e');
/* 1/1/xx, January first but the current year of start date*/
date3 = intnx('year', start_date, 0, 'b');
/* 1/2/xx, January second but the current year of start date*/
date4 = date3 + 1;
/* start_date -1 day*/
date5 = start_date -1;
*Create macro variables - sas dates;
call symputx('date1', date1, 'g');
call symputx('date2', date2, 'g');
call symputx('date3', date3, 'g');
call symputx('date4', date4, 'g');
call symputx('date5', date5, 'g');
*Create macro variables - Date9 format;
call symputx('date1d', put(date1, date9.), 'g');
call symputx('date2d', put(date2, date9.), 'g');
call symputx('date3d', put(date3, date9.), 'g');
call symputx('date4d', put(date4, date9.), 'g');
call symputx('date5d', put(date5, date9.), 'g');
format date: date9.;
run;
%put Date: &date1. - &date1d.;
%put Date: &date2. - &date2d.;
%put Date: &date3. - &date3d.;
%put Date: &date4. - &date4d.;
%put Date: &date5. - &date5d.;
Hi,
Well lets start with the first part. Why do you need to substring out the date? Why not just input() the string directly into a specified format, in this case:
input(<your_string>,ddmmyy10.);
So there is no need for the first datastep.
For your second datastep, its hard to read all mixed case and no indentation, but I think you are basically taking one day of a date value, not sure why you need all that code?
As for the third part, and this really is one of major bugbears, why use macro? Base SAS is the programming language, it is designed to process and manipulate data. It has built in structures, functions and ways of with all types of data. Now juxtapose to that Macro language, which has only one datatype - string, has virtually no data processing capabilities, and is in most cases far more obfuscated and more complicated than needed? Now there is a place for macro language, and code generation in general, however it is not there to patch every Base SAS coding problem. Remember there are many techniques in programming, from re-structuring your data, putting parameter information in datasets, merging, creating lookups, array processing etc. all deisnged to allow simple access to process the data, so in 99% of standard (i.e. non-library or specific task) Base SAS should be fine to cover the solution. If you can provide examples of where these macro variables might be useful I can show other techinques.
Hi RW9, thanks for the reply, sure I think it's best to just assume I know nothing and start with my user scenerio.
Let's say I have a table with three variables, individual, activity, and timestamp. I wish to filter this table down based on date criteria related to their timestampe.
Given the following,
Start_Date = '01FEB2015'd;
End_Date = '03FEB2015'd;
What is best practice to construct global variables below
/* 1/1/xx, January first but the year prior to start date*/
/* 12/31/xx, December 31st but the year prior to the start date*/
/* 1/1/xx, January first but the current year of start date*/
/* 1/2/xx, January second but the current year of start date*/
/* start_date -1 day*/
RUN;
Well, you can find some code further below. I want to return to the why though. Even if your forced to use macro variables for this, why would you want to create lots of them, some 1 year back, some on a set date etc? Just means more work, do the manipulation at the time it needs to be done. However that being said, macro variables are not good for storage of data - they are just strings so manipulation of them numerically is challenging. Can you not put them in a dataset, merge them on when needed? Would be simpler.
%let start_date='01FEB2015'd; %let end_date='03FEB2015'd; data _null_; call symput('FIRST',cats("01JAN",year(&start_date.)-1)); call symput('SECOND',cats("31DEC",year(&start_date.)-1)); call symput('THIRD',cats("01JAN",year(&start_date.))); call symput('FOURTH',cats("02JAN",year(&start_date.))); run; %put &first.; %put &second.; %put &third.; %put &fourth.;
To me, best practices includes using simple, direct, and easy-to-understand tools. To do that, it would pay to expand your understanding of dates and date functions. Here are some simplifying replacements (applicable to Method 2 but could be adapted as needed to other methods):
LY_Start = INTNX('Year', Start_Dt, -1);
FORMAT LY_Start Date9.;
TY_Day2 = TY_Start + 1;
FORMAT TY_Day2 Date9.;
Note that INTNX also contains a fourth parameter which could be set to "End" of a "Year" time period to get you December 31.
I notice that you are already exposed to INTNX, but using it unnecessarily. This would also be possible:
TY_Start_Less_1 = Start_Date - 1;
FORMAT TY_Start_Less_1 Date9.;
Hi Astounding,
I totally agree everything comes down to fundamentals, and I'm trying to expand mine constantly.
I am indeed familiar with INTNX and have used it extensively in some other queries. To your point I was not using best practice by parsing out consituant date parts when I could just use INTNX.
Given the following,
Start_Date = '01FEB2015'd;
End_Date = '03FEB2015'd;
What is best practice to construct global variables below
/* 1/1/xx, January first but the year prior to start date*/
/* 12/31/xx, December 31st but the year prior to the start date*/
/* 1/1/xx, January first but the current year of start date*/
/* 1/2/xx, January second but the current year of start date*/
/* start_date -1 day*/
RUN;
My opinions ...
* 1/1/xx, January first but the year prior to start date*/
desired_date = intnx('year', start_date, -1);
/* 12/31/xx, December 31st but the year prior to the start date*/
Because I'm sure it works, I would use:
desired_date = intnx('year', start_date, 0) - 1;
But I would have no argument with anyone who looks up and finds the right syntax for fourth parameter to INTNX, using "End".
/* 1/1/xx, January first but the current year of start date*/
desired_date = intnx('year', start_date, 0);
/* 1/2/xx, January second but the current year of start date*/
desired_date = intnx('year', start_date, 0) + 1;
Also note, if you had already computed the January 1 date in the step above, best practice would be:
desired_date = January_1_date + 1;
/* start_date -1 day*/
desired_date = start_date - 1;
This worked great, however one of my assumptions around INTNX was incorrect.
Given the Start_Date was defined as follows:
Start_Date = '08FEB2015'd;
I had assumed LY_Start = INTNX('year',Start_Date,-1); would return '08FEB2014'd, as it takes the date value and modifies the year interval by 1 while holding the day and month constant.
Why does it return '01JAN2014'd; ??
A small adjustment will produce what you are asking for:
Start_Date = '08Feb2015'd;
LY_Start = intnx('year', Start_Date, -1, 'Same');
By default, INTNX gives you the first day of the time period. If your interval is "Year", you will get the first day of the year.
There's a couple of things you need to learn:
1. INTNX functions - the fourth parameter is key
2. DATA _NULL_ - allows you to use a data step but not create a data set
3. CALL SYMPUTX() - create macro variables and assign scope.
4. If you're using dates as macro variables you also need to be very explicit - are you creating sas dates, or are you creating dates with a date format that you'll resolve with quotes and the d literal
5. Dates are the number of days from Jan 1, 1960 - you can do math with them, ie add/subtract 1 for some of your calculations.
data _null_;
start_date = '01Feb2015'd;
end_date = '03feb2015'd;
/* 1/1/xx, January first but the year prior to start date*/
date1 = intnx('year', start_date, -1, 'b');
/* 12/31/xx, December 31st but the year prior to the start date*/
date2 = intnx('year', start_date, -1, 'e');
/* 1/1/xx, January first but the current year of start date*/
date3 = intnx('year', start_date, 0, 'b');
/* 1/2/xx, January second but the current year of start date*/
date4 = date3 + 1;
/* start_date -1 day*/
date5 = start_date -1;
*Create macro variables - sas dates;
call symputx('date1', date1, 'g');
call symputx('date2', date2, 'g');
call symputx('date3', date3, 'g');
call symputx('date4', date4, 'g');
call symputx('date5', date5, 'g');
*Create macro variables - Date9 format;
call symputx('date1d', put(date1, date9.), 'g');
call symputx('date2d', put(date2, date9.), 'g');
call symputx('date3d', put(date3, date9.), 'g');
call symputx('date4d', put(date4, date9.), 'g');
call symputx('date5d', put(date5, date9.), 'g');
format date: date9.;
run;
%put Date: &date1. - &date1d.;
%put Date: &date2. - &date2d.;
%put Date: &date3. - &date3d.;
%put Date: &date4. - &date4d.;
%put Date: &date5. - &date5d.;
@RW9 touched on this and I'm going to get specific:
DATA Tag_Test;
/* Method One */
/* Specified Date Rage as Character Strings */
Start_D = '01/04/2015';
End_D = '02/12/2015';
Start_Yr = INPUT(SUBSTR(Start_D,7,4),9.);
Start_Mo = INPUT(SUBSTR(Start_D,1,2),9.);
Start_Dy = INPUT(SUBSTR(Start_D,4,2),9.);
End_Yr = INPUT(SUBSTR(End_D,7,4),9.);
End_Mo = INPUT(SUBSTR(End_D,1,2),9.);
End_Dy = INPUT(SUBSTR(End_D,4,2),9.);
LY = Start_Yr - 1;
Is far from best practice. Create date values instead of playing with all the substrings.
DATA Tag_Test;
/* Method One */
/* Specified Date Rage as Character Strings */
Start_D = '01/04/2015';
End_D = '02/12/2015';
StartDate = input(start_d, anydtdte12.);
EndDate = input(end_d, anydtdte12.);
format StartDate EndDate mmddyy10.;
;
run;
And you may be suprised what you can accomplish just specifying an appropriate format when working with the values or the date functions in your queries.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.