I am very confused about working with and manipulating dates from a data file in SAS. I have a text file that I use as an input file in a data step where dates are listed as 20150819, for example. When I input these, I usually use a "date_var $8." informat, storing the variable as an 8-character string, exactly as it appears in my text file. Sometimes, I will input them with a "date_var 8." informat, storing the variable as an 8-digit number. Neither of these informats seems to be helpful, because I get errors trying to use the YRDIF function. The only way I can get the YRDIF function to work is if I input the variables using the yymmdd8. informat.
When I finish all my data manipulation, I want the output displayed as an 8-character date (i.e., 20150819). Ultimately, I don't want to informat the same variable three different ways just so that I can use the yymmdd8. informat for date calculations.
Can someone please explain the differences between informats and formats for dates and why the manipulation and conversions are necessary?
20150819 needs to be read with the yymmdd8. informat. SAS then stores the date as a SAS date value (a count of days, with day 0 = 01/01/1960). To make the date human-readable, assign a proper format for output (see the date and time category in the SAS formats documentation).
So, given your initial description, the input data step should look like
data have;
infile cards;
input date yymmdd8.;
format date yymmddn8.;
cards;
20150819
;
run;
Note the 'n' in the output format; it tells SAS not to use a separator between year, month, and day.
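Since the original question was about YRDIF, here is a minimal sketch of how the pieces might fit together once the values are read as SAS dates. The data set name diffs and the variable names start_date, end_date, and years are made up for illustration, and 'ACT/ACT' is just one of the bases YRDIF accepts.

```sas
/* Sketch only: diffs, start_date, end_date and years are hypothetical names */
data diffs;
    /* the colon modifier lets list input use the yymmdd8. informat */
    input start_date :yymmdd8. end_date :yymmdd8.;
    /* YRDIF needs SAS date values, not 8-digit numbers or strings */
    years = yrdif(start_date, end_date, 'ACT/ACT');
    /* formats change only the display, not the stored values */
    format start_date end_date yymmddn8. years 8.2;
    datalines;
20100819 20150819
;
run;

proc print data=diffs;
run;
```

The dates print as 20100819 and 20150819 because of the yymmddn8. format, while the calculation runs on the underlying day counts.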
When you use the 8. informat, SAS reads the field as a plain number, not a date. SAS stores every numeric variable as an 8-byte floating-point value, and the 8. here only says to read up to 8 characters as a number. If you want to do calculations between dates, the value has to be stored as a SAS date, not as the number 20150819. SAS stores dates as the number of days from Jan 1 1960, so without a format the dates print as numbers, but not the ones you are expecting. There is a wealth of information out there if you google it. Run this code and see if it helps at all, paying attention to the formatting:
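data dates;
infile cards dsd;
/* read the same date three ways: plain number, character string, SAS date */
informat num_date 8. char_date $8. mmddyy mmddyy10.;
format num_date 8. char_date $8. mmddyy mmddyy10.;
input num_date char_date mmddyy;
cards;
20150819,20150819,08/19/2015
;
run;
data formats;
format date9 date9. date_date yymmdd8.;
set dates;
/* same SAS date value, displayed with a different format */
date9=mmddyy;
/* convert the plain number to text, then read that text as a date */
date_date = input(put(num_date,8.),yymmdd8.);
run;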
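To see what is actually stored, it may help to print the converted values with and without a format. This is a small sketch using hypothetical variable names (char_date, num_date, from_char, from_num); it only writes to the log.

```sas
/* Sketch only: all variable names here are hypothetical */
data _null_;
    char_date = '20150819';   /* character, as if read with $8. */
    num_date  = 20150819;     /* plain number, as if read with 8. */
    /* INPUT reads text with an informat and returns a SAS date value */
    from_char = input(char_date, yymmdd8.);
    /* a number must first be turned into text with PUT */
    from_num  = input(put(num_date, 8.), yymmdd8.);
    /* no format: the stored day count since 01JAN1960 */
    put from_char= from_num=;
    /* same values, shown through two different formats */
    put from_char= yymmddn8. from_num= date9.;
run;
```

Both conversions should give the same day count (20319 for 19 August 2015); the second PUT shows that value as 20150819 and 19AUG2015 without changing what is stored.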
You don't mention how you are importing the dataset, but to assign specific informats you will likely need a data step program to read the file. The easiest way is to run proc import, which (at least with Base SAS) writes a skeleton data step to the log; copy that code and change the informats and formats as needed.
SAS date variables are represented as the number of days since 1 Jan 1960, and to use all of the features that apply to dates you do want to create a SAS date valued variable, not a simple numeric as you have been.
The informat that you want to read the data in your example would be yymmdd. To display dates in the same layout, use the yymmddn8. format; the plain yymmdd. format inserts separators.
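For getting the 20150819 layout back on output, something along these lines might work. It is a sketch that assumes a data set named have with a date variable called date; date_text is a hypothetical name for a character copy.

```sas
/* Sketch only: have and date are assumed names, date_text is hypothetical */
data have;
    input date yymmdd8.;
    datalines;
20150819
;
run;

data _null_;
    set have;
    length date_text $8;
    /* write the date as 8 digits with no separators (goes to the log here) */
    put date yymmddn8.;
    /* or keep a character copy of the date in the same layout */
    date_text = put(date, yymmddn8.);
    put date_text=;
run;
```

Adding a FILE statement before the PUT would send the line to an external file instead of the log. The date variable itself stays a SAS date throughout, so it never needs to be read a second time.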
One of the very nice things about having your data as SAS date values is that you can use formats to create different analyses without adding or changing variables.
With a data set read with the yymmdd informat for your data take a look at differences in output from similar programs:
Proc freq data=have;
tables Date_Var ;
format Date_var mmddyy10.;
run;
Proc freq data=have;
tables Date_Var ;
format Date_var weekday.;
run;
Proc freq data=have;
tables Date_Var ;
format Date_var year.;
run;
Proc freq data=have;
tables Date_Var ;
format Date_var yymon.;
run;
And you get similar behaviors with most analysis programs where groups are defined by the format.