I am trying to create a subset of an entire dataset based on year. I've read that I need to specify the date format date9. first in order to do so. However, I'm not so certain. I would appreciate help with the code I should use. Thank you.
@igsteo wrote:
I am trying to create a subset of an entire dataset based on year. I've read that i need to specify the date format date9. first in order to do so. However, i'm not so certain. I appreciate the help as to the code i should use. Thank you
The whole idea of subsetting a data set by year should be avoided (except in rare cases). You can do almost any analysis you want without this subsetting. You can use a BY statement and obtain the analysis for each year separately. To do this, you will need a variable in the data set that contains the YEAR. You can also use a WHERE statement to limit the analysis to a specific year.
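As a rough sketch of both approaches, assuming a data set named have with a numeric SAS date variable date and a hypothetical analysis variable amount (neither the variable name amount nor the choice of PROC MEANS comes from the original question):

```sas
/* Assumption: HAVE contains a numeric SAS date DATE and a
   hypothetical numeric variable AMOUNT to be analyzed. */

/* Add a YEAR variable so it can be used in a BY statement */
data have_year;
  set have;
  year = year(date);
run;

/* BY-group processing requires the data to be sorted by YEAR */
proc sort data=have_year;
  by year;
run;

/* One set of summary statistics per year, no subsetting needed */
proc means data=have_year;
  by year;
  var amount;
run;

/* Alternatively, limit a single analysis to one year with WHERE */
proc means data=have;
  where year(date) = 2017;
  var amount;
run;
```

Either way the original data set stays intact, and no per-year copies need to be created or maintained.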
You do NOT need to reformat a date variable to select a subset based on year. The format does not change the underlying value, and it is the value that you have to filter on. Compare the date variable in your data set to two date constants, namely January 1 and December 31 of the year you want.
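A small illustration of that point, using a made-up date value: the same stored number is printed three ways, and only its appearance changes.

```sas
data _null_;
  d = '15mar2017'd;   /* stored as a number of days since January 1, 1960 */
  /* same value shown raw, with DATE9., and with MMDDYY10. */
  put d= d=date9. d=mmddyy10.;
run;
```

The unformatted PUT shows the raw day count, while the formatted ones show a readable date. A comparison in a WHERE statement always works on the raw value, whichever format is attached.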
You can specify a date constant in SAS using the syntax of a date literal (i.e. 'ddMONyyyy'd):
data want;
set have;
where '01jan2017'd <= date <= '31dec2017'd;
run;
You may get suggestions to use a year function, as in:
data want;
set have;
where year(date) = 2017;
run;
True, the syntax is simpler, but it involves more work, because SAS must apply the YEAR function to every date value before the comparison, whereas the date literal approach compares the stored values directly.
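If the year needs to change from run to run, the date literal approach can still be used with a macro variable. This is only a sketch; the macro variable name yr is an assumption, not something from the original question.

```sas
%let yr = 2017;   /* hypothetical macro variable holding the year */

data want;
  set have;
  /* double quotes are needed so that &yr resolves inside the literal */
  where "01jan&yr"d <= date <= "31dec&yr"d;
run;
```

Note the double quotes: macro variables do not resolve inside single quotes, so '01jan&yr'd would not work.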
Here are some examples of the syntax for the most common other types of SAS literals:
data _null_;
x=1; /*Numeric literal */
y='A'; /*Character literal*/
tim='13:30:21.123't; /*Time literal, but don't forget to format it to be readable*/
dattime='18jan2003:9:27:05.123'dt; /*Don't forget to format for readability*/
put x= y= tim=time12.3 dattime=datetime22.3;
run;
Before going too far: how do you want to use the subset?
What is the current format assigned to your date variable?
If you don't know how to check the existing format, run:
proc contents data=yourdatasetname;
run;
Your picture makes me suspect that your dates may be stored as character values, which would mean additional work.
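If that turns out to be the case, the character values have to be converted to numeric SAS dates before any of the comparisons above will work. A minimal sketch, assuming a hypothetical character variable date_char whose values look like 15MAR2017 (the right informat depends on how your text is actually laid out):

```sas
data have_num;
  set have;
  /* Assumption: DATE_CHAR holds text such as 15MAR2017.
     Use a different informat if the text has another layout. */
  date_num = input(date_char, date9.);
  format date_num date9.;   /* display only; the value stays numeric */
run;

/* The numeric date can then be filtered with date literals */
data want;
  set have_num;
  where '01jan2017'd <= date_num <= '31dec2017'd;
run;
```

Values that do not match the informat produce missing dates and a note in the log, so it is worth checking the log after the conversion step.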