- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I've searched the sas community and have tried changing the format of my where statement and can't figure out why sas won't recognize the rows with the date '201801'. I'm a beginner coder and suspect its my lack of fully understanding date formats that is creating the problem.
So I have a date variable 'dob' which is formatted mm/dd/yyyy. I create a new variable 'birthmo' from the dob variable to create a date with the format yyyymm. The variable is created correctly (or so I think) and when I do a frequency on birthmo it shows 44k occurences of 201801. But when I try and filter the data using 'where birthmo = '201801'd;' I get this error:
ERROR: Invalid date/time/datetime constant '201801'd.
ERROR: Syntax error while parsing WHERE clause.
Or if I change the code to read 'where birthmo = 201801;' it comes up with an empty dataset.
CODE:
data cohort ;
set m.mpcf_201801 m.mpcf_201802 m.mpcf_201803;
birthmo = dob;
FORMAT birthmo yymmn6.;
run;
data cohort2;
set cohort;
where birthmo = '201801'd;
run;
Sample DATA:
individual id | dob |
1345 | 10/20/2015 |
315345 | 6/5/2018 |
465 | 9/28/2010 |
11354 | 1/1/2018 |
1341 | 1/23/2018 |
352 | 1/24/2018 |
4652 | 2/8/2013 |
435 | 5/14/2009 |
122 | 1/18/2018 |
1253 | 6/19/2004 |
463 | 7/10/2016 |
78986 | 1/22/2018 |
25123 | 1/2/2012 |
Thank you for your help!
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Correcting your Where statement(methinks the following should do)
where put(birthmo,yymmn6. -l) = '201801';
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Correcting your Where statement(methinks the following should do)
where put(birthmo,yymmn6. -l) = '201801';
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
It's like magic! No need to create the birthmo variable either! Thank you!
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Yes, I have taken trainings regarding dates and I realize how they are stored, but for some reason I can't seem to apply that knowledge to writing code. Once I think I understand, I come across a problem, like today. I'll keep trying to consume the literature and hopefully it will click one day.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Literal values must be in DATE9 or DATE7 appearance:
'01JAN2019'D or '01JAN19'd , a syntax rule because there are so many random collections of digits masquerading as dates. Did you that today , 8 Nov 2019 could be 2019312? or even worse 19312?
AND they must be a complete date, day month and year.
Or explicitly use the formatted value
Where put(birthmo,yymmn6.) = '201810';
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Using the
'xxxxxxx'd or "xxxxxxx"d
is the SAS convention for specifying a date literal (like 5 is a numeric literal and "abc" is a character literal).
But there is only one format allowed in a date literal. In your case you would want
where '01jan2010'd <= dob <='31jan2010'd;
The important thing to understand is that is doesn't matter at all what the format of DOB is - the underlying numeric value (number of days after 01jan1060) is the same - it's just displayed differently according to the assigned format of DOB. So specifying the date-literals in the where statement as I have merely inserts the number of days after jan 1, 1960 that represent the dates 01jan2010 and 31jan2010.
You could also have
where year(dob)=2018 and month(dob)=1;
which may be much more intuitive, but requires executing two functions for every observation. Using the date literal bounds requires no such overhead - the actual numeric values are inserted once and for all when your program is being compiled, prior to execution.
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set
Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets
--------------------------