I've run into a situation where I'm not sure why some code is working:
I'll start by saying that pddate is in datetime20 format in the source:
I should also note that this is in SAS EG. Here is the query:
PROC SQL;
CREATE TABLE WANT AS
SELECT DATEPART(PDDATE) AS PDDATE FORMAT DATE9., [more variables that aren't relevant]
FROM SOURCE
WHERE PDDATE BETWEEN '01JAN2024'd AND '31JAN2024'd;
QUIT;
(I realize that naming the calculated variable the same as the original probably isn't the best practice, but please bear with me.)
My understanding is that if you are calculating a new variable in the SELECT statement, you can only filter by that variable in the HAVING statement, not the WHERE statement. So if I want PDDATE to be a date and not a datetime, I need to use it in a HAVING statement - it has not yet been converted to a date in the WHERE statement. Therefore, I would expect comparing PDDATE to a date (instead of a datetime) in the WHERE statement to result in an error. But it doesn't; I get the desired output, and I'm not sure why. Is my understanding of the interaction between a calculated variable and WHERE/HAVING incorrect? Is SAS EG fixing it behind the scenes? What am I missing here?
Unrelated question: What is the difference between labels and message tags when posting a question?
Which version of PDDATE did you want the WHERE clause to use?
If you want be sure the WHERE clause will use the version of PDDATE that your SELECT list created then add the CALCULATED keyword.
WHERE calculated PDDATE BETWEEN '01JAN2024'd AND '31JAN2024'd
If you want to be sure that the WHERE clause will use the original PDDATE then include the ALIAS. Since the original PDDATE is datetime you will want to include the DATEPART() function.
WHERE datepart(SOURCE.PDDATE) BETWEEN '01JAN2024'd AND '31JAN2024'd
Or perhaps use DATETIME constants instead. If the values of PDDATE include non-zero time of day parts then be careful.
WHERE '01JAN2024:00:00'dt <= SOURCE.PDDATE < '01FEB2024:00:00'dt
It should be the calculated version of PDDATE (i.e. the date, not datetime). Without the calculated keyword, the original (i.e. the datetime) is used, correct? So why doesn't comparing it to a date constant result in an error?
@osbornejo wrote:
It should be the calculated version of PDDATE (i.e. the date, not datetime). Without the calculated keyword, the original (i.e. the datetime) is used, correct? So why doesn't comparing it to a date constant result in an error?
Because both dates and datetimes are stored as numbers, so the comparison is syntactically valid. That the numbers are in vastly different ranges doesn't bother the SQL procedure.
@osbornejo wrote:
Is my understanding of the interaction between a calculated variable and WHERE/HAVING incorrect? Is SAS EG fixing it behind the scenes? What am I missing here?
You're missing the fact that it runs, but not correctly.
Both are numeric variables at the root and the comparison is run but you will not get correct data back. SAS stores dates/datetimes as numbers but the ranges are different. Dates are the the number of days from Jan 1, 1960 and datetimes is the number of seconds from Jan 1, 1960.
So the where range is valid when interpreted as a set of numbers but isn't correct for the comparison you want to make.
(I realize that naming the calculated variable the same as the original probably isn't the best practice, but please bear with me.)
This is an example where this is important because it makes it more difficult to trace the root issue.
As indicated by others, use the CALCULATED keyword to use the new variable in the WHERE statement.
@osbornejo wrote:
I've run into a situation where I'm not sure why some code is working:
I'll start by saying that pddate is in datetime20 format in the source:
I should also note that this is in SAS EG. Here is the query:
PROC SQL; CREATE TABLE WANT AS SELECT DATEPART(PDDATE) AS PDDATE FORMAT DATE9., [more variables that aren't relevant] FROM SOURCE WHERE PDDATE BETWEEN '01JAN2024'd AND '31JAN2024'd; QUIT;
(I realize that naming the calculated variable the same as the original probably isn't the best practice, but please bear with me.)
My understanding is that if you are calculating a new variable in the SELECT statement, you can only filter by that variable in the HAVING statement, not the WHERE statement. So if I want PDDATE to be a date and not a datetime, I need to use it in a HAVING statement - it has not yet been converted to a date in the WHERE statement. Therefore, I would expect comparing PDDATE to a date (instead of a datetime) in the WHERE statement to result in an error. But it doesn't; I get the desired output, and I'm not sure why. Is my understanding of the interaction between a calculated variable and WHERE/HAVING incorrect? Is SAS EG fixing it behind the scenes? What am I missing here?
Unrelated question: What is the difference between labels and message tags when posting a question?
You're missing the fact that it runs, but not correctly.
That's the thing - the results have dates that are in the desired range. I agree that the comparison shouldn't be working, which is why I'm so surprised by the results.
That's interesting. In a mockup test created that did NOT occur. Can you confirm the format of the original variable?
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.