Hi,
I'm trying to use a where statement on SAS 9.4 using two date variables but the code is not performing in the way I would expect it to.
I want to retain only the obs where date1 + 2 is less than date2 using the code
date1 + 2 < date2
but when I run the code, outputs include obs where date1 is higher than date2. I have also tried peforming the initial calculation in brackets in case that made any difference
(date1 + 2) < date2
but the outputs are still the same.
If I put the same code in an if statement in the same data step it works fine. I realise I could just use this but I would like to understand why the where statement doesn't work. I've done quite a bit of searching online but can't find anything that might explain it so hoping someone might be able to point me in the right direction.
Many thanks
Please post example test data in the form of a datastep, you can get this by following this post:
Then please show your code (just the relevant part) by posting it as text into a code window - which can be opened using the {i} above the post area.
When you posted your question, you saw this below the posting window:
Stop right there! Before pressing POST, tick off this checklist. Does your post …
✔ Have a descriptive subject line, i.e., How do I ‘XYZ’? | ✔ Use simple language and provide context? Definitely mention what version you’re on. | ✔ Include code and example data? Consider using the SAS Syntax feature. |
Please note the third section. "Code" is more than just a tiny snippet that does not even constitute one SAS statement.
And add the log of the step, as it will contain pertinent information.
Apologies, I had worked through the check lst and hoped what I'd provided would have been sufficient for someone to provide a view.
The issue I have is that this is a work based piece of code that accesses sensitive data so I am unable to provide the full data step and log.
I've edited both the data step and the log but hopefully the gist is still there
data testing;
set ####;
where location_code not in ('101020' '101030' '101040' '151010' '151030' '501030')
and work_queue not in ('L05' 'L20' 'L21' 'L22' 'L23')
and bal > 500
and cycles > 0
and (hold_date + 2) < batch_date
and (pending_hold_date + 2) < batch_date
and substr(mobile1,1,2) = '07' or substr(mobile2,1,2) = '07' or substr(mobile3,1,2) = '07'
and dob ne .;
run;
NOTE: There were 10674 observations read from the data set ####.
WHERE (location_code not in ('101020', '101030', '101040', '151010', '151030',
'501030') and work_queue not in ('L05', 'L20',
'L21', 'L22', 'L23', 'L30', 'L31', 'P20', 'Q11', 'Q26', 'S28', 'S34', 'S36', 'S37', 'S38',
'S61') and (bal>500) and (cycles>0) and (batch_date>(hold_date+2)) and
(batch_date>(pending_hold_date+2)) and (SUBSTR(mobile1, 1, 2)='07')) or
(SUBSTR(mobile2, 1, 2)='07') or ((SUBSTR(mobile3, 1, 2)='07') and (dob not = .));
NOTE: The data set WORK.TESTING has 10674 observations and 13 variables.
NOTE: Compressing data set WORK.TESTING decreased size by 44.76 percent.
Compressed is 116 pages; un-compressed would require 210 pages.
NOTE: DATA statement used (Total process time):
real time 9.78 seconds
cpu time 1.28 seconds
I've attached an extract of the ouputs
Thanks again
I abbreviated your code with your data:
data have;
informat pending_hold_date hold_date batch_date date9.;
format pending_hold_date hold_date batch_date date9.;
input pending_hold_date hold_date batch_date;
cards;
. 20-apr-17 29-apr-17
5-may-17 5-may-17 29-apr-17
4-jul-17 4-jul-17 29-apr-17
. 4-may-17 29-apr-17
6-may-17 29-apr-17 29-apr-17
;
run;
data want;
set have;
where
(hold_date + 2) < batch_date
and
(pending_hold_date + 2) < batch_date
;
run;
proc print data=want noobs;
run;
According to the conditions, only obs 1 will be output.
What output did you expect?
Thanks Kurt and apologies again for my ignorance.
The data extract I'd attached to my message was the outputs from the code so as per your response, what I was expecting was there would only be the 1 ob where the hold_date was 20-apr-17 and the batch_date was 29-apr-17, but my code is outputting all of those obs, including those that don't meet my criteria.
Whilst I wouldn't consider my to be anywhere near an expert SAS programmer, I am reasonably competent and would have expected only those obs that meet all the criteria specified in the where statement to be returned in the output but wasn't sure if there was some sort of quirk that I wasn't aware of that means when you are combining character, numeric and date values (which I know are stored numerically) in the same where statement. then SAS may behave differently?
It looks like the culprit is not the date comparison, but your use of and and or in the condition:
where
(
location_code not in ('101020' '101030' '101040' '151010' '151030' '501030')
and work_queue not in ('L05' 'L20' 'L21' 'L22' 'L23')
and bal > 500
and cycles > 0
and (hold_date + 2) < batch_date
and (pending_hold_date + 2) < batch_date
and substr(mobile1,1,2) = '07'
)
or substr(mobile2,1,2) = '07' or
)
substr(mobile3,1,2) = '07'
and dob ne .
)
;
See where I added additional brackets that clarify where the "and before or" priority of boolean logic evaluation kicks in.
So all observations that satisfy substr(mobile2,1,2) = '07' will be included. All other conditions are basically overridden by that.
That's fantastic, thanks Kurt.
One of those things where as soon as you read the solution, you fel like an idiot for not spotting it yourself!
Thanks again,
Rob
What you probably wanted was
where
location_code not in ('101020' '101030' '101040' '151010' '151030' '501030')
and work_queue not in ('L05' 'L20' 'L21' 'L22' 'L23')
and bal > 500
and cycles > 0
and (hold_date + 2) < batch_date
and (pending_hold_date + 2) < batch_date
and (
substr(mobile1,1,2) = '07'
or substr(mobile2,1,2) = '07'
or substr(mobile3,1,2) = '07'
)
and dob ne .
;
Note how I use brackets and indentation to visualize the logic.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.