BookmarkSubscribeRSS Feed
robulon
Quartz | Level 8

Hi,

I'm trying to use a where statement on SAS 9.4 using two date variables but the code is not performing in the way I would expect it to.

I want to retain only the obs where date1 + 2 is less than date2 using the code

date1 + 2 < date2

but when I run the code, outputs include obs where date1 is higher than date2. I have also tried peforming the initial calculation in brackets in case that made any difference

(date1 + 2) < date2

but the outputs are still the same.

If I put the same code in an if statement in the same data step it works fine. I realise I could just use this but I would like to understand why the where statement doesn't work. I've done quite a bit of searching online but can't find anything that might explain it so hoping someone might be able to point me in the right direction.

Many thanks

 

8 REPLIES 8
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Please post example test data in the form of a datastep, you can get this by following this post:

https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat...

 

Then please show your code (just the relevant part) by posting it as text into a code window - which can be opened using the {i} above the post area.

 

Kurt_Bremser
Super User

When you posted your question, you saw this below the posting window:

 

Stop right there! Before pressing POST, tick off this checklist. Does your post …

✔ Have a descriptive subject line, i.e., How do I ‘XYZ’? ✔ Use simple language and provide context? Definitely mention what version you’re on. ✔ Include code and example data? Consider using the SAS Syntax feature.

 

 

 

 

 

Please note the third section. "Code" is more than just a tiny snippet that does not even constitute one SAS statement.

And add the log of the step, as it will contain pertinent information.

robulon
Quartz | Level 8

Apologies, I had worked through the check lst and hoped what I'd provided would have been sufficient for someone to provide a view.

 

The issue I have is that this is a work based piece of code that accesses sensitive data so I am unable to provide the full data step and log.

 

I've edited both the data step and the log but hopefully the gist is still there

 

data testing;

set ####;

where location_code not in ('101020' '101030' '101040' '151010' '151030' '501030')

and work_queue not in ('L05' 'L20' 'L21' 'L22' 'L23')

and bal > 500

and cycles > 0

and (hold_date + 2) < batch_date

and (pending_hold_date + 2) < batch_date

and substr(mobile1,1,2) = '07' or substr(mobile2,1,2) = '07' or substr(mobile3,1,2) = '07'

and dob ne .;

run;

 

NOTE: There were 10674 observations read from the data set ####.

      WHERE (location_code not in ('101020', '101030', '101040', '151010', '151030',

      '501030') and work_queue not in ('L05', 'L20',

      'L21', 'L22', 'L23', 'L30', 'L31', 'P20', 'Q11', 'Q26', 'S28', 'S34', 'S36', 'S37', 'S38',

      'S61') and (bal>500) and (cycles>0) and (batch_date>(hold_date+2)) and

      (batch_date>(pending_hold_date+2)) and (SUBSTR(mobile1, 1, 2)='07')) or

      (SUBSTR(mobile2, 1, 2)='07') or ((SUBSTR(mobile3, 1, 2)='07') and (dob not = .));

NOTE: The data set WORK.TESTING has 10674 observations and 13 variables.

NOTE: Compressing data set WORK.TESTING decreased size by 44.76 percent.

      Compressed is 116 pages; un-compressed would require 210 pages.

NOTE: DATA statement used (Total process time):

      real time           9.78 seconds

      cpu time            1.28 seconds

 

I've attached an extract of the ouputs

 

Thanks again

 

 


Sample data.PNG
Kurt_Bremser
Super User

I abbreviated your code with your data:

data have;
informat pending_hold_date hold_date batch_date date9.;
format pending_hold_date hold_date batch_date date9.;
input pending_hold_date hold_date batch_date;
cards;
. 20-apr-17 29-apr-17
5-may-17 5-may-17 29-apr-17
4-jul-17 4-jul-17 29-apr-17
. 4-may-17 29-apr-17
6-may-17 29-apr-17 29-apr-17
;
run;

data want;
set have;
where
(hold_date + 2) < batch_date
and
(pending_hold_date + 2) < batch_date
;
run;

proc print data=want noobs;
run;

According to the conditions, only obs 1 will be output.

What output did you expect?

robulon
Quartz | Level 8

Thanks Kurt and apologies again for my ignorance.

 

The data extract I'd attached to my message was the outputs from the code so as per your response, what I was expecting was there would only be the 1 ob where the hold_date was 20-apr-17 and the batch_date was 29-apr-17, but my code is outputting all of those obs, including those that don't meet my criteria.

 

Whilst I wouldn't consider my to be anywhere near an expert SAS programmer, I am reasonably competent and would have expected only those obs that meet all the criteria specified in the where statement to be returned in the output but wasn't sure if there was some sort of quirk that I wasn't aware of that means when you are combining character, numeric and date values (which I know are stored numerically) in the same where statement. then SAS may behave differently?

Kurt_Bremser
Super User

It looks like the culprit is not the date comparison, but your use of and and or in the condition:

where
(
location_code not in ('101020' '101030' '101040' '151010' '151030' '501030')
and work_queue not in ('L05' 'L20' 'L21' 'L22' 'L23')
and bal > 500
and cycles > 0
and (hold_date + 2) < batch_date
and (pending_hold_date + 2) < batch_date
and substr(mobile1,1,2) = '07'
)
 or substr(mobile2,1,2) = '07' or
)
substr(mobile3,1,2) = '07'
and dob ne .
)
;

See where I added additional brackets that clarify where the "and before or" priority of boolean logic evaluation kicks in.

So all observations that satisfy substr(mobile2,1,2) = '07' will be included. All other conditions are basically overridden by that.

robulon
Quartz | Level 8

That's fantastic, thanks Kurt.

 

One of those things where as soon as you read the solution, you fel like an idiot for not spotting it yourself!

 

Thanks again,

Rob

Kurt_Bremser
Super User

What you probably wanted was

where
  location_code not in ('101020' '101030' '101040' '151010' '151030' '501030')
  and work_queue not in ('L05' 'L20' 'L21' 'L22' 'L23')
  and bal > 500
  and cycles > 0
  and (hold_date + 2) < batch_date
  and (pending_hold_date + 2) < batch_date
  and (
    substr(mobile1,1,2) = '07'
    or substr(mobile2,1,2) = '07'
    or substr(mobile3,1,2) = '07'
  )
  and dob ne .
;

Note how I use brackets and indentation to visualize the logic.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 8 replies
  • 995 views
  • 1 like
  • 3 in conversation