Benn
Calcite | Level 5

Hi Guys,

I am very new to SAS. Hope to get some advice for the following.

I currently have decades of data, with dates in the format YYYYMMDD. On every date there are customers (e.g. A, B, C, D, E).

What I am trying to do is to extract the observations for each customer (A,B,C,D,E) for the last 2 days of each month.

I am not sure how to go about doing this in an efficient manner.

Thanks guys!

1 ACCEPTED SOLUTION

Haikuo
Onyx | Level 15

If you only care about calendar dates, here is an option:

data want;
  set have;
  /* keep rows dated on the last or second-to-last calendar day of the month */
  if Your_date = intnx('month',Your_date,0,'end') or
     Your_date = intnx('month',Your_date,0,'end')-1;
run;

Haikuo
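
If "last 2 days" instead means the last two dates that actually appear in the data for each customer in each month (for example, trading days), a calendar test alone will not find them. A rough sketch of one way to handle that case, assuming a customer variable named Customer (hypothetical, the thread never names it), that Your_date is a true SAS date, and that there is one row per customer per date. The names keyed, month_start and n are also made up for illustration:

```sas
/* Assumption: have contains Customer and Your_date (a SAS date value) */
data keyed;
  set have;
  /* first day of the month, used only as a grouping key */
  month_start = intnx('month', Your_date, 0, 'beginning');
run;

/* latest dates first within each customer-month */
proc sort data=keyed;
  by Customer month_start descending Your_date;
run;

data want;
  set keyed;
  by Customer month_start;
  if first.month_start then n = 0;  /* restart the counter for each customer-month */
  n + 1;                            /* sum statement: n is retained across rows */
  if n <= 2;                        /* keep the two most recent rows */
  drop n month_start;
run;
```

The output comes back with the later date first within each month, so it may need re-sorting. If a customer can have several rows on the same date, the counter would have to step on changes of Your_date rather than on every row.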


RichardinOz
Quartz | Level 8

If it is a large dataset you might want to use a where statement with a single inequality condition for efficiency, as in:

data want ;
  set have ;
  where Your_date > intnx('month',Your_date,0,'end') - 2 ;
run ;

The where statement can also be coded as a (where= ()) dataset option:

data want ;
  set have
       (where = (Your_date > intnx('month',Your_date,0,'end') - 2))
       ;
run ;
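
The filters above all assume Your_date holds a genuine SAS date value. The question only says the dates are "in the format of YYYYMMDD", so if the variable is really a plain number such as 20130130, INTNX would misinterpret it. A minimal sketch under that assumption, where sas_date is a hypothetical new variable:

```sas
data want;
  set have;
  /* Assumption: Your_date is numeric, e.g. 20130130, not a SAS date */
  sas_date = input(put(Your_date, 8.), yymmdd8.);
  format sas_date yymmddn8.;
  /* keep the last two calendar days of each month */
  if sas_date > intnx('month', sas_date, 0, 'end') - 2;
run;
```

If Your_date were stored as an 8-character string instead, input(Your_date, yymmdd8.) should be enough. Because sas_date is computed inside the step, a subsetting IF is used here; a WHERE statement only sees variables that already exist in the input dataset.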


MikeZdeb
Rhodochrosite | Level 12

hi ... not much difference in performance between IF and WHERE any more ...

"Performance in these examples is close enough that the choice of an IF statement versus a WHERE statement versus a WHERE option is arbitrary."

from "Efficiency Considerations Using the SAS System", Rick Langston, SAS Institute:

http://www2.sas.com/proceedings/sugi30/002-30.pdf

RichardinOz
Quartz | Level 8

Point taken, Mike.

But Rick goes on to say

"However, if the subsetting is to be performed using a LIBNAME engine against a database that is optimized for WHERE processing, then the WHERE choice is preferred."

This is usually the case for the data I work with.  I would only use a subsetting IF when the data step has to perform intermediate calculations before entire rows can be accepted or rejected.

My other objection is aesthetic.  The WHERE syntax conforms to SQL standards and is in my view more readable.  An IF statement without an explicit THEN can be confusing.  I would rather reverse the condition to make it more explicit:

    IF (reject condition is met) THEN DELETE ;
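
Applied to the filter in this thread, the two styles might look like the sketch below. The output name want_sql is made up for illustration, and the reject condition is simply the WHERE condition negated:

```sas
/* Reversed condition: reject every row dated before the last two calendar days */
data want;
  set have;
  if Your_date <= intnx('month', Your_date, 0, 'end') - 2 then delete;
run;

/* The same WHERE clause inside PROC SQL */
proc sql;
  create table want_sql as
  select *
  from have
  where Your_date > intnx('month', Your_date, 0, 'end') - 2;
quit;
```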

Haikuo
Onyx | Level 15

Not necessarily. Art and I ran some tests a while back, and in many cases 'if' was faster than 'where', with the benefit becoming more evident as the hit rate and the table size increase. I believe this is due to data step enhancements in sequential processing. Random access such as 'where' may have an edge when the hit rate is below 2-4%, and of course when there is an index.

Haikuo
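
Since the answer seems to depend on the table size, the hit rate and the engine, the simplest check is probably to time both forms on your own data. A minimal sketch, assuming the same have dataset and Your_date variable as above; want_if and want_where are hypothetical output names:

```sas
/* FULLSTIMER writes extended timing and resource statistics to the log */
options fullstimer;

data want_if;
  set have;
  if Your_date > intnx('month', Your_date, 0, 'end') - 2;
run;

data want_where;
  set have;
  where Your_date > intnx('month', Your_date, 0, 'end') - 2;
run;
```

Compare the real time and CPU time reported in the log for each step, ideally over a few runs, since caching can favour whichever step runs second.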
