BookmarkSubscribeRSS Feed
MarcTC
Obsidian | Level 7

Suppose I have two datasets like below:

CUSTOMER dataset:

customer_id, last_name

001, Natha

002, Cooper

003, David

VISIT dataset:

customer_id, visit_date

001, 01-MAR-2011

001, 03-APR-2011

001, 15-MAY-2011

002, 21-FEB-2011

I want to use one PROC SQL to produce the visit date count for each customer during APR-2011. Here is my query:

proc sql;

  create table report as

  select a.customer_id, count(distinct b.visit_date) as visit_cnt

  from customer a left join visit b on a.customer_id = b.customer_id

  where b.visit_date between '01apr2011'd and '30apr2011'd

  group by a.customer_id

; quit;

This produces:

001, 1

But the output I want is:

001, 1

002, 0

003, 0

Any suggestion for using one PROC SQL to produce the desired result?

8 REPLIES 8
art297
Opal | Level 21

I think that the following will provide what you want:

proc sql;

  create table report as

    select a.customer_id,

     max(count(distinct b.visit_date),0) as visit_cnt

       from customer a

         left join (select visit_date from visit b

           where b.visit_date between '01apr2011'd and '30apr2011'd)

             on a.customer_id = b.customer_id

               group by a.customer_id

;

quit;

SASJedi
SAS Super FREQ

This should do it for you:

proc sql;
   create table report as
   select c.customer_id
          , coalesce(Count,0) as visit_cnt
      from customer c
         left join 
        (SELECT customer_id, count(*) as Count
            from visit 
            where visit_date between '01apr2011'd and '30apr2011'd
            group by 1) v
      on c.customer_id=v.customer_id
; 
quit;
Check out my Jedi SAS Tricks for SAS Users
Ksharp
Super User

Haha.

It has been mentioned by Howles.You should use 'and' instead of 'where' statement which will filter the obs.

proc sql;

  create table report as

  select a.customer_id, count(distinct b.visit_date) as visit_cnt

  from customer a left join visit b on a.customer_id = b.customer_id

  and b.visit_date between '01apr2011'd and '30apr2011'd

  group by a.customer_id

; quit;

data customer;
input customer_id $ last_name $;
cards;
001 Natha
002 Cooper
003 David
;
run;
data VISIT  ;
input customer_id $ visit_date date12.;
format visit_date date9.;
cards;
001 01-MAR-2011
001 03-APR-2011
001 15-MAY-2011
002 21-FEB-2011
;
run; 

proc sql;
  create table report as
  select a.customer_id, count(distinct b.visit_date) as visit_cnt
  from customer a left join visit b on a.customer_id = b.customer_id
  and b.visit_date between '01apr2011'd and '30apr2011'd
  group by a.customer_id
; quit;

Ksharp

MarcTC
Obsidian | Level 7

Ksharp,

Thanks for pointing out this trick.

Actually should we say 'and' enables filtering on visit dataset while 'where' filers on the join results?

Would this query executed in ANSI SQL behave the same?

MarcTC
Obsidian | Level 7

Ksharp, Could you tell me which section of Howes' sqlbook tak about this trick?

http://www.howles.com/sqlbook/

art297
Opal | Level 21

Marc,

Howard mentions it in one of his posts to the forum.  Take a look at: http://communities.sas.com/message/57333#57333

art297
Opal | Level 21

I also should mention that the link you provided was only to the examples from Howard's book.  If you are going to be using SQL, I strongly recommend Howard's book.  He is a master at understanding how to simplify almost any proc sql code.  As for the other question you asked, other than the SAS dates, I would think that the code does indeed conform to ANSI SQL.

Ksharp
Super User

In the url Art.T gave ,There is already answered your question.

Where statement will filter the observation after left join,

'and' will filter the observations during left join ( but filtering will keep all the observation from left dataset, just as left join operation required).

So 'and' in left jion will keep all the observations from left dataset,but where statement will filter observations ,even if obs are from left dataset.

Ksharp

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 8 replies
  • 1694 views
  • 0 likes
  • 4 in conversation