DATA Step, Macro, Functions and more

a SQL question

Reply
Frequent Contributor
Posts: 77

a SQL question

Suppose I have a dataset containing the info about product types each customer purchased like this:

/pre

Customer  Product_Type

C1            A

C1            B

C1            C

C2            D

C3            B

C3            E

C4            B

I want to know which customers only purchase products of type B. Is it possible to get this using one PROC SQL?

PROC Star
Posts: 7,486

a SQL question

Marc,

There may be an easier way of expressing it, but I think that the following would do what you want:

proc sql;

  select distinct customer

    from have

      group by customer

        having min(Product_Type) eq "B" and

               max(Product_Type) eq "B"

  ;

quit;

Trusted Advisor
Posts: 1,301

a SQL question

DATA input(SORTEDBY=customer);

INFILE cards;

INPUT customer $ product_type $;

CARDS;

C1            A

C1            B

C1            C

C2            D

C3            B

C3            E

C4            B

;

RUN;

/* using normal datastep */

DATA output(KEEP=customer);

SET input;

BY customer;

IF FIRST.customer THEN

  DO;

   b_buyer=0;

   other_buyer=0;

  END;

  IF product_type='B' THEN b_buyer+1;

  IF product_type NE 'B' THEN other_buyer+1;

IF LAST.customer and b_buyer>0 and other_buyer=0 THEN OUTPUT;

RUN;

/* with proc sql */

PROC SQL;

CREATE TABLE output2 as

SELECT customer /* not sure that having the distinct keyword here is necessary, in this limited case it is not */

   FROM input

  GROUP BY customer

HAVING MIN(product_type='B') and MAX(product_type='B');

QUIT;

SAS Employee
Posts: 104

Re: a SQL question

If your data is large, it could be slightly more efficient to execute only one function.  You could accomplish the task something like this:

/*Make some bigger data to play with*/
DATA Transactions(SORTEDBY=customer);
   /* Customer one has only one transaction */
   Customer=1;
   Product_type='B';
   output;
   do Customer=2 to 10000;
   /* Customers 2-10000 may have several transactions */
      do i=1 to 100;
        select (round(RANUNI(0)*4));
           when (1) Product_type='A';
           when (2) Product_type='B';
           when (3) Product_type='C';
           when (4) Product_type='D';
           otherwise Product_type='';
         end;
   /* Customers 100,1000 and 10000 will only buy product B */
         if Customer in (100,1000,10000) then Product_type='B';
         if Product_type ne '' then output;
      end;
   end;
RUN;

/* Get the unique ID for customers who bought only product_type B */
/* Should include customers 1, 100, 1000 and 10000 */
PROC SQL;
CREATE TABLE B_Customers as
SELECT customer
   FROM Transactions a
   group by customer
   having sum(product_type <> 'B')=0
;
QUIT;

Select distinct is not required in this case.

Super User
Posts: 10,041

Re: a SQL question

DATA input;
INFILE cards;
INPUT customer $ product_type $;
CARDS;
C1            A
C1            B
C1            C
C2            D
C3            B
C3            E
C4            B
C4            B
;
RUN;
proc sql;
 select distinct customer
  from input 
   group by customer
    having count(distinct product_type) eq 1 and product_type eq 'B';
quit;

Ksharp

Ask a Question
Discussion stats
  • 4 replies
  • 164 views
  • 0 likes
  • 5 in conversation