Help using Base SAS procedures

Compare observations within groups

Accepted Solution Solved
Reply
Contributor
Posts: 23
Accepted Solution

Compare observations within groups

Hi SAS Users,

I have a road block in conduct a project. The problem is how to compare observations within groups. My data set has a column named STORE_ID, which is the ID number of each store in a city. Another column is customer_id, obveriously it is ID of each consumed customer. I want to know the list of Customer_id that consumed not only in store A, but also in store B. I think correlated sub-query might be a possible solution, but don't know how to develop code. Any suggestions and sample codes are highly appreciated.

Thanks,

Harry 

 


Accepted Solutions
Solution
‎10-29-2015 02:30 PM
Valued Guide
Posts: 765

Re: Compare observations within groups

Hi, how about INTERSECT in SQL (works for 2 stores , works for 'n' stores) ...

 

data x;
input store_id :$1. customer_id @@;
datalines;
A 123 B 234 A 789 B 123 A 999 A 234 B 765 C 123
;

 

title 'CUSTOMERS COMMON TO STORES A & B';
proc sql;
select customer_id from x where store_id = 'A'
intersect
select customer_id from x where store_id = 'B';
quit;

 

title 'CUSTOMERS COMMON TO STORES A & B & C;
proc sql;
select customer_id from x where store_id = 'A'
intersect
select customer_id from x where store_id = 'B'
intersect
select customer_id from x where store_id = 'C';
quit;


CUSTOMERS COMMON TO STORES A & B

customer_id
        123
        234

 

CUSTOMERS COMMON TO STORES A & B & C

customer_id
        123

View solution in original post


All Replies
Super User
Posts: 5,257

Re: Compare observations within groups

I think that a group by with a having count(*) =2 clause would work, given that the store and customer the combined key.
Data never sleeps
Super User
Posts: 17,840

Re: Compare observations within groups

Do you only have two stores or multiple stores?
Super User
Super User
Posts: 7,405

Re: Compare observations within groups

Post some test data (in the form of a datastep) and required output.

Super User
Posts: 5,085

Re: Compare observations within groups

If you are only concerned with those two stores, here's a way:

 

proc sort data=have;

by customer_id;

run;

 

data a_only b_only both;

   merge have (where=(store_id='A') in=shopped_at_a)

              have (where=(store_id='B') in=shopped_at_b);

   by customer_id;

   if shopped_at_a=0 then output b_only;

   else if shopped_at_b=0 then output a_only;

   else output both;

run;

 

Note that this doesn't account for the possibility that a customer shopped more than one time at a store.  There are ways to handle that, but we'd need a better picture of what is in your data and what you are trying to achieve.

 

Good luck.

Contributor
Posts: 23

Re: Compare observations within groups

Thanks so much for everybody's suggestion. I have hundreds of stores and need to compare any interested stores (usually less than 4 stores). Attached is a sample of data set. Astounding's code looks good, but will be time consuming, since I have hundred stores.
Here is my data set,
Store_id Customer_ID
CAL 11542
CAL 11526
CHI 10069
CHI 11433
CHI 11542
GFC 11493
GFC 11526
GFC 10068
KAN 11482
KAN 11493

If I compare CAL with CHI, the result should be 11542. IF between CAL and GFC, the result should be 11526. If between GFC and KAN, the result should be 11493. If between CAL and KAN, result should be nothing.
Thanks,
Harry
Solution
‎10-29-2015 02:30 PM
Valued Guide
Posts: 765

Re: Compare observations within groups

Hi, how about INTERSECT in SQL (works for 2 stores , works for 'n' stores) ...

 

data x;
input store_id :$1. customer_id @@;
datalines;
A 123 B 234 A 789 B 123 A 999 A 234 B 765 C 123
;

 

title 'CUSTOMERS COMMON TO STORES A & B';
proc sql;
select customer_id from x where store_id = 'A'
intersect
select customer_id from x where store_id = 'B';
quit;

 

title 'CUSTOMERS COMMON TO STORES A & B & C;
proc sql;
select customer_id from x where store_id = 'A'
intersect
select customer_id from x where store_id = 'B'
intersect
select customer_id from x where store_id = 'C';
quit;


CUSTOMERS COMMON TO STORES A & B

customer_id
        123
        234

 

CUSTOMERS COMMON TO STORES A & B & C

customer_id
        123

Contributor
Posts: 23

Re: Compare observations within groups

Thanks everybody for various suggestions and helps. I found all code work very well for my project but this one is the most efficient. The only reason is that my data set is complex and there will be tons of comparison.
I appreciate everybody's time.
Thanks,
Harry
Trusted Advisor
Posts: 1,204

Re: Compare observations within groups

Try this.

 

data have;
input Store_id $ Customer_ID;
datalines;
CAL 11542
CAL 11526
CHI 10069
CHI 11433
CHI 11542
GFC 11493
GFC 11526
GFC 10068
KAN 11482
KAN 11493
;

proc sql;
create table store as
select * from have group by customer_id
having count(distinct store_id)>1;
quit;

 

proc transpose data=store out=want(drop=_Name_);
by customer_id;
var store_id;
run;

 

proc print data=want;
run;

Super User
Posts: 5,085

Re: Compare observations within groups

A more general approach might be just what you need.

 

proc sort data=have (keep=customer_id store_id) out=full_list nodupkey;

   by customer_id store_id;

run;

 

data full_list;

   set full_list;

   shopped_there='Y';

run;

 

proc transpose data=full_list out=want;

   by customer_id;

   id store_id;

   var shopped_there;

run;

 

It assumes that the store names form valid variable names in SAS, and gives you a "Y" for every store where a customer shopped.  You can use that for any combination of subsetting going forward.

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 9 replies
  • 558 views
  • 2 likes
  • 7 in conversation