Hi SAS Users,
I have a road block in conduct a project. The problem is how to compare observations within groups. My data set has a column named STORE_ID, which is the ID number of each store in a city. Another column is customer_id, obveriously it is ID of each consumed customer. I want to know the list of Customer_id that consumed not only in store A, but also in store B. I think correlated sub-query might be a possible solution, but don't know how to develop code. Any suggestions and sample codes are highly appreciated.
Thanks,
Harry
Hi, how about INTERSECT in SQL (works for 2 stores , works for 'n' stores) ...
data x;
input store_id :$1. customer_id @@;
datalines;
A 123 B 234 A 789 B 123 A 999 A 234 B 765 C 123
;
title 'CUSTOMERS COMMON TO STORES A & B';
proc sql;
select customer_id from x where store_id = 'A'
intersect
select customer_id from x where store_id = 'B';
quit;
title 'CUSTOMERS COMMON TO STORES A & B & C;
proc sql;
select customer_id from x where store_id = 'A'
intersect
select customer_id from x where store_id = 'B'
intersect
select customer_id from x where store_id = 'C';
quit;
CUSTOMERS COMMON TO STORES A & B
customer_id
123
234
CUSTOMERS COMMON TO STORES A & B & C
customer_id
123
Post some test data (in the form of a datastep) and required output.
If you are only concerned with those two stores, here's a way:
proc sort data=have;
by customer_id;
run;
data a_only b_only both;
merge have (where=(store_id='A') in=shopped_at_a)
have (where=(store_id='B') in=shopped_at_b);
by customer_id;
if shopped_at_a=0 then output b_only;
else if shopped_at_b=0 then output a_only;
else output both;
run;
Note that this doesn't account for the possibility that a customer shopped more than one time at a store. There are ways to handle that, but we'd need a better picture of what is in your data and what you are trying to achieve.
Good luck.
Hi, how about INTERSECT in SQL (works for 2 stores , works for 'n' stores) ...
data x;
input store_id :$1. customer_id @@;
datalines;
A 123 B 234 A 789 B 123 A 999 A 234 B 765 C 123
;
title 'CUSTOMERS COMMON TO STORES A & B';
proc sql;
select customer_id from x where store_id = 'A'
intersect
select customer_id from x where store_id = 'B';
quit;
title 'CUSTOMERS COMMON TO STORES A & B & C;
proc sql;
select customer_id from x where store_id = 'A'
intersect
select customer_id from x where store_id = 'B'
intersect
select customer_id from x where store_id = 'C';
quit;
CUSTOMERS COMMON TO STORES A & B
customer_id
123
234
CUSTOMERS COMMON TO STORES A & B & C
customer_id
123
Try this.
data have;
input Store_id $ Customer_ID;
datalines;
CAL 11542
CAL 11526
CHI 10069
CHI 11433
CHI 11542
GFC 11493
GFC 11526
GFC 10068
KAN 11482
KAN 11493
;
proc sql;
create table store as
select * from have group by customer_id
having count(distinct store_id)>1;
quit;
proc transpose data=store out=want(drop=_Name_);
by customer_id;
var store_id;
run;
proc print data=want;
run;
A more general approach might be just what you need.
proc sort data=have (keep=customer_id store_id) out=full_list nodupkey;
by customer_id store_id;
run;
data full_list;
set full_list;
shopped_there='Y';
run;
proc transpose data=full_list out=want;
by customer_id;
id store_id;
var shopped_there;
run;
It assumes that the store names form valid variable names in SAS, and gives you a "Y" for every store where a customer shopped. You can use that for any combination of subsetting going forward.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.