10-17-2015 08:42 AM
Hi, I'm trying to generate a market product penetration report to see what customers in my data who principally buy a product and also buy other products along with it. For example, below is report I want to generate and I want to establish the item penetration of a customer. In the given example, out of 20 customers who bought beer, 5 bought crisps, 3 bought fries, 1 bought soda, 1 bought mineral water and 1 bought fast food. Can anybody please suggest me a procedure or a way to generate this kind of a report?
|Customers||Beer||Crisps||Fries||Soda||Mineral Water||Fast food|
10-18-2015 12:30 PM
Hi, Please ignore my previous post as it was not explained properly, here are the details:
Here is my input data:
And here is my WANTED OUTPUT:
Can anybody help?
10-18-2015 06:44 PM - edited 10-18-2015 09:13 PM
Hi, PROC TRANSPOSE gets you close, but TRANSPOSE will only put each value of the variable "Customer_penetration" into one cell. There are most likely shorter ways, but this works (shortened the variable names, I can't type) ...
input pc pn :$15. ap &$15. cp;
1 Beer Crisps 8872
1 Beer Fries 337
1 Beer Soda 25503
1 Beer Mineral water 56915
2 Crisps Fries 599
2 Crisps Soda 25230
2 Crisps Mineral water 24801
3 Fries Soda 983
3 Fries Mineral water 1244
4 Soda Mineral water 98661
data y (keep=product_: BEER CRISPS FRIES SODA MINERAL_WATER);
array y(5,5) _temporary_;
array p(5) $15 ('BEER' 'CRISPS' 'FRIES' 'SODA' 'MINERAL WATER');
array c(5) BEER CRISPS FRIES SODA MINERAL_WATER;
set x end=last;
i = whichc(upcase(pn), of p(*)); j = whichc(upcase(ap), of p(*));
y(i,j) = cp; y(j,i) = cp;
if last then
do product_number=1 to 4;
product_name = p(product_number);
do j=1 to 5;
c(j) = y(product_number,j);
data set Y ...
product_ product_ MINERAL_
number name BEER CRISPS FRIES SODA WATER
1 BEER . 8872 337 25503 56915
2 CRISPS 8872 . 599 25230 24801
3 FRIES 337 599 . 983 1244
4 SODA 25503 25230 983 . 98661
PROC TRANSPOSE gets you close if you muck around with your data ...
data xx (drop=zz);
zz = ap; ap = pn; pn = zz;
proc sort data=xx;
proc transpose data=xx out=yy (drop=_name_);
data set YY ...
Obs pn Crisps Fries Soda water Beer
1 Beer 8872 337 25503 56915 .
2 Crisps . 599 25230 24801 8872
3 Fries 599 . 983 1244 337
4 Mineral water 24801 1244 98661 . 56915
5 Soda 25230 983 . 98661 25503
10-19-2015 01:01 AM - edited 10-19-2015 01:05 AM
data HAVE(index=(PN)); input PC PN :$15. AP &$15. CP; cards; 1 Beer Crisps 8872 1 Beer Fries 337 1 Beer Soda 25503 1 Beer Mineral water 56915 2 Crisps Fries 599 2 Crisps Soda 25230 2 Crisps Mineral water 24801 3 Fries Soda 983 3 Fries Mineral water 1244 4 Soda Mineral water 98661 run; data X; set HAVE HAVE(rename=(AP=PN PN=AP)); %* get 2nd set of data for matrix bottom; set HAVE(keep=PN PC) key=PN; %* get correct PC ; if _IORC_ then do; _ERROR_=0; delete; end; %* delete if no PC; run; proc tabulate; class PC PN AP; var CP; table PC=''*PN='', AP=''*CP=''*sum=''*f=8.0; run;
Ironic than pasting from the SAS output window yields:
Your post has been changed because invalid HTML was found in the message body. The invalid HTML has been removed. Please review the message and submit the message when you are satisfied.