BookmarkSubscribeRSS Feed
Allaluiah
Quartz | Level 8

Hi, I'm trying to generate a market product penetration report to see what customers in my data who principally buy a product and also buy other products along with it. For example, below is report I want to generate and I want to establish the item penetration of a customer. In the given example, out of 20 customers who bought beer, 5 bought crisps, 3 bought fries, 1 bought soda, 1 bought mineral water and 1 bought fast food. Can anybody please suggest me a procedure or a way to generate this kind of a report?

  Customers Beer Crisps Fries Soda Mineral Water Fast food
Beer 20   5 3 1 1 1
Crisps 35 5   1 8 7 4
Fries 32 9 2   20 2 15
Soda 50 15 8 5   2 20
Mineral Water 40 1 10 3 5   6
Fast food 30 5 18 12 24 4  
5 REPLIES 5
Allaluiah
Quartz | Level 8

Hi, Please ignore my previous post as it was not explained properly, here are the details:

Here is my input data:

Product_code Product_name Allied_product Customer_penetration
1 Beer Crisps 8872
1 Beer Fries 337
1 Beer Soda 25503
1 Beer Mineral water 56915
2 Crisps Fries 599
2 Crisps Soda 25230
2 Crisps Mineral water 24801
3 Fries Soda 983
3 Fries Mineral water 1244
4 Soda Mineral water 98661

 

And here is my WANTED OUTPUT:

Product_code Product_Name Beer Crisps Fries Soda Mineral_water
1 Beer   8872 337 25503 56915
2 Crisps 8872   599 25230 24801
3 Fries 337 599   983 1244
4 Soda 25503 25230 983   98661

 

Can anybody help?

MikeZdeb
Rhodochrosite | Level 12

Hi, PROC TRANSPOSE gets you close, but TRANSPOSE will only put each value of the variable "Customer_penetration" into one cell. There are most likely shorter ways, but this works (shortened the variable names, I can't type) ...


data x;
input pc pn :$15. ap &$15. cp;
datalines;
1 Beer Crisps    8872
1 Beer Fries     337
1 Beer Soda     25503
1 Beer Mineral water    56915
2 Crisps Fries    599
2 Crisps Soda    25230
2 Crisps Mineral water    24801
3 Fries Soda    983
3 Fries Mineral water    1244
4 Soda Mineral water    98661
;

 

data y (keep=product_: BEER CRISPS FRIES SODA MINERAL_WATER);
array y(5,5) _temporary_;
array p(5) $15 ('BEER' 'CRISPS' 'FRIES' 'SODA' 'MINERAL WATER');

array c(5) BEER CRISPS FRIES SODA MINERAL_WATER;
set x end=last;
i = whichc(upcase(pn), of p(*)); j = whichc(upcase(ap), of p(*));
y(i,j) = cp; y(j,i) = cp;
if last then
   do product_number=1 to 4;
      product_name = p(product_number);
   do j=1 to 5;
      c(j) = y(product_number,j);
   end;
   output;
   end;
run;


data set Y ...
product_    product_                                         MINERAL_
 number       name       BEER    CRISPS    FRIES     SODA      WATER

    1        BEER           .      8872     337     25503      56915
    2        CRISPS      8872         .     599     25230      24801
    3        FRIES        337       599       .       983       1244
    4        SODA       25503     25230     983         .      98661

 

PROC TRANSPOSE gets you close if you muck around with your data ...


data xx (drop=zz);
set x;
output;
zz = ap; ap = pn; pn = zz;
output;
run;

 

proc sort data=xx;
by pn;
run;

 

proc transpose data=xx out=yy (drop=_name_);
by pn;
id ap;
var cp;
run;

 

data set YY ...

                                                     Mineral_

Obs    pn               Crisps    Fries     Soda      water      Beer

 1     Beer               8872      337    25503      56915         .
 2     Crisps                .      599    25230      24801      8872
 3     Fries               599        .      983       1244       337
 4     Mineral water     24801     1244    98661          .     56915
 5     Soda              25230      983        .      98661     25503

ChrisNZ
Tourmaline | Level 20

Like this?

 

data HAVE(index=(PN));
  input PC PN :$15. AP &$15. CP;
  cards; 
  1 Beer Crisps    8872 
  1 Beer Fries     337 
  1 Beer Soda     25503 
  1 Beer Mineral water    56915 
  2 Crisps Fries    599 
  2 Crisps Soda    25230 
  2 Crisps Mineral water    24801 
  3 Fries Soda    983 
  3 Fries Mineral water    1244 
  4 Soda Mineral water    98661 
run;
data X;
  set HAVE
      HAVE(rename=(AP=PN PN=AP));            %* get 2nd set of data for matrix bottom;
  set HAVE(keep=PN PC) key=PN;               %* get correct PC ;
  if _IORC_ then do; _ERROR_=0; delete; end; %* delete if no PC;
run;
proc tabulate;
 class PC PN AP;
 var CP;
 table PC=''*PN='', AP=''*CP=''*sum=''*f=8.0;
run;

 

 

 

  BeerCrispsFriesMineral waterSoda
1Beer.88723375691525503
2Crisps8872.5992480125230
3Fries337599.1244983
4Soda255032523098398661.

 

Ironic than pasting from the SAS output window yields:

 

Your post has been changed because invalid HTML was found in the message body. The invalid HTML has been removed. Please review the message and submit the message when you are satisfied.

 

 

 

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1314 views
  • 0 likes
  • 3 in conversation