Crosstab or ?

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 6
Accepted Solution

Crosstab or ?

[ Edited ]

Wondering how I can get my raw data to look like the final product I drafted.  Basically, I have distinct user IDs and product categories, and for each product combination I want the total number of distinct users.  How many user IDs only are present in category A, how many are A+B, A+C, B only, B+C, and so on... The end product would have a mirror image as A+B is the same as B+A...  Any help would be much appreciated!

 

 

raw data example  
    
userabc
1100
2101
3110
4101
    
    
final product example  
    
 abc
a112
b100
c200

Accepted Solutions
Solution
‎01-16-2016 08:52 PM
Grand Advisor
Posts: 17,396

Re: Crosstab or ?

Find one that works and go with that Smiley Wink If you have SAS Enterprise Miner it has a feature for MBA

View solution in original post


All Replies
Grand Advisor
Posts: 17,396

Re: Crosstab or ?

So you'll never see a row with all zero's or all one's?

 

I'm not actually sure how to do this, but I have an idea of how to start perhaps. I'm used to seeing it requested differently and I'll post a link to that solution at the bottom. It would count all occurences of a, ab, ac, bc etc... but that means there's double counting.

 

data step2;
set have;
array vs(*) a b c;
length row_name $2.;
do i=1 to dim(vs);
if vs(i)=1 then row_name=catt(row_name, vname(vs(i)));
end;

if lengthn(row_name)=1 then do; row1=row_name;row2=row_name; end;
else do; row1=substr(row_name, 1, 1); row2=substr(row_name, 2, 1); end;

keep id a b c row1 row2;
run;

proc freq data=step2;
table row1*row2/sparse;
run;

Here's the usual solution I see requested:

https://gist.github.com/statgeek/a5184a4e1678d81e2643

 

 

Occasional Contributor
Posts: 6

Re: Crosstab or ?

[ Edited ]
Theoretically, there could be a row of all 1's (not all 0s, that would mean no user). It's definitely not a distance matrix, but thanks for adding that (it does look similar). I will try out your example, but ideally I would not want the double counting. If a user only had product A, the 1 count would show up in the A x A cell, where if the user had A and B, the count would show up in the A x B (and/or B x A) cell but not also the A x A cell. It's a stumper, but I'm trying to do it as automated and simply as possible :\
Grand Advisor
Posts: 17,396

Re: Crosstab or ?

To get the rest of the combinations you can change my code above. The proc freq will generate the table you want. 

 

if lengthn(row_name)=1 then do; row1=row_name;row2=row_name; output; end;
else do; row1=substr(row_name, 1, 1); row2=substr(row_name, 2, 1); output;

 temp=row1; row1=row2; row2=temp; output; end;

Grand Advisor
Posts: 17,396

Re: Crosstab or ?

People with all 1's will break the code above, but I'll leave that as an exercise for you to fix Smiley Happy 

Occasional Contributor
Posts: 6

Re: Crosstab or ?

I would just add a dummy variable in Smiley Tongue. What exactly are row1 and row2 supposed to be? Sorry, sometimes translating a, b, c into the actual values makes it hard to follow... My A and B and C (and actually D, E, F, G) are actual product names...
Grand Advisor
Posts: 17,396

Re: Crosstab or ?

It doesn't matter what the names are. I called it row1,row2 but it's essentially the row/column identifier in the final output for the row. Is this row belongs in the ab/ba columns. If your names are longer or want to use the labels instead as, use CATX function and scan to separate them instead. 

 

Post better data sample if you want more code, preferably in a data step. 

Grand Advisor
Posts: 17,396

Re: Crosstab or ?

I'm also wondering if your attempting a market basket analysis. If so there's a macro available for Base SAS.
Occasional Contributor
Posts: 6

Re: Crosstab or ?

No, it's not a market basket analysis it's a cross-sell analysis. How many in one product group are also in another product group... so maybe there actually will have to be double counting as it's possible to be in more than 2 categories... Sorry, problem solving as we go.
Occasional Contributor
Posts: 6

Re: Crosstab or ?

Actually, the data I have (before I transposed it) does mirror the market basket example I found online. So, I think the market basket macro actually might do exactly what I need it to and be flexible. Do you have a favorite? Otherwise I'll use whatever I just found. Thanks so much for the suggestion!!
Solution
‎01-16-2016 08:52 PM
Grand Advisor
Posts: 17,396

Re: Crosstab or ?

Find one that works and go with that Smiley Wink If you have SAS Enterprise Miner it has a feature for MBA
☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 10 replies
  • 553 views
  • 0 likes
  • 2 in conversation