Forgive me if I've posted this in the wrong forum and please suggest a more appropriate site as needed.
I have some customer sales data. I'll keep it simple, two variables: CUSTOMER and ITEM. ITEM is a code specifying what the CUSTOMER purchased during their visit. Each CUSTOMER will have at least one ITEM. ITEM is unique within each CUSTOMER.
What I'd like to know is, for each ITEM, how often each of the other ITEMs gets purchased by the same CUSTOMER, expressed as a count or a percent.
Is there a way to do this for the entire dataset in one shot using a PROC or do I have to do it one ITEM at a time?
My product set is BASE SAS only, so I hope this doesn't tie my hands too much.
This is a straightforward cluster analysis problem from data mining, and it is supported in SAS Enterprise Miner.
If you get SAS/IML Studio (part of SAS/IML), you could use its R interface to get to the clustering algorithms in R.
In Base SAS, one approach with a relatively limited number of ITEMs would be to do a TRANSPOSE by CUSTOMER (to get one record per customer), follow up with FREQ to get all the two-way combinations, then save the outputs and order them by percent.
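For what it's worth, here is a minimal sketch of a different Base SAS route to the same counts: a PROC SQL self-join on CUSTOMER instead of the TRANSPOSE plus FREQ steps described above. It assumes the data sit in a dataset called WORK.SALES (a hypothetical name) with one row per CUSTOMER/ITEM pair, as the question describes. The output names ITEM_TOTALS, ITEM_PAIRS, BASE_ITEM, OTHER_ITEM, N_BOTH and PCT_OF_BASE are made up for illustration.

```sas
/* Assumption: WORK.SALES is a hypothetical dataset with the variables  */
/* CUSTOMER and ITEM, one row per CUSTOMER/ITEM pair.                   */
proc sql;
  /* Customers per item. Because ITEM is unique within CUSTOMER,        */
  /* counting rows is the same as counting customers.                   */
  create table item_totals as
  select item,
         count(*) as n_item
  from sales
  group by item;

  /* Self-join on CUSTOMER: pair each item with every other item        */
  /* bought by the same customer, then count customers per pair.        */
  create table item_pairs as
  select a.item as base_item,
         b.item as other_item,
         count(*) as n_both,
         count(*) / t.n_item as pct_of_base format=percent8.1
  from sales as a
       inner join sales as b
         on a.customer = b.customer
        and a.item ne b.item
       inner join item_totals as t
         on a.item = t.item
  group by a.item, b.item, t.n_item
  order by base_item, n_both desc;
quit;
```

Each row of ITEM_PAIRS should then read as "of the customers who bought BASE_ITEM, N_BOTH (PCT_OF_BASE) also bought OTHER_ITEM". Every pair shows up in both directions, since the percent depends on which item is taken as the base. The self-join can get large if customers buy many items, so it may be worth trying on a subset first.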