isas_iam
Fluorite | Level 6

My data is structured as follows: each row is one item bought by a customer.

Customer_ID, ITEM_ID

Adams, 18
Adams, 29
Adams, 30
Allen, 9
Allen, 27
Anderson, 24
Anderson, 26
Bailey, 7
Bailey, 30
Baker, 7
Baker, 10
Baker, 19
Baker, 31
Barnes, 10
Barnes, 21
Barnes, 22
Barnes, 31
...
etc

How can I use a SAS procedure (PROC CLUSTER? PROC FASTCLUS?) to cluster customers into distinct groups, say Group1 and Group2, based on the ITEM_IDs they bought?

 

I am using Base SAS 9.4 or SAS EG 6.1.

CSV data is attached.

 

THANKS.

 

3 REPLIES
Reeza
Super User

Based on your problem description I think this may be Market Basket Analysis rather than cluster analysis. 

 

Market Basket Analysis (MBA) is implemented in SAS Enterprise Miner (EM) but not in Base SAS. If you only have Base, there is a macro that performs it; you can search for it on lexjansen.com.
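To give a feel for what a market basket analysis counts, here is a minimal sketch in Base SAS using PROC SQL only. It is not the macro mentioned above and it is not what SAS EM runs; it just counts how many customers bought each pair of items. The data set name `purchases`, the output table `item_pairs`, and the macro variable `n_cust` are my own names, and the rows are the sample rows from the question rather than the attached CSV.

```sas
/* Sample rows copied from the question; the attached CSV would     */
/* normally be read with PROC IMPORT instead (assumption).          */
data purchases;
   infile datalines dsd truncover;
   length Customer_ID $ 20;
   input Customer_ID $ ITEM_ID;
   datalines;
Adams,18
Adams,29
Adams,30
Allen,9
Allen,27
Anderson,24
Anderson,26
Bailey,7
Bailey,30
Baker,7
Baker,10
Baker,19
Baker,31
Barnes,10
Barnes,21
Barnes,22
Barnes,31
;
run;

proc sql;
   /* Number of distinct customers: the denominator for support */
   select count(distinct Customer_ID) into :n_cust trimmed
   from purchases;

   /* Self-join: every pair of different items bought by the same customer */
   create table item_pairs as
   select a.ITEM_ID as item_a,
          b.ITEM_ID as item_b,
          count(distinct a.Customer_ID) as n_customers,
          count(distinct a.Customer_ID) / &n_cust as support format=percent8.1
   from purchases as a
        inner join purchases as b
        on a.Customer_ID = b.Customer_ID
           and a.ITEM_ID < b.ITEM_ID
   group by a.ITEM_ID, b.ITEM_ID
   order by n_customers desc, item_a, item_b;
quit;
```

In the sample rows, the only pair shared by more than one customer is items 10 and 31 (Baker and Barnes). A full MBA would go on to compute confidence and lift for rules, which this sketch does not attempt.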

 

If you do run a cluster analysis, make sure to treat the variables as categorical, since the item IDs are only labels: item 17 and item 18 are not "a distance of 1 apart", and that numeric difference has no meaning.

isas_iam
Fluorite | Level 6

Thanks for responding.

I have SAS EM too and tried to run MBA on my data.

I got some output but am not sure how to use it.

 

In my data, I have about 100 customers with multiple purchases (identified by ITEM_IDs).

I am trying to group these 100 customers into 4-5 clusters based on their purchased ITEM_IDs.

 

Should I create a DISTANCE matrix of Customers based on their purchased ITEM_IDs?

 

Thanks for more insights and inputs.

Ksharp
Super User

1) If there are only character variables, you can first use PROC DISTANCE to get the distance matrix and then feed it into PROC CLUSTER. Search for "character variable cluster" at support.sas.com and you will find the code.

2) If there is a mix of character and numeric variables, I can think of two ways: one is a decision tree (PROC HPSPLIT), the other is a general logistic regression (PROC LOGISTIC, or another procedure that can fit a logistic regression).
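Point 1) could be sketched roughly as below. This is my own illustration under several assumptions, not code from support.sas.com: it assumes SAS/STAT is licensed (PROC DISTANCE, PROC CLUSTER and PROC TREE are not part of Base SAS), it uses the Jaccard dissimilarity on 0/1 "bought" indicators so that item IDs are treated as categories, and it picks average linkage and 4 clusters only as examples. The data set names (`purchases`, `wide`, `dist`, `tree`, `groups`) and the `item_` prefix are hypothetical, and the rows are the sample rows from the question.

```sas
/* Sample rows from the question, with a flag marking each purchase */
data purchases;
   infile datalines dsd truncover;
   length Customer_ID $ 20;
   input Customer_ID $ ITEM_ID;
   bought = 1;
   datalines;
Adams,18
Adams,29
Adams,30
Allen,9
Allen,27
Anderson,24
Anderson,26
Bailey,7
Bailey,30
Baker,7
Baker,10
Baker,19
Baker,31
Barnes,10
Barnes,21
Barnes,22
Barnes,31
;
run;

/* One row per customer/item, sorted for the BY statement below */
proc sort data=purchases nodupkey;
   by Customer_ID ITEM_ID;
run;

/* One row per customer, one column per item (item_18, item_29, ...) */
proc transpose data=purchases out=wide(drop=_name_) prefix=item_;
   by Customer_ID;
   id ITEM_ID;
   var bought;
run;

/* Items a customer did not buy come out missing; recode them to 0 */
data wide;
   set wide;
   array items{*} item_:;
   do i = 1 to dim(items);
      if missing(items{i}) then items{i} = 0;
   end;
   drop i;
run;

/* Jaccard dissimilarity between customers; 0/0 matches are ignored */
proc distance data=wide method=djaccard absent=0 out=dist;
   var anominal(item_:);
   id Customer_ID;
run;

/* Hierarchical clustering on the distance matrix (linkage is a choice) */
proc cluster data=dist method=average outtree=tree noprint;
   id Customer_ID;
run;

/* Cut the tree into the requested number of groups */
proc tree data=tree nclusters=4 out=groups noprint;
   id Customer_ID;
run;

proc sort data=groups;
   by cluster Customer_ID;
run;

proc print data=groups noobs;
   var Customer_ID cluster;
run;
```

With only six sample customers, several of whom share no items at all, the resulting groups mean very little; the point is only the sequence of steps. On the full data of about 100 customers, it would be worth trying other linkage methods and values of NCLUSTERS= and checking whether the groups make sense.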
