wellj049
Calcite | Level 5

I am having trouble performing a “version” of pattern discovery with SAS Enterprise Miner.  I have a data set with the following basic structure:

 

Data.png

 

Each customer (ID) has 14 days (Sequence), and on each day exactly one of three events, X, Y, or Z (Event), must be performed. I would simply like to find the most common 14-day sequences of events, but I think there are too many combinations (3^14) to code this out by hand in Base SAS. I’m hoping for a table that might look like the following:

 

Table.png

(as in, 15% of the observations follow the first one, 12% of observations follow the second, etc.)

 

I’m somewhat inexperienced with SAS EM and its options (I do most of my work in EG). I’ve been trying to figure this out with the Association Node (sequence option) and the Path Analysis Node, but their results are not quite what I need, which is to keep event occurrence strictly ordered 1 through 14. The results from these nodes don’t keep ‘chain item 1’ on sequence 1, ‘chain item 2’ on sequence 2, etc., and the maximum number of chain items is only 10.

 

Any ideas?

 

Thanks

Accepted Solutions
Reeza
Super User

If I understand what you want correctly, which I may not, you could do this without EM.

 

Transpose your data (Task) from long to wide. 

Create a calculated column that concatenates all the events, e.g. catx('-', of event1-event14). (CATX needs a delimiter as its first argument; use CATS if you don't want one.)

Then run a frequency on that variable.

 

If you want to keep your variables separate you can run a freq like below. You may need to play around with the freq options to get your exact percent calculation or do it in a second step.

 

/* "..." stands for the remaining event variables; write them out in full */
proc freq data=flipped;
table event1*event2*event3*...Event14 / LIST OUTPCT;
run;
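
The three steps above (transpose, concatenate, frequency) could be sketched roughly as follows. This is only an illustration under assumptions: the input data set name have and the intermediate names sorted, paths, and path_counts are made up for the example, the columns ID, Sequence, and Event are taken from the question, and Sequence is assumed to hold the numbers 1 to 14 with no missing days.

```sas
/* Assumed input: work.have, one row per ID and day,
   with columns ID, Sequence (1-14) and Event ('X','Y','Z'). */
proc sort data=have out=sorted;
  by ID Sequence;
run;

/* Long to wide: one row per ID, columns event1-event14. */
proc transpose data=sorted out=flipped(drop=_name_) prefix=event;
  by ID;
  id Sequence;
  var Event;
run;

/* One string per customer, e.g. X-Y-Y-Z-...
   CATX skips missing values, so a missing day would shift positions. */
data paths;
  set flipped;
  length path $200;
  path = catx('-', of event1-event14);
run;

/* Count each distinct path, most frequent first.
   path_counts holds COUNT and PERCENT for each path. */
proc freq data=paths order=freq;
  tables path / nocum out=path_counts;
run;
```

Because each position in the string corresponds to one day, the order 1 through 14 is preserved by construction, and the PERCENT column in path_counts gives the share of customers following each path.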


wellj049
Calcite | Level 5

I had a strong feeling I was overcomplicating things.  Then Reeza came along and proved it.   Your suggestion works perfectly.  Thank you!

 

 
