BookmarkSubscribeRSS Feed
kcskaiser
Fluorite | Level 6

Other than ABM and ProfMan we have only what you can find in base SAS and E.G. I've been asked to find a relationship in some of the data we collect on form usage by role within the company. So my data set is compreised of about 3 million rows consisting of a date/timestamp, a form name and a role (user). We have about 700 unique forms and many thousands of users (roles) that use them.

 

I admiit this was a new request for me, I've not done this kind of study before. I've used this same data to analyze usage by date and day of week, and hour and by role and form name and all the basic kinds of analysis you might think of. But finding a corralation between one form and another by role is not sometihing I've done and really don't know where to start.

 

He likens his request to a grocery store where related products might be grouped together. For example nacho chips would have salsa, and cheese dip on a nearby shelf. So he would like to know what is on the shlef next to Form A, and what else is used near any given Form. 

 

I picture a cloud (the role) filled with forms that they use. This is easy, but how do I say Form A is tighly matched with Form B? And then extend that by saying Form B is also closly matched to Form Z?

 

Most of the examples I see on this and other sites seem well above my understanding. So I'm hoping a simple step by step will help me better understand how to get started. Always keeping in mind I don't have access to many of the really high-end analysis procedures that might make this much easier.

 

Thoughts?

3 REPLIES 3
ballardw
Super User

I think that you are looking for a cluster analysis.

Structure your data such that you have one record per person, possibly within a time frame that makes sense such as day or week, and indicators of forms used, possibly a count of uses within the time frame.

 

Note: if your raw data is user, time, form then proc summary or means to get those counts.

 

Transpose by user and time.

 

Proc fastclus would be a good start to experiment with finding similar groups of usage.

 

 

kcskaiser
Fluorite | Level 6

This is where I run into a problem. I think proc fastclus might be what I need to use but I don't understand the how. But you got me on the right track in my thinking and I've revised my requirments a bit.

 

I need to study form usage across two different time frames. The time frames are Day and Night devided by 6am and 6pm. I'll call that variable tms and make day = 1 and night = 2;

 

Do I then  need to convert each form into a number? Form A becomes 1, Form B becomes 2, etc. and then do the same with each role? Role A becomes 1 and Role B becomes 2 out to the 713 roles in play?

 

Then run;

proc fastclus data = closebyformrole maxc = 713 maxiter=100 out = clus;

   var tms role_num form_num;

run;

 

This produces some output but I can't pretend I know what it all means. One part however looks intreguing to me. Across the top is the role_num 1 to 713 and down the side is 1 to 24, then each role_num has what looks like 1 or more percentages(?) in it. But maybe not because I really don't know what I'm looking at.

 

I know I can't expect someone to teach me about statistics here or even clustering but some tiny guidence would be greatly appreciated.

Thanks.

ballardw
Super User

If you want to find similarities of form usage between roles I think you want to make Role an ID variable and possibly your AM/PM varible.

And add a maxclusters = 10 to the proc statement. Experimenting with values of maxclusters is what I find educational. It will assign each role, or role and time, to a cluster of similarity based on the form values.

If you use a summary procedure such as Proc summary to count of useage per form you might want to standardize the counts using proc standard.

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 1025 views
  • 0 likes
  • 2 in conversation