ajosh
Calcite | Level 5

Hi All,

I am in the process of forming a methodology for identifying infrequent (that is, suspicious) association rules with SAS Eminer. The selected rules should exclude frequent or strong rules as well as rare rules. Rare rules can arise from one-off purchases, for example a dealer that manufactures textiles purchasing some relevant machinery.

The tweaks I have made to the default settings are: 1) setting a lower minimum support and 2) exporting close to 0.1 million rules as part of the output.

Methodology used:

1) use a filter to extract low-confidence rules, with the threshold left to the user's discretion,

2) use a metric defined as the confidence of the rule divided by the product of the supports of the LHS and RHS,

3) sort the low-confidence rules in ascending order of the above metric and select, say, the top 30 to 40% of these rules.
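The three steps above can be sketched in Python as follows. This is only an illustration of the described procedure, not SAS Eminer output or code: the `Rule` fields, the `max_confidence` cutoff, the `keep_fraction` value, and the sample numbers are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Rule:
    # Hypothetical container for one exported rule.
    lhs: str
    rhs: str
    supp_lhs: float    # support of the antecedent (LHS)
    supp_rhs: float    # support of the consequent (RHS)
    confidence: float  # confidence of LHS => RHS


def ratio_metric(rule: Rule) -> float:
    # Step 2: confidence divided by the product of the LHS and RHS supports.
    return rule.confidence / (rule.supp_lhs * rule.supp_rhs)


def shortlist(rules, max_confidence=0.3, keep_fraction=0.4):
    # Step 1: keep only low-confidence rules (cutoff chosen by the user).
    low_conf = [r for r in rules if r.confidence <= max_confidence]
    # Step 3: sort ascending on the metric and keep the leading fraction.
    ranked = sorted(low_conf, key=ratio_metric)
    n_keep = math.ceil(len(ranked) * keep_fraction)
    return ranked[:n_keep]


# Made-up rules, for illustration only.
rules = [
    Rule("A", "x", 0.50, 0.40, 0.20),
    Rule("B", "x", 0.10, 0.10, 0.25),
    Rule("C", "x", 0.50, 0.50, 0.90),  # dropped by the confidence filter
    Rule("D", "x", 0.20, 0.50, 0.05),
    Rule("E", "x", 0.05, 0.10, 0.10),
]

selected = shortlist(rules)
print([r.lhs for r in selected])
```

Note that this metric is not the usual lift, which divides confidence by the support of the RHS only; dividing by the LHS support as well pushes rules with a common antecedent further towards the low end of the ranking.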

I would like to ask the following questions:

1) is this methodology useful for finding infrequent rules,

2) will an additional filter for low support, applied before selecting low confidence, be of any use, or is it redundant?

Note that we can give the user univariate and bivariate statistics on support, confidence, and both taken together. A cascading feature can also be implemented to let users see which unique values of the key metrics are possible, based on the filters mentioned above.

Thanks in advance for your suggestions; I look forward to hearing from you.

Regards,

Aditya.

1 REPLY
yeliu
SAS Employee

Hi Aditya,

I would suggest that you filter the rules you get at a low support level by using a metric called "interest", defined below.

 

According to probability theory, events X and Y are independent if P(X∩Y)=P(X)P(Y). In itemset notation, the joint occurrence of X and Y is measured by the support of the combined itemset, supp(X∪Y). So the rule X⇒Y is not interesting if supp(X∪Y)≈supp(X)∗supp(Y), which means that a rule is not interesting if its antecedent and consequent are approximately independent. Wu et al. introduce the function interest(X,Y)=|supp(X∪Y)−supp(X)supp(Y)|. If interest(X,Y)≥min_interest, where min_interest is a predefined threshold, then the itemset X∪Y is referred to as a potentially interesting itemset.
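As a rough illustration of the interest measure, here is a minimal Python sketch that computes it directly from a list of transactions. The function names, the toy transactions, and the `min_interest` value of 0.1 are assumptions made for the example; they do not come from SAS Eminer.

```python
def support(transactions, itemset):
    # Fraction of transactions that contain every item in `itemset`.
    itemset = frozenset(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def interest(transactions, x, y):
    # interest(X, Y) = |supp(X ∪ Y) - supp(X) * supp(Y)|
    supp_xy = support(transactions, set(x) | set(y))
    return abs(supp_xy - support(transactions, x) * support(transactions, y))


def is_potentially_interesting(transactions, x, y, min_interest):
    # The itemset X ∪ Y is kept when its interest reaches the threshold.
    return interest(transactions, x, y) >= min_interest


# Toy data: "a" and "b" co-occur exactly as often as independence predicts,
# while "a" and "c" never co-occur.
transactions = [{"a", "b"}, {"a"}, {"b"}, {"c"}]

print(interest(transactions, {"a"}, {"b"}))  # 0.0
print(interest(transactions, {"a"}, {"c"}))  # 0.125
print(is_potentially_interesting(transactions, {"a"}, {"c"}, min_interest=0.1))
```

Because of the absolute value, the measure flags itemsets that occur together less often than independence predicts as well as those that occur together more often, which may be relevant when looking for suspicious rules.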

 

Hope it helps,

Ye
