Desktop productivity for business analysts and programmers

Cluster analysis / outlier analysis in SAS EG?

Occasional Contributor
Posts: 9


Hello Community, 

 

I have a set of transaction data, and I would like to analyze it for outliers in the goods that a customer buys.

 

For example:

 

Transaction 1 -> Customer A buys Bananas

Transaction 2 -> Customer A buys Bananas

Transaction 3 -> Customer A buys Bananas

Transaction 4 -> Customer A buys Apples -> outlier

 

 

Is that possible in SAS Enterprise Guide? Could you please help me?

 

Thanks and regards,

Mariam

 

PROC Star
Posts: 1,333

Re: Cluster analysis / outlier analysis in SAS EG?

Posted in reply to Mairam2345

Yes, it's fairly easy.

 

First, use the Query Builder to create a new query on your data. Drag CustomerID and FruitID into the "Select Data" columns, then drag in CustomerID a second time. On that second CustomerID field, set the Summary to "count"; the summary groups box should then show that you're grouping by CustomerID and FruitID. You shouldn't need to write any code, but the generated code should look like this (I used the Region and Product fields in SASHELP.SHOES):

 

PROC SQL;
   CREATE TABLE WORK.QUERY_FOR_SHOES AS 
   SELECT t1.Region, 
          t1.Product, 
          /* COUNT_of_Region */
            (COUNT(t1.Region)) AS COUNT_of_Region
      FROM SASHELP.SHOES t1
      GROUP BY t1.Region,
               t1.Product;
QUIT;

When you run it, you'll end up with one record per CustomerID and FruitID combination, along with the count for that combination. You can then use the "Describe" tools to analyze the results, and additional queries to separate out what you consider to be outliers (in your example, count = 1).
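As a rough sketch of such an additional query, the step below keeps only the combinations that occur exactly once. It reuses the SASHELP.SHOES stand-in from above; the output table name WORK.SHOES_OUTLIERS and the threshold of 1 are assumptions for illustration, so adjust them to your own data and to whatever you consider an outlier.

```sas
PROC SQL;
   /* Hypothetical output table: Region/Product pairs seen exactly once */
   CREATE TABLE WORK.SHOES_OUTLIERS AS
   SELECT t1.Region,
          t1.Product,
          (COUNT(t1.Region)) AS COUNT_of_Region
      FROM SASHELP.SHOES t1
      GROUP BY t1.Region,
               t1.Product
      /* Assumed threshold: a count of 1 is treated as an outlier */
      HAVING COUNT(t1.Region) = 1;
QUIT;
```

In the Query Builder, the same restriction should be available as a filter on the summarized count column rather than on the raw data.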

 

Tom

 

Occasional Contributor
Posts: 9

Re: Cluster analysis / outlier analysis in SAS EG?

Hi Tom,

 

This works perfectly. Thank you very much.

 

Now I have a new column that counts the transactions for each combination of the two IDs. What I am trying to do now is build a new column or a visualization that shows directly whether a row is an outlier or not.

 

For example:

CustID | Good | Outlier_Check

Cust1 | Banana | 0

Cust1 | Banana | 0

Cust1 | Apple | 1 -> outlier

 

Is there a way to check whether a customer (column CustID) always buys the same product (column Good), and to put a 1 in a new column as an outlier flag whenever the Good for that customer is different?

 

Thanks and regards,
Mariam  

 

As you suggested, I tried to solve this with the Describe tools, but that doesn't work.
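One possible way to get an Outlier_Check column like the one described above, as a sketch only: count each customer/good pair, find the highest pair count per customer, and flag every pair whose count is lower than that customer's highest count. The example again uses SASHELP.SHOES with Region standing in for CustID and Product for Good; the table names WORK.PAIR_COUNTS and WORK.PAIR_FLAGS and the "less frequent than the customer's most common good" rule are assumptions, not something confirmed in this thread.

```sas
PROC SQL;
   /* Step 1: one row per Region/Product pair with its transaction count */
   CREATE TABLE WORK.PAIR_COUNTS AS
   SELECT Region,
          Product,
          COUNT(*) AS Pair_Count
      FROM SASHELP.SHOES
      GROUP BY Region, Product;

   /* Step 2: flag pairs that occur less often than the most frequent
      pair of the same Region (assumed outlier rule) */
   CREATE TABLE WORK.PAIR_FLAGS AS
   SELECT p.Region,
          p.Product,
          p.Pair_Count,
          CASE WHEN p.Pair_Count < m.Max_Count THEN 1 ELSE 0 END AS Outlier_Check
      FROM WORK.PAIR_COUNTS p
      INNER JOIN
           (SELECT Region,
                   MAX(Pair_Count) AS Max_Count
               FROM WORK.PAIR_COUNTS
               GROUP BY Region) m
         ON p.Region = m.Region
      ORDER BY p.Region, p.Product;
QUIT;
```

Under this rule, a customer who buys only one good gets 0 on every row, while a customer with three Banana transactions and one Apple transaction gets 0 for Banana and 1 for Apple. If several goods tie for the highest count, none of them is flagged.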

 
