Rahul_B
Obsidian | Level 7

Hello,

 

I need help detecting anomalies in my dataset.

 

data hdfs_kim.node_monitoring;
    /* Dates are assumed to be day-month-year (01-02-2020 = 1 Feb 2020) */
    input date :ddmmyy10. node1 node2 node3 node4 node5;
    format date ddmmyyd10.;
    datalines;
01-02-2020 0.45 0.44 0.78 0.32 0.99
02-02-2020 0.34 0.32 0.89 0.56 0.77
03-02-2020 0.89 0.65 0.76 043 0.81
04-02-2020 0.73 1.34 0.66 0.33 0.49
05-02-2020 0.23 0.44 0.54 0.66 0.66
06-02-2020 0.88 0.76 2.56 0.61 0.71
;
run;

 

Can anyone help me find anomalies in this dataset? I have used:

 

/* Decompose the data into low-rank and sparse parts; save the sparse part */
proc rpca data=hdfs_kim.node_monitoring
          lambdaweight=3.5
          outsparse=hdfs_kim.sparsemat2;
    id date;
run;

proc print data=hdfs_kim.sparsemat2;
run;

/* Center and scale the variables, run anomaly detection, and save the model state */
proc rpca data=hdfs_kim.node_monitoring
          scale center;
    id date;
    anomalydetection;
    savestate rstore=hdfs_kim.store;
run;

/* Score the data with the saved analytic store */
proc astore;
    setoption rpca_projection_type 2;
    score rstore=hdfs_kim.store data=hdfs_kim.node_monitoring out=hdfs_kim.scored;
run;

proc print data=hdfs_kim.scored;
run;

 

I was not able to interpret the RPCA results when I have node1-node16. I would appreciate any help, and I am willing to try another process or procedure.

 

Thanks 

Rahul 

2 REPLIES
Zohreh
SAS Employee

Hi Rahul, 

 

In your scoring output data set "scored", the last column is labeled "outlier detection score".

A value of 1 in that column indicates that the scored observation is an outlier.

You can read more about the anomaly detection functionality of proc RPCA here: SAS Help Center: The RPCA Procedure 

 

Thanks,

-Zohreh 

Rahul_B
Obsidian | Level 7

Hi,

I've read the whole procedure documentation thoroughly. The scored data set does flag anomalies, but when there is more than one variable (multiple dimensions) it does not correctly identify where the anomalies are.

 

Any suggestions?

 

-Thanks 

Rahul 
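
One possible way to see which node is involved, and not only which date, is to look at the OUTSPARSE= table from the first PROC RPCA call in the original post. RPCA splits the input into a low-rank part and a sparse part, so nonzero cells in the sparse part are candidates for the values that do not fit the common pattern. The DATA step below is only a sketch: it assumes that hdfs_kim.sparsemat2 keeps the ID variable date and the original variable names node1-node5, and the output table work.sparse_cells, the variables node and sparse_value, and the 1e-6 tolerance are all made up for illustration.

```sas
/* Sketch: list the (date, node) cells that RPCA placed in the sparse part. */
/* Assumption: hdfs_kim.sparsemat2 contains date and node1-node5.           */
/* Assumption: 1e-6 is an arbitrary tolerance for "nonzero"; tune as needed. */
data work.sparse_cells;
    set hdfs_kim.sparsemat2;
    array s{*} node1-node5;
    length node $32;
    do i = 1 to dim(s);
        if abs(s{i}) > 1e-6 then do;
            node = vname(s{i});      /* name of the variable behind this cell */
            sparse_value = s{i};     /* size of the sparse component */
            output;                  /* one output row per flagged cell */
        end;
    end;
    keep date node sparse_value;
run;

proc print data=work.sparse_cells;
run;
```

With 16 nodes, the array would be declared as node1-node16. How many cells end up in the sparse part depends on LAMBDAWEIGHT=, so it may take a few tries with different values before the flagged cells look sensible.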
