Building models with SAS Enterprise Miner, SAS Factory Miner, SAS Visual Data Mining and Machine Learning or just with programming

Classification Matrix Target

Contributor
Posts: 43

My output from the Train data set had missing values. What would be the cause? See the image below.

Capture.JPG

 

 



All Replies
SAS Employee
Posts: 26

Re: Classification Matrix Target

100% of the observations in the Train data set that were Target=0 were predicted to be Target=0. There were no false positives, so that cell is simply shown as missing.

On the other hand, your false negative rate is really high, so you should look into that.
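
To make the point concrete, here is a minimal sketch in Python (not SAS, and not what RPM runs internally) of how a classification matrix ends up with an empty cell. The data and the function names `classification_matrix` and `false_negative_rate` are made up for illustration.

```python
from collections import Counter

def classification_matrix(actual, predicted):
    """Count (actual, predicted) pairs; combinations that never occur are absent."""
    return Counter(zip(actual, predicted))

def false_negative_rate(counts):
    """Share of actual positives (target=1) that were predicted as 0."""
    fn = counts.get((1, 0), 0)
    tp = counts.get((1, 1), 0)
    return fn / (fn + tp) if (fn + tp) else float("nan")

# Hypothetical data: 980 non-responders, 20 responders,
# and a model that predicts 0 for all but two responders.
actual = [0] * 980 + [1] * 20
predicted = [0] * 980 + [0] * 18 + [1] * 2

counts = classification_matrix(actual, predicted)
for a in (0, 1):
    for p in (0, 1):
        # A cell with no observations is printed as ".", like a missing value in a report.
        print(f"actual={a} predicted={p}: {counts.get((a, p), '.')}")
print("false negative rate:", false_negative_rate(counts))
```

The (actual=0, predicted=1) cell has no rows at all, which is why a report shows it as missing rather than as a count.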

Contributor
Posts: 43

Re: Classification Matrix Target

Posted in reply to BrettWujek

Thank you for your reply. I have done a PROC MEANS on the input and everything appears as I would expect. I am using RPM - Intermediate in Enterprise Guide. What should I be looking at?

SAS Employee
Posts: 26

Re: Classification Matrix Target

It's not so much about the inputs in your data set here. The model is just not good at accurately predicting positive responses. Perhaps your data set is very imbalanced (is target=1 a rare event?). Under "Decisions and priors" in the Model section of the RPM UI, what are the data proportions for your target?

 

Contributor
Posts: 43

Re: Classification Matrix Target

Posted in reply to BrettWujek

Resp = 1 is 2%, which is pretty typical for a direct marketing campaign. I have the Prior Probabilities and the Decision Function both set to NONE, as I am not sure how to use them.

SAS Employee
Posts: 26

Re: Classification Matrix Target

OK, what I would suggest then is oversampling to get a more balanced data set for training (i.e., more observations with target=1 to learn from) and then setting the priors according to the historical expectation (2% for level 1 in your case). Hopefully this will train a model that can better predict the rare event.
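
As a rough illustration of what setting the priors accomplishes, the sketch below (in Python, with a hypothetical function name `adjust_for_priors`) applies the standard prior-correction formula to a probability that came from a model trained on oversampled data. It is an assumption that this mirrors what the tool does internally; treat it as a way to see the arithmetic, not as RPM's implementation.

```python
def adjust_for_priors(p_sample, sample_rate, true_rate):
    """Rescale a probability learned on oversampled data back to the population.

    p_sample    : predicted probability of target=1 from the model trained on the sample
    sample_rate : proportion of target=1 in the oversampled training data
    true_rate   : proportion of target=1 expected in the population (the prior)
    """
    w1 = true_rate / sample_rate              # weight for the event level
    w0 = (1 - true_rate) / (1 - sample_rate)  # weight for the non-event level
    return p_sample * w1 / (p_sample * w1 + (1 - p_sample) * w0)

# Hypothetical case: trained on a 50/50 sample, population response rate is 2%.
print(adjust_for_priors(0.5, sample_rate=0.5, true_rate=0.02))
```

A score equal to the sample's event rate maps back to the population event rate, and when the sample and population rates are the same the probability is left unchanged.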

 

Good luck.

Contributor
Posts: 43

Re: Classification Matrix Target

Posted in reply to BrettWujek

I have a 100% sample as my input. It looks like RPM is using a sample, though, and I don't see a way to control that.

Solution
10-28-2016 07:48 PM
SAS Employee
Posts: 67

Re: Classification Matrix Target

RPM actually splits your data into training and validation data sets. It does a 50/50 split, and the split is a stratified sample that uses the target (dependent) variable to stratify.
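
For readers unfamiliar with stratified partitioning, here is a small Python sketch of the idea: split each target level separately so both halves keep the same response rate. The function `stratified_split`, the column name `resp`, and the seed are illustrative assumptions, not RPM's actual code.

```python
import random
from collections import defaultdict

def stratified_split(rows, target_key, train_fraction=0.5, seed=12345):
    """Split rows into train/validation, keeping the target mix the same in both."""
    rng = random.Random(seed)
    by_level = defaultdict(list)
    for row in rows:
        by_level[row[target_key]].append(row)

    train, valid = [], []
    for level_rows in by_level.values():
        shuffled = level_rows[:]          # copy so the caller's list is untouched
        rng.shuffle(shuffled)
        cut = round(len(shuffled) * train_fraction)
        train.extend(shuffled[:cut])
        valid.extend(shuffled[cut:])
    return train, valid

# Hypothetical data: 1000 rows with a 2% response rate.
rows = [{"id": i, "resp": 1 if i < 20 else 0} for i in range(1000)]
train, valid = stratified_split(rows, "resp")
print(len(train), sum(r["resp"] for r in train))
print(len(valid), sum(r["resp"] for r in valid))
```

With a 2% response rate, each half gets only about half of the responders, which is why so few events are available for training.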

 

You can sample before using RPM.  With only 2% response you may want to take all of those that responded and a sample of those who didn't.

 

For example, if I had a data set with 2% respondents and 1000 rows, I would take all 20 respondents and maybe 200 nonrespondents. This would give me approximately 10% respondents and 90% nonrespondents (20 of 220 rows is about 9%). I would suggest, if possible, having your respondents represent at least 10-20% of the rows in your data mining data set. This should give you more stability in your model.
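
The sampling described above could be sketched like this in Python before handing the data to the modeling tool. The function `undersample`, the column name `resp`, and the seed are hypothetical; in SAS you would do the equivalent with your own sampling step.

```python
import random

def undersample(rows, target_key, n_nonevents, seed=12345):
    """Keep every event row (target=1) plus a random sample of non-event rows."""
    events = [r for r in rows if r[target_key] == 1]
    nonevents = [r for r in rows if r[target_key] == 0]
    rng = random.Random(seed)
    sampled = rng.sample(nonevents, min(n_nonevents, len(nonevents)))
    return events + sampled

# Hypothetical data matching the example: 1000 rows, 20 respondents.
rows = [{"id": i, "resp": 1 if i < 20 else 0} for i in range(1000)]
sample = undersample(rows, "resp", n_nonevents=200)
rate = sum(r["resp"] for r in sample) / len(sample)
print(len(sample), round(rate, 3))
```

Because the sample no longer reflects the true 2% response rate, the predicted probabilities need the prior probabilities set to the population values to be interpretable.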

 

You can also use the decision processing within RPM to indicate the prior probabilities. Here's a paper that shows how to assign prior probabilities: https://support.sas.com/resources/papers/proceedings10/113-2010.pdf

Here's a tip that talks about doing so within EM: https://communities.sas.com/t5/SAS-Communities-Library/Tip-How-to-model-a-rare-target-using-an-overs...

Contributor
Posts: 21

Re: Classification Matrix Target

Hi,

If we want a similar classification matrix for the target in SAS Enterprise Miner, how do we get it?

I get the matrix when running RPM (SAS EG to SAS Enterprise Miner), but when modelling directly in SAS Enterprise Miner, I am unable to get a similar matrix.

 

Regards

Amit Verma

 

SAS Employee
Posts: 67

Re: Classification Matrix Target

Posted in reply to amitvermajhs

In SAS Enterprise Miner you can get the same output as Rapid Predictive Modeler by using the Reporter Node under the Utility Tab.

 

Change the properties for the Reporter node to Style = Default and Nodes = Summary (as in the screenshot below). This will give you a scorecard and the classification matrix as well as other output.

 

2016-11-03_16-35-28.jpg

☑ This topic is solved.
