My output from the Train data set had missing values. What would be the cause? See image below.
100% of the observations in the Train data set that were Target=0 were predicted to be Target=0. There were no false positives here - thus it is just represented as missing.
On the other hand, your false negative rate is really high...you should look into that.
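To make the two points above concrete, here is a minimal Python sketch (not RPM output, and not how RPM computes its report) that builds a classification matrix and the two error rates. The data is hypothetical: 98 non-responders, 2 responders, and a model that predicts Target=0 for every row.

```python
from collections import Counter

def classification_matrix(actual, predicted):
    """Count each (actual, predicted) pair for a binary 0/1 target."""
    counts = Counter(zip(actual, predicted))
    return {
        "true_negative": counts[(0, 0)],
        "false_positive": counts[(0, 1)],
        "false_negative": counts[(1, 0)],
        "true_positive": counts[(1, 1)],
    }

def error_rates(matrix):
    """Return (false positive rate, false negative rate)."""
    negatives = matrix["true_negative"] + matrix["false_positive"]
    positives = matrix["true_positive"] + matrix["false_negative"]
    fpr = matrix["false_positive"] / negatives if negatives else float("nan")
    fnr = matrix["false_negative"] / positives if positives else float("nan")
    return fpr, fnr

# Hypothetical data: the model never predicts a response.
actual = [0] * 98 + [1] * 2
predicted = [0] * 100

matrix = classification_matrix(actual, predicted)
fpr, fnr = error_rates(matrix)
print(matrix)
print(f"false positive rate = {fpr:.0%}, false negative rate = {fnr:.0%}")
```

The false positive cell has a count of zero, which a report may display as missing, while every actual responder lands in the false negative cell.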
Register today and join us virtually on June 16!
sasglobalforum.com | #SASGF
View now: on-demand content for SAS users
Thank you for your reply. I have run PROC MEANS on the input and everything appears as I would expect. I am using RPM - Intermediate in Enterprise Guide. What should I be looking at?
It's not so much about the inputs in your data set here. The model is just not good at accurately predicting positive responses. Perhaps your data set is very imbalanced (is target=1 a rare event?). Under "Decisions and priors" in the Model section of the RPM UI, what are the data proportions for your target?
Resp = 1 is 2%, which is pretty typical for a Direct Marketing Campaign. I have the Prior Probabilities and the Decision Function both set to NONE, as I am not sure how to use them.
OK, what I would suggest then is oversampling to get a more balanced data set for training (i.e., a higher proportion of observations with target=1 to learn from) and then setting the priors according to the historical expectation (2% for level 1 in your case). Hopefully this will train a model that can better predict the rare event.
Good luck.
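As a rough illustration of why the priors matter after oversampling, the Python sketch below applies the standard Bayes correction that rescales a probability estimated on an oversampled training set back to the true event rate. This is an assumption about the general technique, not a description of what RPM does internally; the function name and the 50% training rate are hypothetical.

```python
def adjust_for_priors(p_sample, sample_rate, true_rate):
    """Rescale a predicted event probability from the oversampled
    training proportion (sample_rate) to the true prior (true_rate)."""
    event = p_sample * true_rate / sample_rate
    nonevent = (1 - p_sample) * (1 - true_rate) / (1 - sample_rate)
    return event / (event + nonevent)

# Hypothetical case: trained on a 50/50 oversample, true response rate is 2%.
# A score of 0.5 on the balanced sample maps back to the 2% base rate.
adjusted = adjust_for_priors(0.5, sample_rate=0.5, true_rate=0.02)
print(f"{adjusted:.4f}")
```

Without a correction of this kind, probabilities from a model trained on the oversampled data would overstate the chance of a response in the real population.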
I have a 100% sample as my input. It looks like RPM is using a sample, though, and I don't see a way to control that.
RPM actually splits your data into training and validation data sets. It does a 50/50 split of the data, stratified on the target (dependent) variable.
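For readers unfamiliar with a stratified split, here is a minimal Python sketch of the idea, not RPM's actual implementation. The function name, the `resp` column, and the seed are hypothetical; the point is only that each target level is split 50/50 separately so both halves keep the same response rate.

```python
import random
from collections import defaultdict

def stratified_split(rows, target_key, train_fraction=0.5, seed=12345):
    """Split rows into train/validate, preserving the target proportions."""
    rng = random.Random(seed)
    by_level = defaultdict(list)
    for row in rows:
        by_level[row[target_key]].append(row)

    train, validate = [], []
    for level_rows in by_level.values():
        shuffled = level_rows[:]          # copy so the input is not reordered
        rng.shuffle(shuffled)
        cut = round(len(shuffled) * train_fraction)
        train.extend(shuffled[:cut])
        validate.extend(shuffled[cut:])
    return train, validate

# Hypothetical data: 1000 rows with a 2% response rate.
rows = [{"id": i, "resp": 1 if i < 20 else 0} for i in range(1000)]
train, validate = stratified_split(rows, "resp")
print(len(train), sum(r["resp"] for r in train))
print(len(validate), sum(r["resp"] for r in validate))
```

With 2% response, each half ends up with only 10 respondents out of 500 rows, which is why sampling before modeling is worth considering.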
You can sample before using RPM. With only 2% response you may want to take all of those that responded and a sample of those who didn't.
For example, if I had a data set of 1000 rows with 2% respondents, I would take all 20 respondents and maybe 200 nonrespondents. This would give me roughly 9% respondents and 91% nonrespondents (20 of 220 rows). If possible, I would suggest having respondents represent at least 10-20% of the rows in your data mining data set. This should give you more stability in your model.
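The sampling in that example can be sketched in a few lines of Python. This is only an illustration on made-up data; the function name, the `resp` column, and the seed are hypothetical.

```python
import random

def undersample_nonrespondents(rows, target_key, n_nonrespondents, seed=12345):
    """Keep every respondent and a simple random sample of nonrespondents."""
    rng = random.Random(seed)
    respondents = [r for r in rows if r[target_key] == 1]
    nonrespondents = [r for r in rows if r[target_key] == 0]
    kept = rng.sample(nonrespondents, min(n_nonrespondents, len(nonrespondents)))
    return respondents + kept

# Hypothetical data: 1000 rows, 20 respondents (2%).
rows = [{"id": i, "resp": 1 if i < 20 else 0} for i in range(1000)]
sample = undersample_nonrespondents(rows, "resp", 200)
share = sum(r["resp"] for r in sample) / len(sample)
print(f"{len(sample)} rows, {share:.1%} respondents")  # 220 rows, 9.1% respondents
```

Sampling fewer nonrespondents (for example 80 to 180) would push the respondent share into the 10-20% range mentioned above.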
You can also use the decision processing within RPM to indicate the prior probabilities. Here's a paper that shows how to assign prior probabilities: https://support.sas.com/resources/papers/proceedings10/113-2010.pdf. Here's a tip that talks about doing so within EM: https://communities.sas.com/t5/SAS-Communities-Library/Tip-How-to-model-a-rare-target-using-an-overs...
Catch the SAS Global Forum keynotes, announcements, and tech content!
sasglobalforum.com | #SASGF
Hi,
If we want a similar classification matrix for the target in SAS Enterprise Miner, how do we get it?
I get the matrix when running RPM (SAS EG to SAS Enterprise Miner), but when modelling directly in SAS Enterprise Miner, I am unable to get a similar matrix.
Regards
Amit Verma
In SAS Enterprise Miner you can get the same output as Rapid Predictive Modeler by using the Reporter Node under the Utility Tab.
Change the properties for the Reporter node to Style = Default and Nodes = Summary (like below). This will give you a scorecard and the classification matrix, as well as other output.