pvareschi
Quartz | Level 8

Re: "Applied Analytics Using SAS Enterprise Miner", "Lesson 9: Special Topics Using SAS Enterprise Miner" (Understanding Surrogate Models)

Would it be possible to elaborate a bit further on the use of a Frequency variable to adjust a dataset for oversampling?

The SAS Enterprise Miner Reference Help seems to discourage the use of sampling weights (page 190: "the current version of SAS Enterprise Miner does not provide full support for sampling weights or other types of weighted analyses, so this method should be approached with care"). Moreover, as I understand it, sampling weights on their own will not affect the cut-off used to derive the misclassification rate, so the assessment would still be somewhat misleading.

Would a better approach be to use prior probabilities, with Decision Weights set to the inverse of the priors, so that the cut-off is also set to the actual population proportion of primary events?

 

Accepted Solutions
gcjfernandez
SAS Employee

The surrogate model example provides a way to assess variable importance for the neural network (a black-box model). If you want to make decisions or predictions, use the posterior probabilities from the NN model directly, because the posterior probabilities reported by the surrogate decision tree model are not adjusted for oversampling or priors.

Therefore, if you need to use the posterior probabilities from the surrogate model, the course notes provide the following solutions (except where noted):

1) Use scored data whose event distribution reflects that of the reference population.

2) You could also use the SAS code editor in EM to adjust for priors and decision weights (not in the course notes).

3) In the Transform node there is a rudimentary SAS code option where you can create a weight variable (based on the prior probability values) and assign it the role of Frequency. That way you can adjust the posterior probabilities for priors.
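To illustrate option 3, here is a minimal sketch of how such a weight could be computed. It is written in Python only to show the arithmetic and is not SAS EM code; the function name and the 5% / 50% rates are hypothetical. It assumes the usual ratio of population proportion to sample proportion for each class.

```python
def oversampling_weights(pop_event_rate, sample_event_rate):
    """Return (event_weight, nonevent_weight) that rescale an oversampled
    data set back to the population event rate."""
    w_event = pop_event_rate / sample_event_rate
    w_nonevent = (1.0 - pop_event_rate) / (1.0 - sample_event_rate)
    return w_event, w_nonevent


# Hypothetical figures: 5% events in the population, 50% in the sample.
w_event, w_nonevent = oversampling_weights(0.05, 0.50)

# Sanity check: with these weights, the weighted event share of the
# oversampled data equals the population event rate.
n_event, n_nonevent = 500, 500
weighted_rate = (w_event * n_event) / (w_event * n_event + w_nonevent * n_nonevent)
print(w_event, w_nonevent, weighted_rate)
```

Each case would then carry the weight for its class in the variable that is given the Frequency role.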

Please note that this weight option is different from survey design weights, and SAS EM is not meant for analyzing survey data. It is recommended for building predictive models.
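For option 2, the adjustment for priors and the cut-off raised in the question can be sketched as follows. This is the standard Bayes re-weighting of a posterior probability, shown in Python only for illustration; it is an assumption that custom code in EM would follow the same arithmetic, and the function names and rates are hypothetical.

```python
def adjust_posterior(p_sample, pop_event_rate, sample_event_rate):
    """Rescale a posterior estimated on oversampled data to the population priors."""
    event_term = p_sample * pop_event_rate / sample_event_rate
    nonevent_term = (1.0 - p_sample) * (1.0 - pop_event_rate) / (1.0 - sample_event_rate)
    return event_term / (event_term + nonevent_term)


def classify_with_inverse_prior_weights(p_adjusted, pop_event_rate):
    """Choose the decision with the larger expected profit when a correct
    event decision is worth 1/prior and a correct non-event decision is
    worth 1/(1 - prior). Algebraically this is a cut-off at the prior."""
    event_profit = p_adjusted / pop_event_rate
    nonevent_profit = (1.0 - p_adjusted) / (1.0 - pop_event_rate)
    return event_profit > nonevent_profit


# Hypothetical figures: 5% events in the population, 50% in the sample.
PRIOR, SAMPLE_RATE = 0.05, 0.50

p_adj = adjust_posterior(0.90, PRIOR, SAMPLE_RATE)
is_event = classify_with_inverse_prior_weights(p_adj, PRIOR)
print(p_adj, is_event)
```

With inverse-prior decision weights the rule fires when the adjusted posterior exceeds the population event rate, which on the unadjusted scale corresponds to exceeding the sample event rate.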
