Export scoring code for Cross Validation in SAS Enterprise Miner

New Contributor
Posts: 4

I have used the Start Groups and End Groups nodes to perform 5-fold cross validation on a modelling node in SAS EM, grouping on a random variable in my training data which I created for this purpose. I now wish to use the model I have created to score a new dataset.

 

When I export the scoring code, I can see that it references the random variable I created for the cross-validation, but this variable is not present in my new data. Unless I am misreading the code, it appears to use the value of the random variable to score each of the 5 segments of the data differently. The datasets I am scoring in the live environment could be fairly small (only a few thousand records at a time), so I don't feel that this would be appropriate.

 

How do I apply the scoring code to my new data so that every observation is scored consistently?

Super User
Posts: 19,768

Re: Export scoring code for Cross Validation in SAS Enterprise Miner

Posted in reply to DavidWilson

When you built your model, wouldn't that variable have been excluded?

I'm not sure how you used it in CV.

New Contributor
Posts: 4

Re: Export scoring code for Cross Validation in SAS Enterprise Miner

I created the random variable (called fold), gave it values 1-5, and assigned it the role Segment, as advised in m_maldonado's answer to this question: https://communities.sas.com/t5/SAS-Data-Mining/Using-cross-validation-in-Enterprise-Miner/m-p/233635... I then used the Start Groups and End Groups nodes to perform the cross validation.
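For anyone following along outside EM, the fold variable described above can be sketched in Python. This is only an illustration of assigning each training row a random fold number from 1 to 5; the function name `assign_folds`, the seed, and the row count are hypothetical and are not taken from the EM flow.

```python
import random


def assign_folds(n_rows, k=5, seed=12345):
    """Return a list of fold labels (1..k), one per row, in random order.

    Labels are dealt out round-robin first so the folds are as equal in
    size as possible, then shuffled so membership is random.
    """
    rng = random.Random(seed)  # fixed seed keeps the assignment repeatable
    folds = [(i % k) + 1 for i in range(n_rows)]
    rng.shuffle(folds)
    return folds


# Hypothetical training set of 103 rows
folds = assign_folds(103)
```

The fold label is only a grouping key for the cross validation; it is not meant to be a predictor in the model.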

 

The random variable does not appear in the model as a predictor, but in the scoring code each of the 5 segments is scored differently according to which fold it is in. I don't see how to apply this to a new dataset unless I also create the random variable on my new data, which does not seem to make sense.

 

Super User
Posts: 19,768

Re: Export scoring code for Cross Validation in SAS Enterprise Miner

Posted in reply to DavidWilson

Cross validation is used to verify results. You definitely shouldn't have different models for each segment.

Shouldn't the scoring code you use be from the steps before the cross validation?

Super User
Posts: 19,768

Re: Export scoring code for Cross Validation in SAS Enterprise Miner

You should also wait for an answer from Miguel or someone else, as my EM skills have gotten really rusty 🙁

New Contributor
Posts: 4

Re: Export scoring code for Cross Validation in SAS Enterprise Miner

No problem. Many thanks for trying to help!
Super Contributor
Posts: 337

Re: Export scoring code for Cross Validation in SAS Enterprise Miner

Posted in reply to DavidWilson

Hi David,

Sorry I am late to the party.

I don't have EM handy. Sadly I spend more time in meetings than on hands-on software these days.

This is the kind of thing that I would suggest fixing directly on the score code while someone figures out the right way to do this.

 

If you have a chance, post the score code of the flow you have (the simpler the data the better), and the community and I will give you suggestions!

 

Best,

-M

New Contributor
Posts: 4

Re: Export scoring code for Cross Validation in SAS Enterprise Miner

Posted in reply to M_Maldonado

Thanks Miguel

 

I have created my own workaround: I took the generated score code and adapted it to score my whole dataset 5 times (once with each fold's model), then calculated the average of the predicted probabilities from the 5 models for each observation. If I am understanding the method correctly from my reading, this is what is required.
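As a rough illustration of the averaging step (not the actual EM score code), here is a minimal Python sketch. The five "models" are hypothetical logistic functions standing in for the fold-specific scoring logic, and the input column `x` and all coefficients are made up for the example.

```python
import math


def make_fold_model(intercept, slope):
    """Hypothetical stand-in for one fold's scoring logic (a simple logistic model)."""
    def score(row):
        return 1.0 / (1.0 + math.exp(-(intercept + slope * row["x"])))
    return score


def average_fold_scores(row, fold_models):
    """Score one observation with every fold model and average the probabilities."""
    probabilities = [model(row) for model in fold_models]
    return sum(probabilities) / len(probabilities)


# Five slightly different models, one per fold (coefficients are illustrative only)
fold_models = [make_fold_model(-1.0 + 0.1 * k, 0.5) for k in range(5)]

# New data has no fold variable; every row is scored by all five models
new_data = [{"x": 0.0}, {"x": 2.0}]
averaged = [average_fold_scores(row, fold_models) for row in new_data]
```

Because every observation passes through all five models, the result no longer depends on a fold value being present in the new data.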

 

I'll try to create an example version of what I have done with some standard data so that I can post the score code. What would you need, an XML export of the diagram?

 

In the meantime, thanks for your help.

 

Super Contributor
Posts: 337

Re: Export scoring code for Cross Validation in SAS Enterprise Miner

Posted in reply to DavidWilson

I like that workaround!

XML of the diagram or a quick screenshot, or both :)

When you have a chance, I am also very curious to know more about what you have learned about cross validation. In particular, do you feel like you get more predictive power, or is there anything else you might share?

 

Thanks!
