How to use the scoring code from gradient boosting?

Regular Contributor
Posts: 190

I ran gradient boosting in EM and want to apply the attached scoring code to a new dataset. How can I use it for scoring? It is written without a SET statement naming a dataset, and some of the variables it uses are not present in my data.


Accepted Solutions
Solution
‎04-13-2016 01:27 PM
SAS Super FREQ
Posts: 306

Re: How to use the scoring code from gradient boosting?

Posted in reply to munitech4u

Just like with all data step score code from EM:

data /* name of data set containing scores */;
  set /* data that you want to score */;
  %inc "/app/sasdata/EBI_ADVANL/EM_Projects/churn/Workspaces/EMWS1/Boost/EMPUBLISHSCORE.sas" /* or paste in the score code */;
run;
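
A filled-in version of the template above might look like the sketch below. The data set names work.new_customers and work.scored are hypothetical placeholders, and the path is simply the one quoted in the post; substitute your own. The PROC CONTENTS step is an optional extra that lists the columns of the scored table, so you can see which ones were added by the score code.

```sas
/* Hypothetical names: work.new_customers is the data to score,
   work.scored is the output table that will hold the scores. */
data work.scored;
  set work.new_customers;
  /* The included file supplies DATA step statements only, so it has
     to sit between the SET and RUN statements. */
  %include "/app/sasdata/EBI_ADVANL/EM_Projects/churn/Workspaces/EMWS1/Boost/EMPUBLISHSCORE.sas";
run;

/* List the columns of the scored table in creation order. Anything
   that was not in the input was created by the score code. */
proc contents data=work.scored varnum;
run;
```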

All Replies
Super User
Posts: 19,850

Re: How to use the scoring code from gradient boosting?

Posted in reply to munitech4u

You can't easily. How are you getting data in without a SET statement?

Regular Contributor
Posts: 190

Re: How to use the scoring code from gradient boosting?

Posted in reply to WendyCzika

But my dataset does not have some of the variables that appear in the code. I am not sure how this code was generated.

Super User
Posts: 19,850

Re: How to use the scoring code from gradient boosting?

Posted in reply to munitech4u

If you built a model that requires certain variables and you want to score with the same model, you need those variables. If you don't have them, then either remove them and rebuild your model, or change your model so it can handle missing values by including missing values for those variables in the training data.

Regular Contributor
Posts: 190

Re: How to use the scoring code from gradient boosting?

Well, my input dataset does not have some of these variables, such as those starting with "_", but they are in the scoring code. That is what I am confused about.
Super User
Posts: 19,850

Re: How to use the scoring code from gradient boosting?

Posted in reply to munitech4u

Did you build your model?

In the process did you create any new variables? SAS may have automatically named them. SAS may also be creating automatic variables required in your model, for intermediate steps.

Your input dataset needs to match the structure of your training data set. Same variables, same names, same types and same levels for categorical data.

My suggestion would be to try and see what happens.

@WendyCzika has shown the correct way to score a new dataset.
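
One way to check the "same variables, same names, same types" requirement before scoring is to compare column metadata for the two tables. This is only a sketch: work.train and work.new_data are hypothetical data set names, and it does not check the levels of categorical variables.

```sas
/* Hypothetical names: work.train (training data), work.new_data (data to score).
   Lists every training variable that is missing from the new data
   or has a different type (char vs. num) there. */
proc sql;
  select t.name,
         t.type as train_type,
         n.type as new_type
  from (select upcase(name) as name, type
        from dictionary.columns
        where libname = 'WORK' and memname = 'TRAIN') as t
       left join
       (select upcase(name) as name, type
        from dictionary.columns
        where libname = 'WORK' and memname = 'NEW_DATA') as n
       on t.name = n.name
  where n.name is null or t.type ne n.type;
quit;
```

An empty result means every training variable exists in the new data with the same type. The target variable will normally show up as missing from the new data, which is expected.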

 

Regular Contributor
Posts: 190

Re: How to use the scoring code from gradient boosting?

I applied one oversampling step, but I think that should affect only the target variable and not create anything else new.
SAS Super FREQ
Posts: 306

Re: How to use the scoring code from gradient boosting?

Posted in reply to munitech4u

I'm guessing the _ variables that you mean, if they are not input variables, are created BY the scoring code - they don't need to be in the data you are scoring, so it should be fine.

Regular Contributor
Posts: 190

Re: How to use the scoring code from gradient boosting?

Posted in reply to WendyCzika
If you look at the top of the code, you will notice:


********** LEAF 1 NODE 2467 ***************;
IF _ARB_BADF_ EQ 0 THEN DO;

Won't that throw an error?
SAS Super FREQ
Posts: 306

Re: How to use the scoring code from gradient boosting?

Posted in reply to munitech4u

It's defined above that:

_ARB_BADF_ = 0;
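
The same pattern can be seen in a small self-contained sketch. This is not the EM score code itself: the data set names and the variables x, _flag_ and score are made up for illustration. The point is only that a variable assigned inside the DATA step does not have to exist in the input data.

```sas
/* Toy input with a single variable x (hypothetical data) */
data work.demo_in;
  x = 1; output;
  x = 2; output;
run;

data work.demo_out;
  set work.demo_in;
  _flag_ = 0;                      /* created here, not read from the input */
  if x > 1 then _flag_ = 1;
  if _flag_ eq 0 then score = 0.2; /* safe: _flag_ was assigned above */
  else score = 0.8;
  drop _flag_;                     /* working variable is not kept in the output */
run;

proc print data=work.demo_out;
run;
```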

Regular Contributor
Posts: 190

Re: How to use the scoring code from gradient boosting?

Posted in reply to WendyCzika
Oh, I overlooked that. Seems like I am good to go then. Thanks!
SAS Employee
Posts: 122

Re: How to use the scoring code from gradient boosting?

Posted in reply to munitech4u
Hi,

When we build EM flows, EM actually writes out SAS code in the background behind each node (or most of them). When a model is built and scoring code is generated, it typically retains all the analytically relevant upstream code leading to the final scoring equation. For example, if a variable transformation node is involved in the flow, all the transformations are automatically retained, including all the renames, what-if... There is also a separate SAS Score Code Node that helps.

One may ask: I transformed 800 variables and derived 1000 variables, but only used 12 in the final model. Does the scoring code contain all of them? No, the scoring code only contains what survives into the final model. If you see both score code and optimized score code, make sure you pick the optimized one.

One exception: if you insert your own custom code by using a SAS Code node, it is not automatically copied over. Also, this approach may not work for some methods like random forest, but I recall gradient boosting is fine.

Over the years I have seen EM users going back to the transformation node to pick up the code behind the scenes, then study it and improve their SAS programming that way. Although conventions like the leading _ appear a bit odd, the coding there is 'best'. Enjoy.

Jason Xin
☑ This topic is solved.