SAS Data Science

WilliamMunoz

Hi,

 

Let's say I have a CAS table mycas.dataset with columns x1, x2, x3, x4, y, and _partind_, where x1-x4 are numerical predictors, y is a binary categorical target (0, 1), and _partind_ assigns each row to one partition (1 for training, 0 for validation, 2 for testing).
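For anyone who wants to reproduce the setup, here is a minimal sketch that builds a toy table with this layout. The session name, the caslib (casuser), the row count, the 60/20/20 split, and the coefficients that generate y are all hypothetical choices for illustration, not part of my real data.

```sas
/* Assumption: a CAS server is reachable and casuser is writable. */
cas mysess;
libname mycas cas caslib=casuser;

data mycas.dataset;
   call streaminit(123);                 /* reproducible random numbers */
   do i = 1 to 1000;
      x1 = rand('normal');
      x2 = rand('normal');
      x3 = rand('normal');
      x4 = rand('normal');               /* pure noise predictor */
      /* hypothetical signal: y depends on x1-x3 through a logistic link */
      y = rand('bernoulli', logistic(0.8*x1 - 0.5*x2 + 0.3*x3));
      /* roughly 60% training (1), 20% validation (0), 20% testing (2) */
      u = rand('uniform');
      if u < 0.6 then _partind_ = 1;
      else if u < 0.8 then _partind_ = 0;
      else _partind_ = 2;
      output;
   end;
   drop i u;
run;
```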

 

My goal is to autotune a set of PROC GRADBOOST hyperparameters (learningrate, maxdepth, vars_to_try) and get as much modeling information as possible (Gini, trained models, ODS tables, etc.) for each partition (_partind_).

 

After executing the following code

 

proc gradboost data=mycas.dataset seed=123;
   partition role=_partind_(train="1" validate="0" test="2");
   input x1 x2 x3 x4 / level=interval;
   target y / level=nominal;
   autotune
      historytable=mycas.historytable
      targetevent="1"
      objective=gini
      searchmethod=grid
      useparameters=custom
      tuningparameters=(
         learningrate(values=0.01 0.1 0.5 init=0.01)
         maxdepth(values=3 5 7 init=3)
      );
run;

I get a CAS table mycas.historytable with the columns evaluation, iteration, LEARNINGRATE, MAXLEVEL, GiniCoefficient, and EvalType.
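This is roughly how I inspect that table: a small sketch that copies it to the WORK library (a choice made only so that it can be sorted on the client), lists the column names and labels, and ranks the evaluations by the reported Gini. The name work.history is hypothetical.

```sas
/* Copy the tuning history from CAS to the client-side WORK library. */
data work.history;
   set mycas.historytable;
run;

/* List the column names and labels exactly as SAS reports them. */
proc contents data=work.history;
run;

/* Rank the evaluated configurations, best GiniCoefficient first. */
proc sort data=work.history;
   by descending GiniCoefficient;
run;

proc print data=work.history;
   var evaluation iteration LEARNINGRATE MAXLEVEL GiniCoefficient EvalType;
run;
```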

 

My questions are

1. Is there any documentation on how the tuned hyperparameters map to the column names in mycas.historytable? By inspecting the values I can guess that LEARNINGRATE and MAXLEVEL correspond to learningrate and maxdepth, respectively; however, I've noticed that SAS sometimes returns names and sometimes descriptions. In the ODS table ModelInfo, for example, the mapping is not that explicit.

2. Is the GiniCoefficient column above computed on the training, validation, testing, or whole dataset?

3. How can I add the columns GiniCoefficient_training, GiniCoefficient_validation, and GiniCoefficient_testing to the mycas.historytable table?

4. How can I save all the models that were trained with each configuration of hyperparameters?

5. If I add the following statements

output out=mycas.dataset_scored;
ods output some_ods_table=mycas.some_ods_table;

inside the PROC GRADBOOST step, how do I know which evaluation (the evaluation column in mycas.historytable) they refer to? Is it the one with the maximum GiniCoefficient? Can I save the OUTPUT table and the ODS output tables that correspond to each hyperparameter configuration?

6. In the EvalType column in mycas.historytable, I've seen at least two possible values: EVALUATED and LOOKED_UP. What does each of these mean?

7. Once I get the "best" set of hyperparameters (say learningrate=0.5, maxdepth=3) and I modify the code above as follows

proc gradboost data=mycas.dataset seed=123 learningrate=0.5 maxdepth=3;
   partition role=_partind_(train="1" validate="0" test="2");
   input x1 x2 x3 x4 / level=interval;
   target y / level=nominal;
run;

Will this code give me exactly the same result as the AUTOTUNE run for the configuration learningrate=0.5 maxdepth=3? How can I get the Gini coefficient from this procedure alone, without needing another procedure (for example, PROC ASSESS)?
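To make questions 3 and 7 concrete, this is the kind of workaround I would like to avoid: a rough sketch that computes a Gini value per partition from the scored table with PROC FREQ. It assumes the scored table was written with output out=mycas.dataset_scored copyvars=(y _partind_); and that the predicted probability of the event is stored in a column named P_y1. Both are assumptions on my side, and work.scored and work.gini_by_partition are hypothetical names. It relies on Somers' D (column given row), which for a binary target equals 2*AUC - 1; I have not confirmed that this matches the value AUTOTUNE reports.

```sas
/* Bring the scored rows to WORK so they can be sorted for BY processing. */
data work.scored;
   set mycas.dataset_scored(keep=y P_y1 _partind_);
run;

proc sort data=work.scored;
   by _partind_;
run;

/* Somers' D C|R with y as the row variable and the score as the column. */
proc freq data=work.scored noprint;
   by _partind_;
   tables y*P_y1 / measures;
   output out=work.gini_by_partition smdcr;
run;

/* _SMDCR_ holds the Gini-type statistic for each partition. */
proc print data=work.gini_by_partition;
   var _partind_ _smdcr_;
run;
```

Repeating this for every hyperparameter configuration would mean refitting and rescoring each model by hand, which is what I hope AUTOTUNE can do for me.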

 

 
