Hi,
Let's say I have a CAS table mycas.dataset with columns x1, x2, x3, x4, y, and _partind_, where x1-x4 are numeric predictors, y is a binary categorical target (0, 1), and _partind_ assigns each row to one partition (1 for training, 0 for validation, 2 for testing).
My goal is to autotune a set of PROC GRADBOOST hyperparameters (learningrate, maxdepth, vars_to_try) and get as much modeling information as possible (gini, instantiated models, ods tables, etc.) for each partition (_partind_).
After executing the following code
proc gradboost data=mycas.dataset seed=123;
  partition role=_partind_(train="1" validate="0" test="2");
  input x1 x2 x3 x4 / level=interval;
  target y / level=nominal;
  autotune
    historytable=mycas.historytable
    targetevent="1"
    objective=gini
    searchmethod=grid
    useparameters=custom
    tuningparameters=(
      learningrate(values=0.01 0.1 0.5 init=0.01)
      maxdepth(values=3 5 7 init=3)
    );
run;
I get a cas table mycas.historytable with columns evaluation, iteration, LEARNINGRATE, MAXLEVEL, GiniCoefficient, EvalType.
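For reference, a minimal sketch of one way to inspect that table, ranked by the objective. It assumes the libref mycas is bound to the caslib that holds historytable and that a larger GiniCoefficient is better; the column name is the one listed above.

```sas
/* Sketch: list the tuning history, best objective value first.      */
/* Assumes libref mycas points at the caslib holding historytable    */
/* and that a larger GiniCoefficient means a better configuration.   */
proc sql;
  select *
  from mycas.historytable
  order by GiniCoefficient desc;
quit;
```

This only reads the history table; it does not show which partition the GiniCoefficient column was computed on.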
My questions are
1. Is there any documentation about the mapping between the selected hyperparameters and the names these will be given in the mycas.historytable table? I mean I can guess that LEARNINGRATE, MAXLEVEL map to learningrate, maxdepth respectively by inspecting the input values; however, I've noticed SAS sometimes returns names and sometimes descriptions. Consider the ods table ModelInfo; here the mapping is not that explicit.
2. Is the GiniCoefficient column above computed on the training, validation, or testing partition, or on the whole dataset?
3. How can I add a GiniCoefficient_training, GiniCoefficient_validation, GiniCoefficient_testing to the mycas.historytable table?
4. How can I save all the models that were trained with each configuration of hyperparameters?
5. If I add the following statements
output out=mycas.dataset_scored;
ods output some_ods_table=mycas.some_ods_table;
inside the PROC GRADBOOST step, how do I know which evaluation (the evaluation column in mycas.historytable) they refer to? Is it the one with the max GiniCoefficient? Can I save the output and ods output that correspond to each hyperparameter configuration?
6. In the EvalType column in mycas.historytable, I've seen at least two possible values: EVALUATED and LOOKED_UP. What does each of these mean?
7. Once I get the "best" set of hyperparameters (say learningrate=0.5, maxdepth=3) and I modify the code above as follows
proc gradboost data=mycas.dataset seed=123 learningrate=0.5 maxdepth=3;
  partition role=_partind_(train="1" validate="0" test="2");
  input x1 x2 x3 x4 / level=interval;
  target y / level=nominal;
run;
Will this code give me the exact same result as the code with AUTOTUNE with the configuration learningrate=0.5 maxdepth=3? How can I get gini from this proc alone without needing another proc (for example, proc assess)?