Hi,
I want to tune the hyperparameters of a logistic regression model using CAS capabilities. My dataset has a _partind_ column (train='1', validate='0', test='2').
I think PROC LOGSELECT is ruled out because, although it is CAS-enabled, it has no AUTOTUNE statement like PROC GRADBOOST does.
I'm currently trying the Autotune Action Set (as per the example in https://documentation.sas.com/doc/da/pgmsascdc/v_061/casactml/casactml_autotune_examples66.htm). However, the documentation is not quite clear to me.
Question 1:
Given the following code (taken from the link above):
proc cas noqueue;
   autotune.tuneLogistic /
      trainOptions={
         table={name='GETSTARTED'},
         class={{vars={'C'}}},
         model={
            depVars={{name='y'}},
            effects={
               {vars={'C', 'x1', 'x2', 'x3', 'x4', 'x5',
                      'x6', 'x7', 'x8', 'x9', 'x10'}}
            }
         },
         savestate={name="logistic_getstarted_model"}
      }
      tunerOptions={seed=12345}
      /* Tuning Parameters
         You do not need to specify any tuning parameters for the default
         tuning process. If you want to make adjustments to the default
         tuning process, uncomment the following block of code and change
         any of the tuning parameters' attributes.
      tuningParameters={
         {name="method",
          valueList={"BACKWARD", "FORWARD", "LASSO", "NONE", "STEPWISE"},
          initValue="STEPWISE", exclude=false},
         {name="slEntry", lb=0.01, ub=0.99, initValue=0.05, exclude=false},
         {name="slStay", lb=0.01, ub=0.99, initValue=0.05, exclude=false},
         {name="stopHorizon", lb=1, ub=5, initValue=3, exclude=false},
         {name="lassoRho", lb=0.1, ub=0.9, initValue=0.8, exclude=false},
         {name="lassoSteps", lb=10, ub=100, initValue=20, exclude=false}
      }
      */
   ;
   ods output TunerResults = TuneResults(keep=MisclassErr);
   ods output EvaluationHistory = EvalHistory;
   ods output IterationHistory = IterHistory;
run;
quit;
and considering the tuningParameters={{autotuneTuningParmDefinition-1},...} section of the documentation (https://documentation.sas.com/doc/da/pgmsascdc/v_061/casactml/cas-autotune-tunelogistic.htm#SAS.cas-...), which states that each autotuneTuningParmDefinition-n has a namePath="string" parameter (alias: name) that "specifies the name path of a tuning parameter. For a nested action parameter, this parameter specifies a dot-separated path that includes all its parent parameter names. For a top-level action parameter, this parameter is simply the name of the parameter."
Is the user supposed to fully specify the path of the parameter being referenced from the logistic action? If so, how does SAS Viya know that {name="method",...} in the example code above refers to the parameter hierarchy regression.logistic / selection={method="BACKWARD" | "ELASTICNET" | "FORWARD" | "LASSO" | "NONE" | "STEPWISE"}? (I assume this is the hyperparameter referenced in the example code because the options are so similar; see https://documentation.sas.com/doc/en/pgmsascdc/v_061/casactstat/cas-regression-logistic.htm#SAS.cas-...).
Shouldn't {name="method",...} be {name="selection.method",...}, given that there are other "method" parameters, such as regression.logistic / polynomial={{standardize={method="MOMENTS" | "MRANGE" | "WMOMENTS"}, ...}, ...}?
Question 2:
Given the predefined train-validate-test partition in my dataset, I think I should use userDefinedPartition=TRUE|FALSE, which "when set to True, includes a user-defined partition for training and scoring" (https://documentation.sas.com/doc/da/pgmsascdc/v_061/casactml/cas-autotune-tunelogistic.htm#SAS.cas-...). But how? I cannot find anything in the documentation on how to tell SAS which column holds my partition and which values identify each part.