Hi,

I'm working on a prediction problem where the target variable can take hundreds of values. My objective is not to predict the target variable exactly...that would be too difficult! What I'm trying to do is create a 'top 5' of the most likely targets.

My current approach is to create as many binary models as there are values the target variable can take. So for example, if the target variable can be 'a', 'b' or 'c', I would create the following 3 models:

Model 1: Predict 'a' vs 'non-a'
Model 2: Predict 'b' vs 'non-b'
Model 3: Predict 'c' vs 'non-c'

Except, I'm doing hundreds of them.

Once I have my models, I score the data using each one of them. I then rank the scores from highest to lowest, and keep the top 5. So far, so good! I got that to work fine with the code below:

    %macro m1 ();
      %local i next ;
      %let i=1;
      %do i=1 %to &clust_nb.;
        data _null_;
          set Training.ref_clust&i;
          call execute('
            proc hpforest data=Training.training_clust'||STRIP(&i)||' VARS_TO_TRY=40;
              /* my input variables go here */
              target target'||STRIP(TARGET_CODE)||'/level=binary;
              ODS output VariableImportance=VARIMP.VARIMP_CLUST'||STRIP(&i)||'_Target'||STRIP(TARGET_CODE)||';
              save FILE=''/path/Cluster'||STRIP(&i)||'_target'||STRIP(TARGET_CODE)||''';
            run;
          ');
        run;
      %end;
    %mend m1;
    %m1();

I have over 500 input variables. However, I know that for each of the models, only 10-30 are relevant (and these 10-30 relevant input variables are different for each model, which explains why I start with 500 variables).

Here's what I would like to do: for each of my hpforest models, I would like to identify the few variables that are relevant for a given target value. Instead of training my hundreds of models on 500 input variables, I would train each one of them on just its relevant variables. I would basically like to apply the SAS EM 'Variable Selection' node before running each of my models...but in my SAS Base loop above.
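To make the idea concrete, here is a minimal sketch of just the selection step: taking one variable-importance table and turning its top-ranked variables into a macro variable that a second, smaller hpforest run could use. It is not a tested replacement for the 'Variable Selection' node. The toy table `work.varimp_demo`, the column names `Variable` and `MarginOOB`, the cutoff `top_n` and the macro variable `keep_vars` are all assumptions for illustration; the real columns of the `VariableImportance` ODS table should be checked with PROC CONTENTS first.

```sas
/* Toy stand-in for one VariableImportance table saved by ODS OUTPUT.   */
/* The column names (Variable, MarginOOB) are assumptions; verify them  */
/* against the real table with PROC CONTENTS before relying on this.    */
data work.varimp_demo;
    length Variable $32;
    input Variable $ MarginOOB;
    datalines;
x1 0.0412
x2 0.0003
x3 0.0187
x4 -0.0001
x5 0.0098
;
run;

%let top_n = 3;   /* hypothetical cutoff: how many inputs to keep */

/* Rank the variables by importance, highest first */
proc sort data=work.varimp_demo out=work.varimp_sorted;
    by descending MarginOOB;
run;

/* Concatenate the names of the top &top_n variables into one macro variable */
data _null_;
    set work.varimp_sorted(obs=&top_n) end=last;
    length var_list $32767;
    retain var_list;
    var_list = catx(' ', var_list, Variable);
    if last then call symputx('keep_vars', var_list);
run;

%put NOTE: Selected inputs: &keep_vars;
```

With the toy data above, the log line lists `x1 x3 x5`. In the loop, the same step would run once per importance table, and the resulting list could then be placed in the INPUT statement of a second hpforest call for that target. That means training each model twice (once on all inputs to get importances, once on the reduced list), so it trades run time for smaller final models.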
I'm still very new to SAS and I'm having a hard time thinking of how I could do this efficiently. Does anyone have suggestions?

Thanks!