My previous tip on cross validation shows how to compare three trained models (regression, random forest, and gradient boosting) based on their 5-fold cross validation training errors in SAS Enterprise Miner. This tip is the second installment about using cross validation in SAS Enterprise Miner and builds on the diagram that is used in the first tip.
In addition to comparing models based on their 5-fold cross validation training errors, this tip also shows how to obtain a 5-fold cross validation testing error, which makes for a more complete SAS Enterprise Miner flow (shown below).
First a quick note about how k-fold cross validation training and testing errors are calculated:
The following is a step-by-step explanation of the preceding Enterprise Miner flow. You can run this process flow by using the attached XML file.
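/* Give each test observation an ID so that its five copies can be matched later */
data temptest;
    set &EM_IMPORT_TEST;
    ID = _N_;
run;

/* Stack five copies of the test set and label each copy with a fold number */
data &EM_EXPORT_TEST;
    set temptest (in=in1)
        temptest (in=in2)
        temptest (in=in3)
        temptest (in=in4)
        temptest (in=in5);
    if in1 then _fold_ = 1;
    else if in2 then _fold_ = 2;
    else if in3 then _fold_ = 3;
    else if in4 then _fold_ = 4;
    else if in5 then _fold_ = 5;
run;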
/* Sort the scored test data by ID */
proc sort data=&EM_IMPORT_TEST out=&EM_EXPORT_TEST;
    by ID;
run;

/* Pass the training data through unchanged */
data &EM_EXPORT_TRAIN;
    set &EM_IMPORT_DATA;
run;

/* Split the scored test data into one data set per fold */
data test1 test2 test3 test4 test5;
    set &EM_IMPORT_TEST;
    if _fold_ = 1 then output test1;
    if _fold_ = 2 then output test2;
    if _fold_ = 3 then output test3;
    if _fold_ = 4 then output test4;
    if _fold_ = 5 then output test5;
run;

/* Merge the five fold-specific predictions by ID and average them.
   MERGE with BY expects each input data set to be in ID order. */
data &EM_EXPORT_TEST;
    merge test1(rename=(P_BAD1 = P_BAD_1))
          test2(rename=(P_BAD1 = P_BAD_2))
          test3(rename=(P_BAD1 = P_BAD_3))
          test4(rename=(P_BAD1 = P_BAD_4))
          test5(rename=(P_BAD1 = P_BAD_5));
    by ID;
    P_BAD1 = (P_BAD_1 + P_BAD_2 + P_BAD_3 + P_BAD_4 + P_BAD_5) / 5;
run;
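The final DATA step above averages the five fold-specific predicted probabilities for each test observation. As a rough sketch, this can be written in notation that is not part of the original flow: the symbols $\hat{p}_i^{(k)}$, $\hat{p}_i$, $y_i$, and $n$ are introduced here purely for illustration. Average squared error is assumed as the error measure; the statistic that the flow actually reports may differ.

```latex
\hat{p}_i = \frac{1}{5} \sum_{k=1}^{5} \hat{p}_i^{(k)},
\qquad
\mathrm{ASE}_{\text{test}} = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{p}_i \right)^2
```

Here $\hat{p}_i^{(k)}$ corresponds to P_BAD_k for the test observation with ID $i$, $\hat{p}_i$ corresponds to the averaged P_BAD1, $y_i$ is 1 when the target event occurs and 0 otherwise, and $n$ is the number of test observations.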
Note that you can obtain the cross validated predictions of the test set by saving the exported data (&EM_EXPORT_TEST) of the SAS Code 2 node.
If you run this flow diagram or replicate this analysis for your own data, make sure that you run each Start Groups/End Groups block separately, because multiple looping actions do not work at the same time.
Thanks a lot to Ralph Abbey for his help in putting this together.