NOTE: This session is in interactive mode.
1    /*---- mldmbd03d03_Randomwoods_SMP.sas ----*/
2
3    /*------------------------------------------------------------*/
4    /*---- Prerequisites: ----*/
5    /*---- 1. SAS table converted to HDFS table using the ----*/
6    /*---- SASHDAT format. ----*/
7    /*---- 2. LASR Server started. ----*/
8    /*---- 3. SAS HDFS table loaded into LASR memory. ----*/
9    /*------------------------------------------------------------*/
10
11   /*---- Include course macro variable definition file ----*/
12   /*---- when using SAS Studio in interactive mode. When ----*/
13   /*---- using SAS Studio in non-interactive mode or SAS ----*/
14   /*---- Enterprise Guide or the SAS Windowing Environment,----*/
15   /*---- macro variables will persist, so the definition ----*/
16   /*---- file need only be run once. ----*/
17
18   /*---- BEGIN: Macro definitions ----*/
19
20   %include "D:\workshop\MLDMBD\mldmbd_Macros_SMP.sas";
======================================
Configuration
SASDataFolder=D:\workshop\MLDMBD\Data
ScoreFolder=D:\workshop\MLDMBD\output
LASRpath=C:\Temp
ServerName=SASBAP.demo.sas.com
SessionPort=10011
host=SASBAP.demo.sas.com
======================================
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds

NOTE: Libref SASLIB was successfully assigned as follows:
      Engine:        V9
      Physical Name: D:\workshop\MLDMBD\Data

                 Directory

Libref         SASLIB
Engine         V9
Physical Name  D:\workshop\MLDMBD\Data
Filename       D:\workshop\MLDMBD\Data

                    Member
#  Name             Type    File Size  Last Modified
1  PVA97NK          DATA         22MB  02/13/2013 16:02:24
2  PVA97NK_OLD      DATA         19MB  03/03/2008 15:06:36
3  P_MODEL_BANK13   DATA        448MB  01/22/2015 14:41:02
4  VALIDSTATS       DATA        128KB  09/21/2018 04:23:26
5  VS_BANK          DATA        430MB  01/22/2015 14:41:01
6  VS_BANK250K      DATA        110MB  11/03/2015 14:43:51
7  VS_BANK500K      DATA        219MB  11/03/2015 14:45:31
64
65   /*---- END: Macro definitions ----*/
66
67   libname LASRlib sasiola port=&SessionPort host="&host" tag="&TagString";
NOTE: Libref LASRLIB was successfully assigned as follows:
      Engine:        SASIOLA
      Physical Name: SAS LASR Analytic Server engine on local host, port 10011
68
69   /*---- Define macro variables for analysis ----*/
70   /*---- NOTE: While these macro variables may have ----*/
71   /*---- been created by a previous program, ----*/
72   /*---- we need to ensure that the macro ----*/
73   /*---- variables persist, and the easy ----*/
74   /*---- solution is just to redefine them. ----*/
75
76   %let rfms=rfm1 rfm2 rfm3 rfm4 rfm5 rfm6 rfm7 rfm8 rfm9 rfm10
77             rfm11 rfm12;
78   %let i_rfms=i_rfm1 i_rfm2 i_rfm3 i_rfm4 i_rfm5 i_rfm6 i_rfm7
79               i_rfm8 i_rfm9 i_rfm10 i_rfm11 i_rfm12;
80   %let logi_rfms=logi_rfm1 logi_rfm2 logi_rfm3 logi_rfm4
81                  logi_rfm5 logi_rfm6 logi_rfm7 logi_rfm8
82                  logi_rfm9 logi_rfm10 logi_rfm11 logi_rfm12;
83   %let r_demogs=demog_age demog_ho r_demog_homeval r_demog_inc
84                 demog_pr demog_genf demog_genm ;
85   %let catvars=cat_input1 cat_input2;
86
87   /*---- NOTE: RANDOMWOODS can be run on the data ----*/
88   /*---- before imputation because the algorithm ----*/
89   /*---- handles missing data. Consequently, use ----*/
90   /*---- as inputs variables that have been ----*/
91   /*---- recoded but not imputed. ----*/
92   /*---- NOTE: RANDOMWOODS default settings are not ----*/
93   /*---- recommended. They are selected to promote ----*/
94   /*---- computational performance rather than ----*/
95   /*---- prediction accuracy. In fact, using the ----*/
96   /*---- defaults may invalidate some of Leo ----*/
97   /*---- Breiman's claims about random forests, ----*/
98   /*---- for example, that random forests cannot ----*/
99   /*---- overfit the data. ----*/
100
NOTE: PROCEDURE DATASETS used (Total process time):
      real time           0.86 seconds
      cpu time            0.11 seconds

101  proc imstat;
102     table LASRlib.p_model_bank13(tag="&TagString");
NOTE: The table LASRLIB.P_MODEL_BANK13 does not exist in the SAS LASR Analytic Server on host 'localhost', port 10011.
ERROR: File LASRLIB.P_MODEL_BANK13.DATA does not exist.
103     where _PartInd_=1;
WARNING: No data sets qualify for WHERE processing.
104     randomwoods
105        /*---- Define target and inputs and input measurement levels ----*/
106        b_tgt / nominal=(&catvars) input=(&rfms &r_demogs &catvars)
ERROR: No data set open to look up variables.
ERROR: No data set open to look up variables.
ERROR: No data set open to look up variables.
ERROR: No data set open to look up variables.
ERROR: No data set open to look up variables.
ERROR: No data set open to look up variables.
ERROR: No data set open to look up variables.
ERROR: No data set open to look up variables.
ERROR: No data set open to look up variables.
ERROR: No data set open to look up variables.
ERROR: No data set open to look up variables.
ERROR: No data set open to look up variables.
ERROR: No data set open to look up variables.
ERROR: No data set open to look up variables.
ERROR: No data set open to look up variables.
ERROR: No data set open to look up variables.
ERROR: No data set open to look up variables.
ERROR: No data set open to look up variables.
ERROR: No data set open to look up variables.
ERROR: No data set open to look up variables.
ERROR: No data set open to look up variables.
ERROR: No data set open to look up variables.
ERROR: No data set open to look up variables.
ERROR: No data set open to look up variables.
107
108        /*---- Forest options ----*/
109        assess
110        event=("1")
111        ntree=51 /*---- Number of trees in the forest. ----*/
112                 /*---- Default=1, more a savanna than a forest. ----*/
113                 /*---- Bad default; bad, bad default, no ----*/
114                 /*---- biscuit for you! ----*/
115        seed=12345
116        bootstrap=0.632121 /*---- = default=1-exp(-1) ----*/
117
118        /*---- Individual tree options ----*/
119        /*---- Gain and greedy control algorithm, ----*/
120        /*---- default is information gain ratio ----*/
121
122        leafsize=100 /*---- DECISIONTREE Default=5 ----*/
123        maxbranch=2
124        maxlevel=6
125        m=5 /*---- Number of inputs to use for each tree. A ----*/
126            /*---- smaller number may be used if one or more ----*/
127            /*---- of the M chosen variables does not ----*/
128            /*---- exhibit information gain with respect to ----*/
129            /*---- the target variable. ----*/
130            /*---- Default m=ceil(sqrt(Number Inputs)) ----*/
131        nbins=10 /*---- Number of bins for binning numeric ----*/
132                 /*---- inputs. Default=2 (Bad default!?!) ----*/
133        /*---- These are classification trees. There are ----*/
134        /*---- additional options for regression trees, ----*/
135        /*---- which are trees for numeric targets. ----*/
136
137        /*---- Results ----*/
138        treeinfo
139        save=RW1
140        /*---- scoredata=LASRlib.p_model_bank13 ----*/
141        code=(filename="&ScoreFolder/RWScoreCode.sas" replace)
142        temptable;
143  run;
144
145  quit;
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE IMSTAT used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds

164
165  /*---- Explore scored data ----*/
166  proc imstat;
167     table LASRlib.ScoredbyRW;
NOTE: The table LASRLIB.SCOREDBYRW does not exist in the SAS LASR Analytic Server on host 'localhost', port 10011.
ERROR: File LASRLIB.SCOREDBYRW.DATA does not exist.
168     columninfo;
169  run;
NOTE: Statements are ignored in this RUN block because of parsing errors.
170     fetch b_tgt RF_b_tgt _vote_/ from=1 to=5;
ERROR: No data set open to look up variables.
ERROR: No data set open to look up variables.
ERROR: No data set open to look up variables.
171  run;
172     frequency b_tgt: RF_b_tgt;
ERROR: No data set open to look up variables.
ERROR: No data set open to look up variables.
173  run;
174     crosstab b_tgt*RF_b_tgt;
ERROR: No data set open to look up variables.
ERROR: No data set open to look up variables.
175  run;
176     crosstab RF_b_tgt*_vote_;
ERROR: No data set open to look up variables.
ERROR: No data set open to look up variables.
177  run;
178     crosstab b_tgt*_vote_;
ERROR: No data set open to look up variables.
ERROR: No data set open to look up variables.
179  run;
180
181  quit;
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE IMSTAT used (Total process time):
      real time           0.00 seconds
      cpu time            0.01 seconds

182
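
The comments on statements 116 and 130 give two formulas for the RANDOMWOODS defaults: bootstrap=1-exp(-1) and m=ceil(sqrt(Number Inputs)). The DATA step below is a minimal sketch for checking that arithmetic. It is not part of the original program, and the variable names n_inputs, default_m, and bootstrap are hypothetical. The input count assumes the list on statement 106 expands to the 12 names in &rfms, the 7 names in &r_demogs, and the 2 names in &catvars.

```sas
/* Hypothetical sanity check; not part of the original program. */
data _null_;
   /* Assumed input count: 12 (&rfms) + 7 (&r_demogs) + 2 (&catvars) */
   n_inputs  = 12 + 7 + 2;
   /* Default m as described in the comment on statement 130 */
   default_m = ceil(sqrt(n_inputs));
   /* Default bootstrap fraction as described on statement 116 */
   bootstrap = 1 - exp(-1);
   /* Write the three values to the log */
   put n_inputs= default_m= bootstrap= 8.6;
run;
```

Under those assumptions, ceil(sqrt(21)) is 5 and 1-exp(-1) rounds to 0.632121, so the m=5 and bootstrap=0.632121 settings in the program appear to restate the documented defaults rather than change them.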