@SteveDenham I ran my code on 3 variables over 250k random obs: PROC IMPORT OUT= vars DATAFILE= './summary.xls'
DBMS=xls REPLACE;
GETNAMES=YES;
RUN;
/* Use PROC SQL to create a macro variable (MissingVarList) that contains
the list of variables that have a property such as missing values */
RSUBMIT;
proc sql noprint;
select Variable,NLevels into :VarList separated by ' ', :nlevel
from vars
where NLevels = 3;
quit;
%put &=VarList;
/* 1. transpose from wide (Y, X1 ,...,X10000) to long (varNum VarName Y Value) */
data Long;
set proj.original; /* <== specify data set name HERE */
array x [*] &VarList; /* <== specify explanatory variables HERE */
do varNum = 1 to 3;
VarName = vname(x[varNum]); /* variable name in char var */
Value = x[varNum]; /* value for each variable for each obs */
output;
end;
drop &VarList;
run;
/* 2. Sort by BY-group variable */
proc sort data=Long; by VarName; run;
ods graphics off;
ods exclude all;
proc glm data=Long outest=PE;
CLASS Value;
by VarName;
model target = Value / RSQUARE;
quit;
ods exclude none;
/* Look at the results */
proc print data=PE(obs=5 keep=VarName Intercept Value _RSQ_);
run;quit; here's my log file after transpose (summary importing is working correctly): 31 /* 1. transpose from wide (Y, X1 ,...,X10000) to long (varNum VarName Y Value) */ 54 data Long; 55 set proj.original; /* <== specify data set name HERE */ 56 array x [*] &VarList; /* <== specify explanatory variables HERE */ 57 do varNum = 1 to 3; 58 VarName = vname(x[varNum]); /* variable name in char var */ 59 Value = x[varNum]; /* value for each variable for each obs */ 60 output; 61 end; 65 drop &VarList; 66 run; NOTE: There were 250000 observations read from the data set PROJ.ORIGINAL. NOTE: The data set WORK.LONG has 750000 observations and 10027 variables. NOTE: DATA statement used (Total process time): real time 17.61 seconds cpu time 17.61 seconds 68 69 /* 2. Sort by BY-group variable */ 70 proc sort data=Long; by VarName; run; NOTE: There were 750000 observations read from the data set WORK.LONG. NOTE: The data set WORK.LONG has 750000 observations and 10027 variables. NOTE: PROCEDURE SORT used (Total process time): real time 38.00 seconds cpu time 59.61 seconds 71 72 73 ods graphics off; 74 ods exclude all; 75 proc glm data=Long outest=PE; ------ 22 76 ERROR 22-322: Syntax error, expecting one of the following: ;, (, ALPHA, DATA, MANOVA, MULTIPASS, NAMELEN, NOPRINT, ORDER, OUTSTAT, PLOTS. ERROR 76-322: Syntax error, statement will be ignored. 76 CLASS Value; 77 by VarName; 78 model target = Value / RSQUARE; ------- 22 202 ERROR: No data set open to look up variables. NOTE: The previous statement has been deleted. ERROR 22-322: Syntax error, expecting one of the following: ;, ALIASING, ALPHA, CLI, CLM, CLPARM, COVBYCLASS, E, E1, E2, E3, E4, EST, I, INTERCEPT, INVERSE, NOINT, NOUNI, P, PREDICTED, SINGULAR, SOLUTION, SS1, SS2, SS3, SS4, TOLERANCE, X, XPX, ZETA. ERROR 202-322: The option or parameter is not recognized and will be ignored. 79 quit; NOTE: PROCEDURE GLM used (Total process time): real time 0.00 seconds cpu time 0.00 seconds 80 ods exclude none; 81 91 92 /* Look at the results */ 93 proc print data=PE(obs=5 keep=VarName Intercept Value _RSQ_); ERROR: File WORK.PE.DATA does not exist. 94 run; NOTE: The SAS System stopped processing this step because of errors. NOTE: PROCEDURE PRINT used (Total process time): real time 0.00 seconds cpu time 0.00 seconds 94 ! quit; ID TARGET VAR1 ... VAR10000 VarNum VarName Value 1 yyy xx xx 1 Var1 var1_level1 2 yyy xx xx 2 Var2 var2_level1 3 yyy xx xx 3 Var3 var3_level1 4 yyy xx xx 1 Var1 var1_level2 5 yyy xx xx 2 Var2 var2_level2 6 yyy xx xx 3 Var3 var3_level1 7 yyy xx xx 1 Var1 var1_level1 8 yyy xx xx 2 Var2 var2_level2 9 yyy xx xx 3 Var3 var3_level2
... View more