Hi guys,
My task is very simple. I'm using PROC LOGISTIC with the CTABLE and PPROB= options. I need to use this inside a macro and vary the value of PPROB= from 0.1 to 1. Outside the macro, the procedure works perfectly if I pick a value for i, say 5. Inside the macro, however, the following error is displayed: ERROR: (execution) Matrix has not been set to a value.
I've attached code (in SAS Studio) and data. The part that does not run is the very last (the logistic_macro).
%macro logistic_macro;
%do i = 1 %to 10;
proc logistic data=WORK.GLMDesign outmodel=final_model noprint;
class Col3 Col4 Col7 Col8 Col9 Col13 Col22 Col23 Col27 Col28 / param=glm;
model VAR21(event='1')= Col3 Col4 Col7 Col8 Col9 Col13 Col22 Col23 Col27 Col28
Col2 / link=logit technique=fisher ctable pprob= %SYSEVALF(&i./10) ;
output out=preds predprobs=individual;
score data=testset out=TestPred_new;
run;
proc freq data=TestPred_new;
table VAR21*I_VAR21 / out=CellCounts_new noprint;
run;
proc iml;
loss= j(10, 1, 0); /*vector to store loss for different thresholds */
use work.CellCounts_new;
read all var _ALL_ into M;
close work.CellCounts_new;
FalsePos = M[2,2] # 5;
FalseNeg = M[3,2];
loss[i] = FalsePos + FalseNeg;
print loss;
%end;
%mend;
%logistic_macro; *runs the macro;
Any ideas?
I appreciate your help and comments!
Should it be:
loss[&i] = FalsePos + FalseNeg;
As written, IML looks for a matrix named i, which does not exist.
In your code, you are computing a scalar for each value of &i. The LOSS vector will be all zeros except for the &i-th element. You might as well define LOSS as a scalar, since the vector is created and destroyed within each PROC IML call and does not persist across invocations.
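A minimal sketch of the scalar idea, assuming it is called once per iteration right after PROC FREQ has written CellCounts_new. The macro name store_loss and the data sets loss_i and loss_all are hypothetical names chosen for illustration, and the indexing assumes all four cells of the 2x2 table are present in the usual PROC FREQ order:

```sas
/* Hypothetical helper: compute one scalar loss and append it to a
   data set, so the result survives after PROC IML exits. */
%macro store_loss(i);
proc iml;
   use work.CellCounts_new;
   read all var {COUNT} into cnt;     /* cell counts from PROC FREQ */
   close work.CellCounts_new;
   threshold = &i / 10;               /* cutoff used in this iteration */
   loss = 5*cnt[2] + cnt[3];          /* assumes row 2 = false pos, row 3 = false neg */
   create work.loss_i var {threshold loss};
   append;
   close work.loss_i;
quit;

/* Accumulate the one-row result; loss_all is created on the first call. */
proc append base=work.loss_all data=work.loss_i;
run;
%mend store_loss;
```

Calling %store_loss(&i) at the end of each pass through the %DO loop would leave one row per threshold in work.loss_all. Delete work.loss_all before rerunning, otherwise the rows from the previous run stay in it.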
Rick,
It's me again haha!
I'd like to ask for your help. I've tried a lot of different approaches to declare a vector outside the %DO loop, then run PROC LOGISTIC and another PROC IML step that fills that vector, but I always get the same error. Do you know how I can do that in this context?
Here's what I've tried so far:
### Approach 1 - with PROC IML ###
%macro logistic_macro;
proc iml;
loss= j(10, 1, 0); /*vector to store loss for different thresholds */
%do i = 1 %to 10;
proc logistic data=WORK.GLMDesign outmodel=final_model noprint;
class Col3 Col4 Col7 Col8 Col9 Col13 Col22 Col23 Col27 Col28 / param=glm;
model VAR21(event='1')= Col3 Col4 Col7 Col8 Col9 Col13 Col22 Col23 Col27 Col28
Col2 / link=logit technique=fisher ctable pprob= %SYSEVALF(&i./10) ;
output out=preds predprobs=individual;
score data=testset out=TestPred_new;
run;
proc freq data=TestPred_new;
table VAR21*I_VAR21 / out=CellCounts_new noprint;
run;
proc iml;
use work.CellCounts_new;
read all var _ALL_ into M;
close work.CellCounts_new;
FalsePos = M[2,2] # 5;
FalseNeg = M[3,2];
loss[&i] = FalsePos + FalseNeg;
print loss;
%end;
%mend;
%logistic_macro; *runs the macro;
### Approach 2 - creating an array without PROC IML ###
%macro logistic_macro;
%do i = 1 %to 10;
proc logistic data=WORK.GLMDesign outmodel=final_model noprint;
class Col3 Col4 Col7 Col8 Col9 Col13 Col22 Col23 Col27 Col28 / param=glm;
model VAR21(event='1')= Col3 Col4 Col7 Col8 Col9 Col13 Col22 Col23 Col27 Col28
Col2 / link=logit technique=fisher ctable pprob= %SYSEVALF(&i./10) ;
output out=preds predprobs=individual;
score data=testset out=TestPred_new;
run;
proc freq data=TestPred_new;
table VAR21*I_VAR21 / out=CellCounts_new noprint;
run;
data FalsePos FalseNeg;
set CellCounts_new;
keep COUNT ;
if _N_ =2 then output FalsePos;
if _N_ =3 then output FalseNeg;
loss = FalsePos*5 + FalseNeg;
run;
data loss_vec;
set loss_vec loss;
call symputx(cat('Loss',%SYSEVALF(&i)),loss);
run;
%end;
%mend;
data loss_vec;
input Loss1-Loss10;
datalines; *allows to create data as belows;
0 0 0 0 0 0 0 0 0 0
; *must end with isolated semicolon;
%logistic_macro; *runs the macro;
Neither of them works (and I still need to understand why assigning different values to PPROB= in PROC LOGISTIC does not change my results, but let's deal with that afterwards).
I'd really appreciate your help here, I've been stuck for hours =/
Approach #1 fails because the first PROC LOGISTIC step terminates the PROC IML session, and the matrix 'loss' is destroyed along with it.
There are three possible ways I can think of that will let you mix the PROCs with IML.
(1) Use a macro loop around LOGISTIC and FREQ that generates a series of output data sets called CellCounts_new1, CellCounts_new2, .. , CellCounts_new10. Then read this series of data sets using a loop in IML and put together the vector of losses.
(2) Investigate the SUBMIT / ENDSUBMIT block, which you can use to run LOGISTIC and FREQ from within IML. This is probably the best solution, since there would be a single IML loop that submits each block of code and reads the output data set from PROC FREQ as you have already been doing.
(3) Get rid of the macro loop and use BY-group processing. The BY group approach will create a single data set that contains the results for all 10 logistic models. You can then process the results by using a single call to PROC IML.
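A rough sketch of option (2), assuming the same data sets and variables as in the original post. Parameters listed on the SUBMIT statement are substituted for the matching &name references inside the block. As far as I know, SUBMIT blocks cannot be used inside a macro, so the macro wrapper is dropped here. The cell indexing assumes all four cells of the 2x2 table are present:

```sas
proc iml;
loss = j(10, 1, .);                  /* persists: IML stays active for the whole loop */
do i = 1 to 10;
   cutoff = i / 10;
   submit cutoff;                    /* &cutoff below is replaced by the IML value */
      proc logistic data=WORK.GLMDesign outmodel=final_model noprint;
         class Col3 Col4 Col7 Col8 Col9 Col13 Col22 Col23 Col27 Col28 / param=glm;
         model VAR21(event='1') = Col3 Col4 Col7 Col8 Col9 Col13 Col22 Col23 Col27 Col28
               Col2 / link=logit technique=fisher ctable pprob=&cutoff;
         score data=testset out=TestPred_new;
      run;
      proc freq data=TestPred_new;
         table VAR21*I_VAR21 / out=CellCounts_new noprint;
      run;
   endsubmit;
   use work.CellCounts_new;
   read all var {COUNT} into cnt;    /* cell counts in PROC FREQ order */
   close work.CellCounts_new;
   loss[i] = 5*cnt[2] + cnt[3];      /* assumes row 2 = false pos, row 3 = false neg */
end;
print loss;
quit;
```

This keeps the poster's LOGISTIC and FREQ steps essentially unchanged. Whether I_VAR21 actually responds to PPROB= is a separate question.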
Hi Rick,
Thanks a lot! I decided to follow your third approach (BY-group processing) because it looked faster.
I was able to create the stacked data set with a new index variable i. However, I wasn't able to use the value of i inside my PROC LOGISTIC step with the option pprob= %SYSEVALF(i./10).
It returns "ERROR: A character operand was found in the %EVAL function or %IF condition where a numeric operand is required. The condition was: i/10"
I also tried without %SYSEVALF, with no success.
I guess it's just a simple fix, but I could not find it anywhere =/
Here's my code now:
%let N = 10;
data GLMdesign_by;
set GLMDesign;
do i = 1 to &N;
output;
end;
run;
proc sort data=GLMdesign_by; by i; run;
/* Call PROC LOGISTIC and use BY statement to compute all models */
proc logistic data=GLMdesign_by outmodel=final_model noprint;
by i;
class Col3 Col4 Col7 Col8 Col9 Col13 Col22 Col23 Col27 Col28 / param=glm;
model VAR21(event='1')= Col3 Col4 Col7 Col8 Col9 Col13 Col22 Col23 Col27 Col28
Col2 / link=logit technique=fisher ctable pprob= %SYSEVALF(i./10) ;
output out=preds predprobs=individual;
score data=testset out=TestPred_new;
run;
I'm not sure what you are trying to accomplish, but in your syntax 'i' is the name of a data set variable, not a macro variable, so %SYSEVALF cannot evaluate it. The PPROB= option requires a value or a list of values, such as PPROB=(0.1 to 0.9 by 0.1).
There is no support for data-dependent cutpoints.
If you need the cutpoints to depend on the value of the BY group, all tables and output data sets contain the BY-group variable, so you can readily use that value to process the output.
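One way to make the cutpoint depend on an index variable is to apply it after scoring rather than inside PROC LOGISTIC. The sketch below is an assumption-laden illustration, not a tested solution: it assumes the model has been fit once and scored to TestPred_new, that VAR21 is numeric 0/1, and that the SCORE output contains the predicted event probability in a variable named P_1. The data set names scored_by and loss_by_cutoff are hypothetical:

```sas
/* Replicate each scored observation once per cutoff and classify it. */
data work.scored_by;
   set work.TestPred_new;                      /* output of the SCORE statement */
   do i = 1 to 10;
      cutoff    = i / 10;
      predicted = (P_1 >= cutoff);             /* 1 if classified as an event */
      FalsePos  = (VAR21 = 0 and predicted = 1);
      FalseNeg  = (VAR21 = 1 and predicted = 0);
      output;
   end;
run;

/* One row per cutoff: false positives weighted 5, false negatives weighted 1. */
proc sql;
   create table work.loss_by_cutoff as
   select i, cutoff, 5*sum(FalsePos) + sum(FalseNeg) as loss
   from work.scored_by
   group by i, cutoff;
quit;
```

This may also bear on the earlier observation that changing PPROB= had no visible effect. To my understanding, PPROB= only controls the CTABLE classification table, while the I_ variable written by SCORE is assigned from the largest predicted probability. Treat that as something to verify against the PROC LOGISTIC documentation.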