Hi SAS/IML users:
I have a very simply but weird question:
I simulate some dataset, and fit it to a linear mixed model using "MIXED".
I use the parameter estimates to calculate some vectors, and my final results are two column vectors, which are obtained using "probnorm" function in IML.
Then I select some of the values from each vector, based on certain criteria.
I noticed that some values in the final vectors "edg_t" and "noedg_t" equal to 1 (or 0 in some cases), which is caused by extremely large (or small in some cases) values when fitted to "probnorm" function. So I want to exclude those rows whose values are 1 (or 0) by using "loc" function. In this example, there is no 0, so the purpose is to choose rows with value of 1 only.
Somehow, this does not work.
It either only selects 1 row, when there are multiple 1's in the vector (when using other seeds values), or it just reports error message, saying the matrix is not set to a value.
But when I copy the vector that contains 1, as shown in the code and use "loc" again, "loc" function works.
Could anyone help me with this weird situation, please? why it does not work here for "noedg_t"?
You can change seed values such that there are multiple 1's there, but "loc" does not work.
proc iml;
CALL RANDSEED (10);
n=6; SampSize=100; target = 7; nfe = 4; nre = 3;
/*TIMEPOINT = 1:4;*/
TIMEPOINT = 4;
/*binary x_i*/
x_EDG = randfun(SampSize, "Bernoulli", 0.5);
print x_EDG;
EDG = colvec(repeat(x_EDG,1,n));
print EDG;
/*fixed effect*/
beta = {21.4,1.92,-3.97,0.350};
time_i = {0,0,1,2,3,4};
timesqr_i = time_i##2;
print timesqr_i;
time = shape(repeat(time_i,SampSize),SampSize*n,1);
timesqr = shape(repeat(timesqr_i,SampSize),SampSize*n,1) ;
intcp = j(SampSize*n,1,1);
X = intcp || EDG || time || timesqr;
X_design = X;
/*print X;*/
Zi = repeat(1,n)||time_i||timesqr_i;
/*Zi = repeat(1,n)||time_i; */
Z = I(SampSize)@Zi;
Z_design = Z;
/*print Z;*/
/*error term*/
vc = 10*I(n);
/*print vc;*/
Mean= repeat(0,n);
eps = RandNormal(SampSize, Mean, vc);
eps = shape(eps,SampSize*n,1);
/*print eps;*/
mu = {0 0 0};
cov = {10.4 0.279 -0.341,0.279 13.06 -2.466, -0.341 -2.466 0.581};
/* rnd = RANDNORMAL(SampSize, mu, cov);*/
rnd = RandMVT(SampSize, 3, mu, cov );
rndA = shape(rnd, SampSize*3,1);
Y = X*beta+Z*rndA+eps;
/*print Y;*/
/*Create ID vectors*/
ID = colvec(repeat(T(1:SampSize),1,n)) ;
/*print ID;*/
dataset = ID||Y||EDG||time||timesqr;
print dataset;
create MyData from dataset [colname={"id", "Y", "EDG", "time"}];
append from dataset;
close MyData;
submit MyData;
ods listing close;
ods output SolutionF = fixed SolutionR = rnds CovParms=CovP;
proc mixed data=MyData method=ml COVTEST;
class id;
model Y = EDG time time*time / solution cl covb;
random intercept time time*time / SUB=id TYPE=un SOLUTION G GCORR V;
run;
ods listing;
endsubmit;
use rnds;
read all into rnds[colname=varNames];
intercept_idx= do(1,SampSize*nre,3);
/*print intercept_idx;*/
time_idx = do(2,SampSize*nre,3);
/*print time_idx;*/
timesqr_idx = do(3,SampSize*nre,3);
/*print timesqr_idx;*/
EB_intercept= rnds[intercept_idx,"Estimate"];
EB_time = rnds[time_idx,"Estimate"];
EB_timesqr = rnds[timesqr_idx,"Estimate"];
print EB_intercept,EB_time,EB_timesqr;
use Fixed;
read all var {Estimate} into EB_FIXED;
use CovP;
read all var {Estimate} into CovPars;
resid_var = CovPars[nrow(CovPars)];
EB_RND = EB_intercept||EB_time||EB_timesqr;
shape_EB_RND = shape(EB_RND,SampSize*nre,1);
mu = X_design*EB_FIXED + Z_design*shape_EB_RND;
print mu;
Benefit = probnorm( (target - (X_design*EB_FIXED + Z_design*shape_EB_RND)) / sqrt(resid_var) ) ;
Benefit_all = id || EDG || time || Benefit;
print Benefit_all;
/**********************************************************************************************************************/
/**********************************GROUP THE BENEFITS by x_i and at TIMEPOINT******************************************/
edg_rows_t = loc((Benefit_all[,2]=1) & (Benefit_all[,3]=TIMEPOINT));
noedg_rows_t = loc((Benefit_all[,2]=0) & (Benefit_all[,3]=TIMEPOINT));
/*obtain EB benefits at TIMEPOINT*/
edg_t = Benefit_all[edg_rows_t,ncol(Benefit_all)];
noedg_t = Benefit_all[noedg_rows_t,ncol(Benefit_all)];
print edg_t;
print noedg_t;
/**********************************************************************************************************************/
/*loc failed to work here*/
edg_ext = loc((edg_t=0)|(edg_t=1));
noedg_ext = loc((noedg_t=0)|(noedg_t=1));
print edg_ext;
print noedg_ext;
/*this vectors is a section of "noedg_t" that contains a row whoe value is 1, loc works here, but not in the above section.*/
a = {0.8975011
0.2758047
9.106E-17
2.268E-8
0.0010268
1
0.7845039
0.0003516
};
b=loc(a=1); print b;
I really appreciate your help.
Even though a value prints as 1, it might be different than 1. For example, it might be 0.99999999999999.
If x contains the values you are trying to filter, use
smallCutoff = 1e-10;
LargeCutoff = 1 - smallCutoff;
idx = = loc((edg_t < smallCutoff ) | (edg_t > LargeCutoff));
Even though a value prints as 1, it might be different than 1. For example, it might be 0.99999999999999.
If x contains the values you are trying to filter, use
smallCutoff = 1e-10;
LargeCutoff = 1 - smallCutoff;
idx = = loc((edg_t < smallCutoff ) | (edg_t > LargeCutoff));
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how to run multiple linear regression models with and without interactions, presented by SAS user Alex Chaplin.
Find more tutorials on the SAS Users YouTube channel.