BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
anthonywang
Obsidian | Level 7

Hi SAS/IML users:

 

I have a very simply but weird question:

I simulate some dataset, and fit it to a linear mixed model using "MIXED".

I use the parameter estimates to calculate some vectors, and my final results are two column vectors, which are obtained using "probnorm" function in IML.

Then I select some of the values from each vector, based on certain criteria.

I noticed that some values in the final vectors "edg_t" and "noedg_t" equal to 1 (or 0 in some cases), which is caused by extremely large (or small in some cases) values when fitted to "probnorm" function. So I want to exclude those rows whose values are 1 (or 0) by using "loc" function. In this example, there is no 0, so the purpose is to choose rows with value of 1 only.

Somehow, this does not work.

It either only selects 1 row, when there are multiple 1's in the vector (when using other seeds values), or it just reports error message, saying the matrix is not set to a value.

But when I copy the vector that contains 1, as shown in the code and use "loc" again, "loc" function works.

 

Could anyone help me with this weird situation, please? why it does not work here for "noedg_t"?

 

You can change seed values such that there are multiple 1's there, but "loc" does not work.

proc iml;

CALL RANDSEED (10);

n=6;  SampSize=100;  target = 7;  nfe = 4;  nre = 3;

/*TIMEPOINT = 1:4;*/
TIMEPOINT = 4; 



/*binary x_i*/
x_EDG = randfun(SampSize, "Bernoulli", 0.5); 
print x_EDG;

EDG = colvec(repeat(x_EDG,1,n));
print EDG;



/*fixed effect*/
beta = {21.4,1.92,-3.97,0.350}; 

time_i = {0,0,1,2,3,4};
timesqr_i = time_i##2; 
print timesqr_i;

time = shape(repeat(time_i,SampSize),SampSize*n,1);
timesqr = shape(repeat(timesqr_i,SampSize),SampSize*n,1) ;

intcp = j(SampSize*n,1,1);  
X = intcp || EDG || time || timesqr; 
X_design = X;
/*print X;*/

Zi = repeat(1,n)||time_i||timesqr_i;
/*Zi = repeat(1,n)||time_i; */
Z = I(SampSize)@Zi;
Z_design = Z;

/*print Z;*/




	/*error term*/
	vc = 10*I(n);
	/*print vc;*/

	Mean= repeat(0,n);
	eps = RandNormal(SampSize, Mean, vc);
	eps = shape(eps,SampSize*n,1);
	/*print eps;*/


	

	mu = {0  0  0};
	cov = {10.4  0.279  -0.341,0.279  13.06  -2.466, -0.341  -2.466  0.581};  
/*	rnd = RANDNORMAL(SampSize, mu, cov);*/
	rnd = RandMVT(SampSize, 3, mu, cov );

	rndA = shape(rnd, SampSize*3,1);


	Y = X*beta+Z*rndA+eps;
	/*print Y;*/


/*Create ID vectors*/
ID = colvec(repeat(T(1:SampSize),1,n)) ;
/*print ID;*/


dataset = ID||Y||EDG||time||timesqr;
print dataset;



create MyData from dataset [colname={"id", "Y", "EDG", "time"}];
append from dataset;
close MyData;


submit MyData;
ods listing close;
ods output SolutionF = fixed  SolutionR = rnds  CovParms=CovP;
proc mixed data=MyData method=ml COVTEST;
	  class id;
      model Y = EDG time time*time / solution cl covb;
	  random intercept time time*time / SUB=id TYPE=un SOLUTION G GCORR V;
run;
ods listing;
endsubmit;





use rnds;
    read all into rnds[colname=varNames];


intercept_idx= do(1,SampSize*nre,3); 
/*print intercept_idx;*/
time_idx     = do(2,SampSize*nre,3); 
/*print time_idx;*/
timesqr_idx  = do(3,SampSize*nre,3); 
/*print timesqr_idx;*/

EB_intercept= rnds[intercept_idx,"Estimate"];
EB_time     = rnds[time_idx,"Estimate"];
EB_timesqr  = rnds[timesqr_idx,"Estimate"];

print EB_intercept,EB_time,EB_timesqr;

use Fixed; 
	read all var {Estimate} into EB_FIXED;

use CovP;  
	read all var {Estimate} into CovPars;
 


resid_var = CovPars[nrow(CovPars)];


EB_RND = EB_intercept||EB_time||EB_timesqr;

shape_EB_RND = shape(EB_RND,SampSize*nre,1);

mu = X_design*EB_FIXED + Z_design*shape_EB_RND;

print mu;

Benefit = probnorm( (target - (X_design*EB_FIXED + Z_design*shape_EB_RND)) / sqrt(resid_var) ) ; 

Benefit_all = id || EDG || time || Benefit;

print Benefit_all;

/**********************************************************************************************************************/
/**********************************GROUP THE BENEFITS by x_i and at TIMEPOINT******************************************/

edg_rows_t   = loc((Benefit_all[,2]=1) & (Benefit_all[,3]=TIMEPOINT));
noedg_rows_t = loc((Benefit_all[,2]=0) & (Benefit_all[,3]=TIMEPOINT));

 
/*obtain EB benefits at TIMEPOINT*/
edg_t   = Benefit_all[edg_rows_t,ncol(Benefit_all)];
noedg_t = Benefit_all[noedg_rows_t,ncol(Benefit_all)];
print edg_t;
print noedg_t;
  
/**********************************************************************************************************************/

/*loc failed to work here*/
edg_ext = loc((edg_t=0)|(edg_t=1));
noedg_ext = loc((noedg_t=0)|(noedg_t=1));
print edg_ext;
print noedg_ext;


/*this vectors is a section of "noedg_t" that contains a row whoe value is 1, loc works here, but not in the above section.*/
a = {0.8975011 
0.2758047 
9.106E-17 
2.268E-8 
0.0010268 
1 
0.7845039 
0.0003516 
};
b=loc(a=1); print b;

I really appreciate your help.

 

1 ACCEPTED SOLUTION

Accepted Solutions
Rick_SAS
SAS Super FREQ

Even though a value prints as 1, it might be different than 1. For example, it might be 0.99999999999999.

If x contains the values you are trying to filter, use 

smallCutoff = 1e-10;

LargeCutoff = 1 - smallCutoff;

idx = = loc((edg_t < smallCutoff ) | (edg_t > LargeCutoff));

 

View solution in original post

2 REPLIES 2
Rick_SAS
SAS Super FREQ

Even though a value prints as 1, it might be different than 1. For example, it might be 0.99999999999999.

If x contains the values you are trying to filter, use 

smallCutoff = 1e-10;

LargeCutoff = 1 - smallCutoff;

idx = = loc((edg_t < smallCutoff ) | (edg_t > LargeCutoff));

 

anthonywang
Obsidian | Level 7
Thank you Rick!

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

Multiple Linear Regression in SAS

Learn how to run multiple linear regression models with and without interactions, presented by SAS user Alex Chaplin.

Find more tutorials on the SAS Users YouTube channel.

From The DO Loop
Want more? Visit our blog for more articles like these.
Discussion stats
  • 2 replies
  • 871 views
  • 0 likes
  • 2 in conversation