Emma_at_SAS
Lapis Lazuli | Level 10
Thanks Paige Miller!
Do I copy with Ctrl+C?
Where is the </> icon?
SteveDenham
Jade | Level 19

Well, that part looks like it 'works', so now how about showing us output from two runs that gave different results.  By output I mean the whole magilla, as there are often subtle differences that only become apparent as big differences when you look at the final results. If you direct the file to an ODS output pdf file, without the lsmeans, that would be helpful.  Right now, I suspect that not using the standard GLM parameterization is partially to blame.
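A minimal sketch of sending the whole PROC MIXED output to a PDF, without LSMEANS. The file name and the model itself are assumptions for illustration (the actual model has not been posted); only the variable names come from the thread.

```sas
/* Hypothetical file name; point it at a folder you can write to. */
ods pdf file="mixed_run1.pdf";

/* Assumed model, for illustration only. No LSMEANS statement. */
proc mixed data=long;
  class ID measure;
  model Var_Y_1 = Var_1 measure / solution;
  repeated measure / subject=ID type=vc;
run;

ods pdf close;
```

Repeating this with a second file name for the run that disagrees would give two complete outputs to compare side by side.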

 

SteveDenham

Rick_SAS
SAS Super FREQ

Where are lines 2-60 in the log? That's presumably where the Long data set is created.

Emma_at_SAS
Lapis Lazuli | Level 10
Rick_SAS, my log does not have rows for 2-60! Is there something wrong with it? I did not check row numbers before...
Rick_SAS
SAS Super FREQ

I have some pressing matters to attend to, so others will have to help you resolve your problems. Good luck.

PGStats
Opal | Level 21

We need to see how dataset LONG was created. It might be the result of an imputation step, i.e. a procedure where missing values are replaced with random values. That can be a perfectly legitimate way to deal with missing values, but it must be accounted for when interpreting the results.

PG
Emma_at_SAS
Lapis Lazuli | Level 10
Good point, PGStats. I did not think of that. This is the code to create the long data. I always use PROC TRANSPOSE but for this project, I received the code as follows and I am not sure how that works:

DATA long;
  SET wide;

  measure = 1;       *this is the first observation from the repeated measures;
  Var_1 = Var1;      *this is the variable that differs across the repeated measures;
  Var_Y_1 = VarY11;  *this is the response variable at the first measurement;
  Var_Y_2 = VarY12;  *a second response variable, used in a separate regression analysis;
  output;

  measure = 2;       *this is the second observation from the repeated measures;
  Var_1 = Var2;
  Var_Y_1 = VarY21;
  Var_Y_2 = VarY22;
  output;

  *the following makes the 3rd record, pertaining to the product seen third;
  measure = 3;
  Var_1 = Var3;
  Var_Y_1 = VarY31;
  Var_Y_2 = VarY32;
  output;
run;
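For readers unsure how this DATA step works: each OUTPUT statement writes one row, so every row of WIDE becomes three rows of LONG. A sketch of an equivalent array-based version is below. The array names (a_var, a_y1, a_y2) are made up for illustration and are assumed not to clash with existing variables in WIDE.

```sas
data long;
  set wide;

  /* Hypothetical array names grouping the three wide columns of each kind */
  array a_var[3] Var1   Var2   Var3;
  array a_y1[3]  VarY11 VarY21 VarY31;
  array a_y2[3]  VarY12 VarY22 VarY32;

  /* One output row per repeated measure */
  do measure = 1 to 3;
    Var_1   = a_var[measure];
    Var_Y_1 = a_y1[measure];
    Var_Y_2 = a_y2[measure];
    output;
  end;
run;
```

Like the original step, this keeps the wide columns on every row and involves no random functions.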

Thank you!
PGStats
Opal | Level 21

The code creating dataset LONG from dataset WIDE doesn't use random functions. So, as long as WIDE is not recreated or modified, LONG should be exactly the same.
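One way to confirm that LONG really is the same between runs is to keep a copy and compare against it later. This is only a sketch: the library name and folder path are hypothetical, and a permanent library is assumed so the copy survives between sessions.

```sas
/* Hypothetical permanent library; replace the path with a folder you own. */
libname keep "/path/to/your/folder";

/* First session: store a snapshot of LONG. */
data keep.long_snapshot;
  set long;
run;

/* Later session, after LONG has been re-created: compare with the snapshot. */
proc compare base=keep.long_snapshot compare=long;
run;
```

If PROC COMPARE reports no unequal values, the data going into the model are the same and the difference has to come from somewhere else.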

The next culprit I can think of is numerical instability. This is quite rare with SAS procedures, but it can happen for very particular data, sometimes when effects are very small over large data values. Some procedures issue a warning when the possibility of this type of numerical instability is detected.

PG
Emma_at_SAS
Lapis Lazuli | Level 10

Thank you, PGStats!

The only warning message in the log is

WARNING: ODS graphics with more than 5000 points have been suppressed. Use the PLOTS(MAXPOINTS= ) option in the PROC MIXED
statement to change or override the cutoff.
 
I cannot match it with "numerical instability". Is there any way to check for that?
Or, you may have other suggestions...
 
Thank you! 
StatsMan
SAS Super FREQ

Check the iteration history for two runs that do not produce the same results. Do the models have the same value of the convergence criterion? Did they take the same number of iterations? If not, then there could be numerical instability in your data that is causing convergence issues. There could be a flat spot in the likelihood function, which could lead to different but equivalent solutions with this model and data. Try removing effects that are either uninteresting or insignificant and see if the convergence issues clear up. I realize that all the effects in your model might be interesting 🙂

 

The number 1 issue we see with "different" model results when repeated measures are involved is not having the repeated effect on the REPEATED statement. If you have two models with these different REPEATED statements:

 

   repeated / subject=id type=whatever;

vs

   repeated repeated-effect / subject=id type=whatever;

 

then you will get different results in the two models if the data is not sorted by repeated-effect. (Also remember that the variable for the repeated-effect must be on the CLASS statement.)

 

In your pseudo-code, you seem to have the repeated-effect listed on the REPEATED statement. Make sure it is there in all runs.
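A sketch of the REPEATED statement advice above, using the variable names from the DATA step earlier in the thread. The model effects and TYPE=UN are assumptions for illustration, not the poster's actual model. Sorting first is a belt-and-braces step so that both forms of the REPEATED statement would see the rows in the same order.

```sas
/* Sort by subject and by the repeated effect. */
proc sort data=long;
  by ID measure;
run;

proc mixed data=long;
  class ID measure;                           /* repeated effect is on CLASS */
  model Var_Y_1 = Var_1 measure / solution;   /* assumed model */
  repeated measure / subject=ID type=un;      /* repeated effect is named here */
run;
```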

SteveDenham
Jade | Level 19

Adding to @StatsMan 's things to check - in the iteration history, is the objective function at the very first iteration identical for the two runs?  If not, then you are definitely fitting different data.

 

SteveDenham

Emma_at_SAS
Lapis Lazuli | Level 10

Thank you StatsMan and SteveDenham for your additional suggestions. I am running models to check these.

 

SteveDenham, I am not clear what you mean by

"is the objective function at the very first iteration identical for the two runs?". What should I check?

 

I am looking at these tables for "Iteration History". 

 

Iteration History
Iteration Evaluations -2 Res Log Like Criterion
0 1 610312.5534  
1 1 610312.5534 0

  

Thank you!

SteveDenham
Jade | Level 19

Ok.  That would be the correct thing to examine for the two runs. If the data being fit are identical, then the iteration 0 values for -2 Res Log Like should be identical.
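A sketch of one way to capture and compare the Iteration History tables instead of reading them by eye. The macro name, data set names, and model are hypothetical. In practice the two tables would come from the two runs that actually disagreed, not from two fits in the same session.

```sas
/* Hypothetical macro: fit the (assumed) model and save its iteration history. */
%macro fit(run);
  ods output IterHistory=iter_&run;
  proc mixed data=long;
    class ID measure;
    model Var_Y_1 = Var_1 measure / solution;
    repeated measure / subject=ID type=vc;
  run;
%mend fit;

%fit(run1)
%fit(run2)

/* Any difference at iteration 0 points to different input data. */
proc compare base=iter_run1 compare=iter_run2;
run;
```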

 

I find it interesting that it appears from this snippet that the model converged on the first iteration.  That could be due to a lot of things, but I find it really unusual for a large dataset and a relatively complex model.  What do the estimates of the random effects look like?  Is there a message in the output like "G matrix is not positive definite."?

 

SteveDenham

Emma_at_SAS
Lapis Lazuli | Level 10

Thank you, SteveDenham! Good point!

The model I ran so far does not contain a RANDOM effect yet.

I am planning to add a random intercept but I was stopped with the different results. 

I tried the model with a random intercept, but either the model did not converge or the run took so long that my SAS Studio remote session terminated.

random intercept/subject=ID;

I am going to buy a local SAS license this week and try the code on that. I assume that with a local SAS installation my session will not terminate, even if the code takes 24 hours to run.

 

Thank you!

 

 

StatsMan
SAS Super FREQ

Just noticed that in the partial log provided on Monday, TYPE=VC is on the REPEATED statement. Since there are no random effects, this just fits a GLM and explains why the model converges so quickly. 

We really need to see the full log from two runs that produce different results. We also need to see the full output from PROC MIXED for each of those two runs.
