gauglert
Calcite | Level 5

Hi all:

 

I am doing joint modeling of 3 sets of clustered binary responses and 2 continuous measures, and using PROC GLIMMIX to do so, all in SAS Studio 3.6.  I am getting this analysis ready for publication in an academic journal, and as part of the submission process, we will be posting the data and analysis code.  I ran the analysis in the Fall, and haven't done anything with the data since early December.  When I tried to rerun everything to verify that the posted data/code would replicate what we're reporting in the paper, I could not recreate what we had previously obtained.  I have verified that the data and code have not changed, so I am left to believe that something might have changed in GLIMMIX since December 2017.  Is this possible?  

 

The relevant code is 

proc glimmix data=mylib.e1_long_v3 maxlmmupdate=100 pconv=1e-2 plots=studentpanel;
  class location condition ID task dist;
  model result = location condition task
                 task*BF_HITS_OldItems
                 task*DRM_HITS_OldItems
                 task*SGI_HITS_OldItems
        / ddfm=kr dist=byobs(dist) solution covb(details);
  random task / subject=ID type=un v vcorr;
  covtest diagg / classical cl;
run;

 

Thanks!

ballardw
Super User

Were any updates applied to your system? An upgrade to SAS 9.4M5 was available in Oct 2017 and may have been applied later at your site.

 

And how much difference are you seeing? In general with SAS upgrades, which is pretty much the only way I see getting different results unless you changed hardware and/or operating system, I would not expect to see very much difference assuming the code and data are exactly the same.

 

How did you verify that the data and/or code are the same? File creation dates? File update dates? Comparison with a backup?

 

 

gauglert
Calcite | Level 5

Well, there was one change: I moved from VMWare to Virtual Box for launching SAS Studio.  Could SAS perform differently on these two platforms?

 

The main piece we're looking at is the estimation of the covariance parameters; since we have 5 tasks, we have a 5x5 covariance matrix.  We are seeing fairly major changes to two of those entries.  In the paper, we reported that these two were statistically significant, and not marginally so: zero was quite far from the confidence bounds.  When I try to replicate, the CIs for these two entries capture zero, and again not marginally: zero is now very much in the middle of the CIs.

 

I verified everything was the same via backups.

gauglert
Calcite | Level 5

Hi all:

 

I am doing joint modeling of 3 sets of clustered binary responses and 2 continuous measures, and using PROC GLIMMIX to do so, all in SAS Studio 3.6.  I am getting this analysis ready for publication in an academic journal, and as part of the submission process, we will be posting the data and analysis code.  I ran the analysis in the Fall, and haven't done anything with the data since early December.    When this work was initially done, I was running SAS Studio in a browser (Safari) from my Mac after launching VMWare.  I have since had problems with my Mac that needed repairs, and upon its return, the VMWare was not able to launch SAS Studio, so I switched to Virtual Box, which launched SAS Studio seamlessly.

 

 

When I tried to rerun everything using the Virtual Box to verify that the posted data/code would replicate what we're reporting in the paper, I could not recreate what we had previously obtained.  I have verified that the data and code have not changed, so I am left to believe that something might work differently in GLIMMIX using VMWare as opposed to Virtual Box.  Is this possible?  

 

The relevant code is 

proc glimmix data=mylib.e1_long_v3 maxlmmupdate=100 pconv=1e-2 plots=studentpanel;
  class location condition ID task dist;
  model result = location condition task
                 task*BF_HITS_OldItems
                 task*DRM_HITS_OldItems
                 task*SGI_HITS_OldItems
        / ddfm=kr dist=byobs(dist) solution covb(details);
  random task / subject=ID type=un v vcorr;
  covtest diagg / classical cl;
run;

 

Thanks!

ChrisNZ
Tourmaline | Level 20

>something might work differently in GLIMMIX using VMWare as opposed to Virtual Box.  Is this possible?  

 

No. The calculations will be the very same in different VM environments, and even when moving your code from PC to mainframe.

 

That's one of SAS's strengths. Your data or data preparation must be different. Maybe you have/had numeric variables with a length not equal to 8?
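
The variable-length question can be checked directly. As a rough sketch (the output data set name work.e1_vars is made up for illustration), this lists any numeric variables in the analysis data set that are stored with a length other than 8 bytes:

```sas
/* Write the variable attributes to a data set instead of printing them. */
proc contents data=mylib.e1_long_v3
              out=work.e1_vars(keep=name type length) noprint;
run;

/* In the OUT= data set, TYPE=1 marks numeric variables. */
proc print data=work.e1_vars noobs;
  where type = 1 and length ne 8;
  title "Numeric variables stored with length other than 8";
run;
title;
```

If no rows are selected, every numeric variable is stored at full length, and truncated storage is not the explanation.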

 

Unless some issue was found in the procedure and fixed? @Rick_SAS ?

 

 

 

 

 

Rick_SAS
SAS Super FREQ

According to the What's New in SAS/STAT 14.3 document, there were no major changes or enhancements to GLIMMIX for the 14.3 release. There might have been minor bug fixes, but I am not aware of any changes that would affect convergence or modeling such as the OP describes.

 

One idea: If the OP has the parameter estimates from the previous release, use those estimates as initial values on the PARMS statement. If the data are the same, the procedure should converge in a few iterations. You can also use options on the PARMS statement (such as NOITER) to see if the likelihood and associated measures (AIC, SBC, etc.) are the same as previously reported.
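
A minimal sketch of the PARMS idea, assuming the earlier covariance parameter estimates have been put into a data set. Here mylib.cp_reported is a hypothetical name: it would need one row per covariance parameter, in the same order as the "Covariance Parameter Estimates" table, with the values in a variable named Estimate.

```sas
/* mylib.cp_reported is a hypothetical data set holding the previously
   reported covariance parameter estimates in a variable named Estimate. */
proc glimmix data=mylib.e1_long_v3 maxlmmupdate=100 pconv=1e-2;
  class location condition ID task dist;
  model result = location condition task
                 task*BF_HITS_OldItems
                 task*DRM_HITS_OldItems
                 task*SGI_HITS_OldItems
        / ddfm=kr dist=byobs(dist) solution;
  random task / subject=ID type=un;
  /* NOITER: do not optimize the covariance parameters; inference uses
     the supplied values. Drop NOITER to use them as starting values. */
  parms / pdata=mylib.cp_reported noiter;
run;
```

With NOITER, the fit statistics can be compared against the earlier output. Without it, the same values serve only as starting values and the optimizer is free to move away from them.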

 

I notice that the OP's code uses PCONV=1e-2 (extremely loose convergence criterion) and MAXLMMUPDATE=100 (five times the usual number of iterations), which makes me wonder if the OP had difficulty getting the model to converge on the previous release as well.

 

Questions for the OP:

1. How many observations in the data?

2. How many parameters in this model? There are many class parameters, so there might be dozens or hundreds of parameters.

3. How many subject IDs? I see you are using an unstructured covariance, which requires lots of parameters.

4. What distributions are you specifying in the data? You are using the DIST=BYOBS option. What distributions are you using, and why?

gauglert
Calcite | Level 5

Thanks again for some helpful thoughts.  I did use PARMS to use the previously obtained parameter estimates as starting values.  When I do this without NOITER, I cannot replicate my initial results.  I did use NOITER to verify that I do recover the same likelihood and related quantities.  Given that these are the same, we can all agree that the data have not changed.  So I cannot understand why I was able to obtain results using VMWare that I cannot obtain running it with Virtual Box...

 

To answer your other questions:

1) 2880

2) the covariance matrix is 5x5, and the fixed effects have 22 parameters

3) 144

4) binary and Gaussian, because some are 0/1 and some are continuous measures.

Rick_SAS
SAS Super FREQ

I'm grasping at straws, but are you sure that the previous version actually converged? Is it possible that the algorithm iterated many times and then gave you the values from the last iteration? Do you have the SAS log from before?

 

When you start from the "final" parameter values, the procedure will iterate only if it thinks it can further maximize the log-likelihood (minimize -2LL). If you display the iteration history, do you see the AIC/BIC/SBC/etc values changing in the appropriate directions?  If so, then the previous "final" parameters were not actually optimal.
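
One way to look at this, as a sketch only: save the iteration history and convergence status from a run that starts at the earlier estimates. The work.* data set names and mylib.cp_reported (the previously reported covariance estimates, in a variable named Estimate) are hypothetical.

```sas
/* Capture what the optimizer did; output data set names are made up. */
ods output IterHistory=work.iter_hist
           ConvergenceStatus=work.conv_status;

/* ITDETAILS adds parameter estimates and gradients to the history. */
proc glimmix data=mylib.e1_long_v3 maxlmmupdate=100 pconv=1e-2 itdetails;
  class location condition ID task dist;
  model result = location condition task
                 task*BF_HITS_OldItems
                 task*DRM_HITS_OldItems
                 task*SGI_HITS_OldItems
        / ddfm=kr dist=byobs(dist) solution;
  random task / subject=ID type=un;
  /* Start from the earlier values, but let the procedure iterate. */
  parms / pdata=mylib.cp_reported;
run;

proc print data=work.conv_status; run;
proc print data=work.iter_hist;   run;
```

If the objective keeps improving from the supplied starting values, that would suggest the earlier "final" estimates were not at the optimum.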

ballardw
Super User

I wouldn't expect major differences between the VM products, but we do occasionally see University Edition fail to start after changes to them, so that might be related.

 

When I verify a similar issue, I move (note: MOVE, not delete) my existing SAS data sets to another folder and rerun all of the code, from data import (or extract or subset, whichever is appropriate) through analysis, in the original order. I can then create a library pointing to the old data and use PROC COMPARE to compare the old and recreated data sets. However, this only works if all changes to the data are made in code; manual edits cannot be reproduced this way (and may be a place to concentrate comparisons).
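
A sketch of that comparison, assuming the old copies were moved to a separate folder. The path and the libref oldlib are hypothetical and would need to match wherever the backup actually lives.

```sas
/* Hypothetical folder holding the data sets that were moved aside. */
libname oldlib "/folders/myfolders/old_data" access=readonly;

/* Without an ID statement, observations are matched by position, so a
   change in row order shows up as differences too. */
proc compare base=oldlib.e1_long_v3
             compare=mylib.e1_long_v3
             method=exact listall;
run;
```

METHOD=EXACT reports any difference in values, however small, which is the strictest check for "the data have not changed".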

 

Note that some sorts of analysis with many variables in a model statement may be data order sensitive due to the internal algorithms used by the procedure.

 

You might check your Glimmix code by changing the order of the variables on the model statement and see if the results change much. If that does happen then the model may need some refining or reconsideration.

Rick_SAS
SAS Super FREQ

SAS Studio is just an interface. What is important is the underlying SAS server. You can see if you have the newest version of SAS by submitting

%put SYSVLONG = &SYSVLONG;

The latest version is

SYSVLONG = 9.04.01M5......
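
Since the GLIMMIX question is really about the SAS/STAT release, it may also help to list the installed product versions. A small sketch using only standard statements:

```sas
/* Base SAS version, with a four-digit year in the date portion. */
%put SYSVLONG4 = &SYSVLONG4;

/* Writes the installed products and their versions to the log,
   including the SAS/STAT release. */
proc product_status;
run;
```

Comparing the SAS/STAT line in the log against the version noted in any saved output from the Fall would show whether the procedure itself was updated.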

 

Are you using SAS Studio as an interface to a server at your university, or are you using the free SAS University Edition? I believe that University Edition sent out an update in December, but you would have had to click the "Update" button.

 

gauglert
Calcite | Level 5

Thanks for the comments.  I do have the latest update of the free University Edition.  
