About SteveDenham

SteveDenham · ‎06-28-2012

I jury-rigged some data and fit it with type=un. Duplicates also lead to the infinite likelihood, with the message about a nonpositive definite R matrix. If I try GLIMMIX, the warning is that it failed to obtain MIVQUE starting values. For HPMIXED, I get a specific error message that duplicate measures have been detected. So, that is not the problem. Have you tried PROC HPMIXED? Something like:

    PROC HPMIXED DATA=DATA;
      CLASS ID TIME;
      MODEL X=TIME;
      TEST TIME;
      REPEATED TIME / SUBJECT=ID TYPE=AR(1);
      LSMEANS TIME / DIFF=CONTROL('1'); *ADJUST=DUNNETT;
    RUN;

Adjustments are not available in HPMIXED (at least according to my documentation), thus the commenting out of adjust=dunnett. So if you need them, it would probably be necessary to output the differences using the ODS statement, and then post-process using PROC MULTTEST. And then only if HPMIXED actually works. I am thinking that you REALLY, REALLY need to open a ticket with tech support. Good luck. Steve Denham
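A rough sketch of that ODS-then-MULTTEST route, in case it helps. It assumes the HPMIXED differences table is named Diffs and carries its p-values in a column named Probt, as PROC MIXED does; check the ODS table names in your own log before relying on it. The dataset names lsm_diffs and pvals are made up for illustration, and Holm is used only as a stand-in adjustment, since it is not the same thing as Dunnett.

```sas
/* Capture the LS-means differences from HPMIXED.
   Assumption: the ODS table is named Diffs, as in PROC MIXED. */
ods output Diffs=lsm_diffs;
proc hpmixed data=data;
  class id time;
  model x = time;
  repeated time / subject=id type=ar(1);
  lsmeans time / diff=control('1');
run;

/* PROC MULTTEST reads raw p-values from a variable named raw_p.
   Assumption: the p-value column in the Diffs table is Probt. */
data pvals;
  set lsm_diffs;
  raw_p = Probt;
run;

/* Step-down Bonferroni (Holm) adjustment of the raw p-values,
   used here as a stand-in for the Dunnett adjustment. */
proc multtest inpvalues=pvals holm;
run;
```

The adjusted p-values appear alongside the raw ones in the MULTTEST output, in the same row order as the Diffs table.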

SteveDenham · ‎06-27-2012

Yuck. I had hoped that would work. OK--when does the infinite likelihood error occur? Is it at the initial iteration, or does it occur after several iterations? If it is the first case, it is almost certainly a problem with a duplicate record for one of the subjects at one of the timepoints. If it is the second, what is going on in the iteration history? Does the objective function have a relatively smooth history up until something happens and it jumps off the tracks? Or is the history erratic? Can't say I have an answer yet, but knowing the answers to these questions might help. Also, it might be time to open a ticket with tech support, especially if you can share the data with them. Steve Denham
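One quick way to rule the duplicate-record case in or out is to count records per subject and timepoint. This is only a sketch: the dataset name mydata is hypothetical, and id and time are assumed to be the subject and timepoint variables.

```sas
/* List every subject-by-time combination that has more than one record.
   mydata, id and time are assumed names; substitute your own. */
proc sql;
  select id, time, count(*) as n_records
    from mydata
    group by id, time
    having count(*) > 1;
quit;
```

If the query returns no rows, duplicates are not the cause and the iteration history is the next place to look.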

SteveDenham · ‎06-26-2012

There appears to be a relatively constant correlation from time(i) to time(i+1), which implies to me that an autoregressive error structure may be appropriate. Given that the diagonal entries seem relatively constant, consider type=AR(1) if your time points are equally or very nearly equally spaced, or type=sp(pow)(time1) if they are not evenly spaced. You will need to construct time1 as a continuous variable in a previous data step (time1=time), since time is specified as a categorical variable in the class statement. I still fear that there may be some data pathology that is causing the infinite likelihood. See how this works. Steve Denham
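For the unequally spaced case, the pieces fit together roughly like this. It is a sketch under assumptions: the dataset names long and long2 are hypothetical, time is assumed to be numeric in the raw data, and x, id and time are the response, subject and timepoint variables.

```sas
/* Make a continuous copy of time; the original stays in the CLASS statement. */
data long2;
  set long;
  time1 = time;
run;

/* Spatial power structure: correlation decays with the actual distance
   between time1 values, so unequal spacing is handled directly. */
proc mixed data=long2;
  class id time;
  model x = time;
  repeated time / subject=id type=sp(pow)(time1);
run;
```

With equally spaced timepoints, sp(pow) and AR(1) describe the same correlation pattern, so swapping in type=ar(1) is the simpler choice there.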

SteveDenham · ‎06-26-2012

You won't get an estimate for c under this parameterization, as V3 is a linear combination of V1 and V2. As far as adding in a subject effect, see the first example in Getting Started: NLMIXED Procedure. It is for nonlinear growth curves with Gaussian data, but the principles for adding in a subject effect are clearly outlined. You might actually end up with an estimate for c under this model, but it would represent sampling differences more than anything else--it still represents a fully collinear combination of V1 and V2 under the boundary conditions. Good luck. Steve Denham

SteveDenham · ‎06-26-2012

If you are committed to proc reg, rather than the many other linear modeling procs, you will have to create the interaction variable in a data step. You will probably have to do something like:

    data test3;
      set test2;
      x1_x4=x1*x4;
    run;

    proc reg data=test3;
      model y = x1 x2 x3 x4 x1_x4;
    run;
    quit;

Hope this gets at what you are trying to accomplish. Steve Denham

SteveDenham · ‎06-22-2012

I concur that it is a 9.2 problem. I ran it on the same machine using SAS 9.2 and duplicated your output. The estimates and standard errors are correct in value, but are mislabeled throughout. I looked at the problem notes, and it appears there were some problems with the aggregate option when using by=, but it looked like that was an 8.2 problem that had been fixed going forward. You really need to get in touch with Tech Support on this one. There may be a hotfix available. Steve Denham

SteveDenham · ‎06-22-2012

I do not get the results presented for the by strata analysis. My strata=2 analysis using by group processing looks exactly like the where strata=2 analysis, and like the results you present. Could this be platform dependent? I am running SAS 9.3 (32 bit) on a Windows XP platform. Steve Denham

SteveDenham · ‎06-22-2012

Hello Amanda, Could you give a reasonably full description of your experimental design? With that info, there are several in the community who will be able to help. However, with what you have given so far, it is hard to help--in fact, I can't even say that PROC GLM is the best tool for what you wish to accomplish. Hope this will get you to post a more complete question. Steve Denham

SteveDenham · ‎06-20-2012

If you want z scores, use your first block of code exactly as it is. The mean= and std= options give the TARGET values, not the values of your sample. Another approach is PROC STDIZE. Something like this:

    proc stdize data=X out=zscore sprefix=z_ oprefix=orig;
      var A B C;
    run;

This will give an output dataset with the original variables prefixed with orig and the z scores prefixed with z_. I hope this helps. Steve Denham

SteveDenham · ‎06-20-2012

See my reply in the Statistical Procedures forum. The first code block is what you want to use. Steve Denham

SteveDenham · ‎06-20-2012

Your first block of code will standardize all three variables to a mean of 0 and a standard deviation of 1. This would be a z score. None of the other code blocks will give z scores. They will instead give scaled scores that look very much like the raw scores, because you are standardizing to the sample mean and standard deviation. Steve Denham
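To make the distinction concrete, here is a minimal sketch of the z-score case. It assumes a dataset named X with numeric variables A, B and C, and is not a copy of the code from the original question.

```sas
/* mean= and std= are the TARGET values: the output variables
   will have mean 0 and standard deviation 1, i.e. z scores. */
proc standard data=X mean=0 std=1 out=zscore;
  var A B C;
run;

/* Check: the means should be 0 and the standard deviations 1. */
proc means data=zscore mean std;
  var A B C;
run;
```

Putting the sample mean and standard deviation into mean= and std= instead would rescale each variable back onto its own mean and spread, which is why those versions look like the raw scores.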

SteveDenham · ‎06-20-2012

PROC GLM can handle all of the outcomes in a single pass, plus it is one of the multithreaded procs, so if you have multiple CPUs available, it can really speed up some of the computations. You could try:

    proc glm data=yourdata;
      class gender edulevel;
      model sbp dbp rasbp lasbp <put as many here as you have> = gender edulevel gender*edulevel age bmi;
      <insert other stuff here to get lsmeans, etc.>
    quit;

Let us know if this needs more attention. Steve Denham

SteveDenham · ‎06-20-2012

The choice depends essentially on the purpose of your model and the properties of your sample. I can't stress that enough. Just because a model is more complex does not make it better or worse--it all depends on the context of its use. If I were doing a confirmatory analysis, and I believed that var1-var19 were all important factors to examine, and I had enough data, then I would probably fit a model that included all variables plus interactions of var20 with each of var1 through var19. I would then look at the results and see if they made sense, and see if I could eliminate some of the interactions as either nuisance or nonsensical. That would mean looking at a LOT of plots, because I really don't have the ability to think in a 20 dimensional space. Even after a model reduction, I would probably be left with something that would take substantial effort to interpret. I would be worried about collinearity in the continuous variables and highly leveraged points that probably tell me more about my data collection efforts than about the response. I would start to wonder if all of those continuous variables were truly independent of one another, and if any dependence was due to the presence of the categorical variable. At some point, I would reach in the closet and get out my old texts on Mathematical Biology, and see if I might be better off trying to write some sort of structural system of equations, rather than piling everyone into the van and seeing who ended up in good seats. So once again I say: The choice depends essentially on the purpose of your model and the properties of your sample. Good luck. Steve Denham

SteveDenham · ‎06-19-2012

I want to expand on what PG said: The choice depends essentially on the purpose of your model and the properties of your sample. This is the motto that should be posted over the top of every data analyst's computer screen. And it is what makes answering the question so difficult. What is the question that you are trying to address? Is this to be an exercise in data exploration, or are there well defined questions to be addressed? Are you looking for a parsimonious model that is suited for prediction, or are you interested in the interplay between predictors and the response in an attempt to find support for hypothetical relationships? If you add these questions on to what PG said, then you will have a very good beginning place to guide your decision. Steve Denham

SteveDenham · ‎06-19-2012

Advantage of choice1: You use all the data. Disadvantage of choice1: You use all the data. (Yes, that's what I said.) Suppose your data were such that 10% of the observations were from the low group, 20% from the medium group, and 70% from the high group. If you fit a simplistic model with only var20 (as you name it), you will fit all the other parameters based predominantly on the values seen in the high group. You might follow your first instinct to fit separate models for each group, but then you run into the problem of comparing across models. You really have no way of testing directly whether the parameter for var1 is the same in each of the groups. You could examine the confidence limits, but it would not be as satisfying as creating confidence limits on the difference under choice1. You might wish to consider a more complex model that includes interactions between the categorical variable and the continuous variables, thus giving parameter estimates that can be compared directly. However, you need to watch for having enough data to adequately fit the additional parameters, as you would be going from estimating 21 (20 vars plus an intercept) to as many as 58 (19 vars at three levels each plus an intercept, depending on the parameterization you use). To get good estimates, you really need three times as much data. This is an opinion, and everyone has them, so take what you like and leave the rest. Steve Denham
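A cut-down sketch of that interaction model, using only three of the continuous variables to keep it readable. The dataset name mydata and the response name y are hypothetical, and var20 is assumed to be the three-level categorical variable.

```sas
/* Separate slopes for var1-var3 within each level of var20.
   The var20*var1 test asks whether the var1 slope is the same
   in the low, medium and high groups. */
proc glm data=mydata;
  class var20;
  model y = var20 var1 var2 var3
            var20*var1 var20*var2 var20*var3 / solution;
run;
quit;
```

Each interaction term adds two parameters per continuous variable with a three-level var20, which is where the jump in parameter count comes from when all 19 variables are included.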


Re: Assessing Variable Redundancy for Mixed Effects Modeling

Re: Assessing Variable Redundancy for Mixed Effects Modeling

Re: Randomized block design and meaning of LSMEAN/STDERR

Re: Help with Restricted Cubic Splines : Code Optimization and Graphic...

Re: PROC POWER for Cox regression

Re: Repeated measures model executes in MIXED but not in GLIMMIX

Re: Model heteroscedasticity directly or use log transformation

Re: Model heteroscedasticity directly or use log transformation

Re: Repeated measures model executes in MIXED but not in GLIMMIX

Re: Repeated measures model executes in MIXED but not in GLIMMIX

Re: Passing TESTVALUE option in LSMESTIMATE statement in glimmix

Re: What is the "estimate" in the SolutionR output of the proc mixed. ...

Re: Assessing Variable Redundancy for Mixed Effects Modeling

Re: SAS OnDemand Outpm option in Proc Mixed

Re: question about lsmean pdiff=;option in proc glm step, st102d03, SA...

Re: Proc mixed, defining data structure for desired comparison (Random...

Re: Assessing Variable Redundancy for Mixed Effects Modeling

Re: Assessing Variable Redundancy for Mixed Effects Modeling

Re: Help with Restricted Cubic Splines : Code Optimization and Graphic...

Re: Repeated measures model executes in MIXED but not in GLIMMIX

Re: Analysis of repeated measures data with proc mixed

Re: Analysis of repeated measures data with proc mixed

Re: Analysis of repeated measures data with proc mixed

Re: Regression analysis

Re: interaction term in regression

Re: PROG GENMOD and BY: output shows combinations, which were not in...

Re: PROG GENMOD and BY: output shows combinations, which were not in...

Re: Procedure GLM

Re: que regarding proc standard

Re: que regarding proc standard

Re: que regarding proc standard

Re: SAS macro for multiple linear regression

Re: one model vs multiple models?

Re: one model vs multiple models?

Re: one model vs multiple models?
