About lvm

lvm · ‎08-24-2016

Just to add to Steve's comments: you have two random statements, but only the second one (for the residual) is written with explicit subject= syntax. Thus, for the overall model fit, observations are not processed by subject, because there is no unique (single) subject designation (there are, implicitly, different levels of subjects). Under these circumstances, the output table reports the number of subjects you mentioned. But your results should be correct.

lvm · ‎08-23-2016

In the world of mixed models, one has to drop old pre-mixed-model concepts, such as integer degrees of freedom. You are using the Satterthwaite df calculation method, which is an estimate based on the estimated variance-covariance matrices G and R (G for random effects, R for repeated measures). Actually, I recommend ddfm=KR with any repeated measure. The idea is this: to test a null hypothesis, the F (or t) statistic should have an F (or t) distribution when the null hypothesis is true. With mixed models (with random effects, correlations, ...), the F (or t) statistic is only approximately distributed as F (or t) under H0; the approximation is best when the denominator df are calculated based on the model and the estimated G and R. Another way to look at this: the traditional df calculation methods do not account for the fact that the variances and covariances are estimated. The KR method does take the uncertainty of the variance-covariance estimates into account in the df calculation. Moreover, the KR method also adjusts the SEs of the fixed-effect estimates based on the uncertainty of the variance-covariance estimates.
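As a minimal sketch of the recommendation above, a repeated-measures fit with Kenward-Roger df might look like the following. The data set and variable names (trial, id, trt, time, y) and the AR(1) structure are assumptions for illustration only, not taken from the original question.

```sas
/* Hypothetical data set and variable names: trial, id, trt, time, y */
proc mixed data=trial;
    class id trt time;
    /* ddfm=kr requests Kenward-Roger denominator df and adjusted SEs,
       so fractional (non-integer) df in the tests are expected */
    model y = trt time trt*time / ddfm=kr;
    /* R-side covariance among the repeated measures of each subject;
       AR(1) is just one possible choice of structure */
    repeated time / subject=id type=ar(1);
run;
```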

lvm · ‎08-16-2016

Your factor analytic structure, FA0(3), is still giving you an unstructured covariance matrix. It is just constraining it to be at least positive semi-definite. A valid covariance matrix must be positive semi-definite or positive definite. It is always very good practice to use FA0(#) or CHOL when one wants an unstructured G. By the way, add the G and GCORR options to the RANDOM statement to get a direct display of G (in addition to the FA parameters that determine G).
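A sketch of what this could look like with three random effects per subject is below. The data set and variable names (growth, id, x1, x2, y) are hypothetical; only the TYPE=FA0(3), G, and GCORR pieces come from the advice above.

```sas
/* Hypothetical names: growth, id, x1, x2, y */
proc mixed data=growth;
    class id;
    model y = x1 x2 / solution;
    /* Three random effects per subject (intercept and two slopes).
       type=fa0(3) gives an unstructured 3x3 G that is constrained to be
       at least positive semi-definite; g and gcorr print the estimated
       G matrix and the corresponding correlation matrix */
    random intercept x1 x2 / subject=id type=fa0(3) g gcorr;
run;
```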

lvm · ‎08-15-2016

As indicated by Steve, you usually only have to be concerned when you get a message about the Hessian being non-positive definite.

lvm · ‎08-10-2016

I doubt that this is something for a future release, since it is not part of model fitting or of post-model-fitting testing or estimation. This will have to be a judgement call by the researcher.

lvm · ‎08-10-2016

Just to add to Rick's answer, GLIMMIX is for any (conditional) distribution in the so-called exponential family: Poisson, binomial, negative binomial, etc. Because the normal distribution is a member of this family, GLIMMIX can also be used for the normal distribution. Thus, many now use GLIMMIX instead of MIXED for normal response data. SAS continues to add new features to GLIMMIX, but new developments with MIXED are pretty much halted. (But there are some features in MIXED for normal data that have not been implemented in GLIMMIX.)
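To illustrate the overlap for normal data, here is a sketch of the same simple random-block model written for both procedures. The data set and variable names (field, block, trt, y) are hypothetical, and I would expect the two fits to agree closely rather than promise identical output tables.

```sas
/* Hypothetical names: field, block, trt, y */

/* Normal response with a random block effect in MIXED */
proc mixed data=field;
    class block trt;
    model y = trt;
    random block;
run;

/* The same model in GLIMMIX, with the normal distribution and
   identity link written out explicitly */
proc glimmix data=field;
    class block trt;
    model y = trt / dist=normal link=identity;
    random block;
run;
```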

lvm · ‎08-10-2016

The statement

    estimate 's2: 1' int 1 | int 1 / subject 0 1 cl ilink e;

is of direct use only if there are no fixed effects in the model (only the intercept). The statement can be used with factors or covariates, but the meaning is different in each case. In fact, this statement is likely not giving you what you want with fixed effects. By the way, it is useful to add an E option to these statements to see what GLIMMIX is doing, because the procedure fills in the blanks when the fixed effects are not explicitly given. For instance, with a continuous variable, the procedure uses 0 for X; this is meaningless for you, since the smallest value of your covariable is 36. I know you want to get a correction for fixed effects, but if there are fixed effects, then your EBLUPs really should be for specific values of the fixed effects. See my comments below on different variations of your possible model.

This estimate statement makes sense here without fixed effects. You get a logit of 0.564 and an SE of 0.532.

    PROC glimmix data=school NOCLPRINT MAXLMMUPDATE=100;
        title 'no fixed effects';
        class school school_type;
        model final_pass = / s cl dist=bin link=logit;
        random intercept / subject=school solution cl;
        estimate 's2: 1' int 1 | int 1 / subject 0 1 cl ilink e;
    run;

For all the runs from now on, I give the simple version of the BLUP statement (allowing for defaults in GLIMMIX), listed as "1", and then I give the exact same statement where the missing terms are explicitly given (listed as "1b"). That is, the first two estimate statements give the exact same answer for each run. As you can see, the first estimate statement really is doing the calculation for entry_score=0 (an impossible value for this data set). Your average value of the covariable is about 70, so I added an estimate statement with this value. You can see that one is now getting a similar EBLUP as found with no fixed effects. The SE is different.

    PROC glimmix data=school NOCLPRINT MAXLMMUPDATE=100;
        title 'only continuous fixed';
        class school school_type;
        model final_pass = entry_score / s cl dist=bin link=logit;
        random intercept / subject=school solution cl;
        estimate 's2: 1' int 1 | int 1 / subject 0 1 cl ilink e;
        estimate 's2: 1b' int 1 entry_score 0 | int 1 / subject 0 1 cl ilink e;
        estimate 's2: X=70' int 1 entry_score 70 | int 1 / subject 0 1 cl ilink e;
    run;

Things are trickier with a factor (class variable). Your data set is not balanced: there are about twice as many public as private schools. Note that here your first estimate statement really is giving the EBLUP for the LSMEAN of the two types of schools (1b); these are not the marginal means because of the imbalance. You can kind of recover an EBLUP similar to the no-fixed-effect run by using the 0.16 and 0.84 coefficients for school type (1c estimate). These are not in the same ratio as the actual data because this model fitting is not for a linear model; there is skewness that affects results. I actually think you should be looking at EBLUPs for each school type, not trying to get a global school type. I give those estimate statements.

    PROC glimmix data=school NOCLPRINT MAXLMMUPDATE=100;
        title 'only categorical fixed';
        class school school_type;
        model final_pass = school_type / s cl dist=bin link=logit;
        random intercept / subject=school solution cl;
        estimate 's2: 1' int 1 | int 1 / subject 0 1 cl ilink e;
        *identical to next one;
        estimate 's2: 1b' int 1 school_type 0.5 0.5 | int 1 / subject 0 1 cl ilink e;
        estimate 's2: 1c' int 1 school_type 0.16 0.84 | int 1 / subject 0 1 cl ilink e;
        *^because of imbalance in numbers in two school types, coefficients are not equal above;
        estimate 's2: sch type 1' int 1 school_type 1 0 | int 1 / subject 0 1 cl ilink e;
        estimate 's2: sch type 2' int 1 school_type 0 1 | int 1 / subject 0 1 cl ilink e;
    run;

Things get even trickier with both a continuous variable and a factor in the model. As mentioned above, there is imbalance in observations in the two school types. Plus, the distribution of the continuous variable is not the same for the two school types. Thus, it is very hard to define a global type of central fixed effect. As you can see below, if you use the first estimate statement, you are getting the EBLUP for X=0 and the LSMEAN of the two school types, not the marginal means (compare the 1 and 1b results). This is not a useful result. By trial and error, one can get a kind of central-fixed-effect EBLUP with the third estimate statement (1d). This is a bit strange because it is for X=36 (about the minimum observed X), and for the LSMEANS of the categorical variable. It is likely due to the combination of imbalance and different distributions for X. This is giving you about the same as the first run with no fixed effects. I would prefer that you get EBLUPs for each school type at the mean of X (about 70). See the last two estimate statements.

    PROC glimmix data=school NOCLPRINT MAXLMMUPDATE=100;
        title 'categorical and continuous fixed';
        class school school_type;
        model final_pass = entry_score school_type / s cl dist=bin link=logit;
        random intercept / subject=school solution cl;
        estimate 's2: 1' int 1 | int 1 / subject 0 1 cl ilink e;
        estimate 's2: 1b' int 1 school_type 0.5 0.5 entry_score 0 | int 1 / subject 0 1 cl ilink e;
        *^X=0 is meaningless, but that is what you get with first estimate statement;
        estimate 's2: X=36 (1d)' int 1 school_type 0.5 0.5 entry_score 36 | int 1 / subject 0 1 cl ilink e;
        estimate 's2: sch typ 1 X=70' int 1 school_type 1 0 entry_score 70 | int 1 / subject 0 1 cl ilink e;
        estimate 's2: sch typ 2 X=70' int 1 school_type 0 1 entry_score 70 | int 1 / subject 0 1 cl ilink e;
    run;

Overall, as I stated in other posts, I don't think there is a good way of doing what you originally wanted. That is, I feel you need to be explicit about the fixed effects if they are in the model.

lvm · ‎08-10-2016

Estimate statements are tricky, and these two do not mean the same thing. I will try to send you a more detailed response soon.

lvm · ‎08-09-2016

I was looking at your table for school A. If you used the code

    estimate 's2' | int 1 / subject 0 1 cl ilink;

for the situation with fixed effects (you give a categorical and a continuous predictor in your code), then you are getting the EBLUP for the last level of the categorical variable (always a coefficient of 0) and for the continuous predictor with a value of 0 (which may be impossible). You are not getting any kind of 'global mean'. If you want the EBLUP for the second level of the categorical predictor (C) and when the continuous predictor (say X) has a value of 3.5, then you use:

    estimate 'example' int 1 X 3.5 C 0 1 | int 1 / subject 1 cl ilink;

In general, you will get different EBLUPs with and without fixed-effect predictors, and the SEs will be smaller with the fixed effects.

lvm · ‎08-09-2016

I am not sure what else you are looking for. One needs to see your full model code to give any more specific advice. You have the general syntax to use. Based on what you showed, we can't tell if you are doing things appropriately. By the way, chapters 6 and 8 in SAS for Mixed Models, 2nd edition (Littell et al. 2006) give plenty of examples and explanations of this.

lvm · ‎08-09-2016

You are right that NLMIXED is the procedure to use for nonlinear models with clustering (subjects). But NLMIXED is essentially a programming language, and there is no simple way to give you advice at this point. The forum participants usually want to provide help with existing code (corrections, changes, etc.), and typically will not write out code from scratch. Plus, you are not sure about a model to use. I think you will have to first do some homework and learn a bit about NLMIXED, and then figure out some models that you think are appropriate. People on this forum can then give advice on ways to fix problems, and so on. The SAS User's Guide describes NLMIXED, but the book "SAS for Mixed Models, 2nd edition" is even better for learning about this procedure. In terms of models, I am sure that many have been proposed in the literature for this type of economic data.

lvm · ‎08-08-2016

You seem to have several observations where your response is larger than 1 or smaller than 0. That is why these are "not proportions". Probably you made an error in calculating the proportion. And more seriously: the beta conditional distribution is defined strictly between 0 and 1 (0 < prop < 1). That means that all 0s and 1s are converted to missing values. You cannot use the beta distribution if you have 0s and 1s unless you want to throw away data. Many references fail to make this clear. To get around this, you only have ad hoc solutions. For instance: convert all 0s to a very small number (smaller than the smallest nonzero real value that you could observe), and apply the same idea to the 1s. Of course, this is creating artificial data. It would be OK for occasional 0s and 1s, but it appears that you have many.
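A minimal sketch of that ad hoc adjustment is below. The data set name (props), the variable name (prop), and the value of eps are all assumptions; eps in particular has to be chosen by you so that it is smaller than any nonzero proportion you could actually observe. Setting the out-of-range values to missing is also my assumption, made so that they get noticed and fixed at the source.

```sas
/* Hypothetical names: props (data set), prop (response) */
data props_adj;
    set props;
    eps = 0.0001;   /* assumed value; choose one suited to your data */

    /* Flag values outside [0, 1]; these point to a calculation error.
       The missing() check is needed because SAS treats a missing value
       as smaller than any number */
    out_of_range = (not missing(prop)) and (prop < 0 or prop > 1);

    if missing(prop) or out_of_range then prop_adj = .;
    else if prop = 0 then prop_adj = eps;       /* nudge exact 0s up   */
    else if prop = 1 then prop_adj = 1 - eps;   /* nudge exact 1s down */
    else prop_adj = prop;
run;

/* Count how many observations were flagged as out of range */
proc freq data=props_adj;
    tables out_of_range;
run;
```

The adjusted variable prop_adj would then replace prop as the response in the beta model, keeping in mind that the nudged values are artificial data.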

lvm · ‎08-04-2016

Good question. Although the results are different for some parameter estimates, in fact, both answers are "correct". A different approach is being taken. Although you did not say it, it appears that you are taking the exponents of the confidence limits (CLs) for estimated B1, B2, ..., B7. That is, get the CLs for B1, etc., and then use the exponential function. This is often a good practice, and is actually done in GLIMMIX with the EXP option on a LSMEAN statement. This is not what NLMIXED is doing. The latter procedure is using large-sample theory to estimate the SE of exp(B_), and then simply calculating the CLs as +/- SE*t using the SE estimate. It is not calculating the exponent of the CLs of B1, etc.

The two methods are both reasonable, although the NLMIXED approach is the one most readily justified by large-sample theory; its small-sample properties are not well established. The two approaches could give very similar results or fairly different results. This depends on the actual point estimates, the SEs of the estimates, and the sampling distribution of the estimated B's (the latter not really known for small samples). In general, if SE(B) is small, then the two approaches will be more similar.

Here is an example. Note that the variance of a function of a random variable (such as an estimated parameter) is approximately var(f(x)) = [f'(x)]^2 * var(x), where here x is the parameter estimate, f(x) = exp(x) in your case, and f'(x) is the first derivative. SE(f(x)) is then just the square root of this: SE(f(x)) = |f'(x)|*SE(x). With the special case that f(x) = exp(x), one notes that f'(x) = f(x) = exp(x). Thus, SE(exp(x)) = exp(x)*SE(x). This is known as the delta method, and it is an approximation derived from a Taylor series expansion.

Consider B2. SE(exp(B2)) = exp(B2)*SE(B2) = 0.8777 * 0.05123 = 0.04496. This is displayed in the Additional Estimates table, and used to get the CLs (0.8777 +/- t*0.04496). This is close to the limits you get by taking exp(.) of the limits in the Parameter table.

Now consider B3. SE(exp(B3)) = exp(B3)*SE(B3) = 3.4834 * 0.4281 = 1.4912. This is displayed in your Additional Estimates table. In this case, 3.4834 +/- t*1.49 (displayed as the Lower and Upper limits) is not close to your other method. As I said, both approaches can be justified.
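The arithmetic for B2 and B3 can be checked with a short data step such as the sketch below. It takes exp(B) and SE(B) as quoted in this post and computes both sets of limits. Because the df behind the t value are not given here, the sketch substitutes the normal 97.5% quantile; that is an assumption, so the limits will not match the procedure output exactly. All variable names are made up for the illustration.

```sas
data delta;
    input param $ exp_b se_b;
    b = log(exp_b);               /* estimate on the original scale */

    /* Delta method: SE(exp(B)) = exp(B) * SE(B) */
    se_exp_b = exp_b * se_b;

    /* Assumed multiplier: normal quantile in place of the unknown t value */
    z = quantile('NORMAL', 0.975);

    /* Approach 1: exponentiate the limits computed for B */
    lower_exp = exp(b - z*se_b);
    upper_exp = exp(b + z*se_b);

    /* Approach 2 (NLMIXED ESTIMATE): symmetric limits around exp(B) */
    lower_delta = exp_b - z*se_exp_b;
    upper_delta = exp_b + z*se_exp_b;
    datalines;
B2 0.8777 0.05123
B3 3.4834 0.4281
;
run;

proc print data=delta noobs;
    var param exp_b se_exp_b lower_exp upper_exp lower_delta upper_delta;
run;
```

The se_exp_b column should reproduce the 0.04496 and 1.4912 values above, and comparing the two pairs of limits shows the agreement for B2 and the disagreement for B3.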

lvm · ‎08-03-2016

You also have more flexibility in working with covariables (continuous predictors) with the estimate statement. LSMESTIMATE is mostly for working with factors (class variables).

lvm · ‎08-03-2016

I respectfully disagree with Jacob. Getting empirical BLUPs is straightforward with GLMMs. All covered in Stroup's book "Generalized Linear Mixed Models" (2013). There can be a small bias in parameter estimates, but not enough to prevent the calculation. Syntax will depend on the exact form of your model and random statements. If you have one covariable (X), then the first "int 1" portion of your estimate statement would be changed to: "int 1 X 5.5" if you wanted to get the BLUP for X=5.5.
