About AA1973

AA1973 · 05-28-2020

Hi, is there a way to estimate an ICC (intra-class correlation) from a frailty model, such as the one shown in Output 89.11.5 of the proc phreg documentation (Example 89.11, Analysis of Clustered Data)? The model has a random effect for subject, and proc phreg prints a covariance parameter estimate. Link to the example: https://documentation.sas.com/?docsetId=statug&docsetTarget=statug_phreg_examples11.htm&docsetVersion=15.1&locale=en Thanks!

AA1973 · 08-29-2013

A selection algorithm would be a great feature to have in GENMOD. Automatic selection methods are controversial, but in some cases all one needs is a reasonable, good-enough model with some of the noise removed, obtained within a reasonable time and without too much programming.

In the absence of repeated measures, you could conduct the analysis in R using the step() function. This function searches for a model that minimizes either AIC or BIC, using a backward, forward, or stepwise (both backward and forward) search. It should work with glm() models of the following families: binomial, gaussian, Gamma, inverse.gaussian, poisson. The quasibinomial and quasipoisson families are the over-dispersed versions of binomial and poisson, respectively, but they do not define a full likelihood, so AIC is not available for them and step() cannot be applied to them directly.

The situation is even more complex when you have repeated measures. As far as I know, there are no readily available selection algorithms for generalized linear models with repeated measures. A couple of months ago I was working on a similar problem, and all I could find was a couple of experimental R packages.

Aside from traditional statistical methodology, there are some convoluted ways to approach the problem using data mining techniques. For instance, assuming that all subjects have a similar number and timing of repeated measures, you could run a cluster analysis on the outcome and transform it into a categorical variable of trajectories (the clusters). Time-dependent predictors can also be transformed into trajectories. The transformed outcome, a nominal variable, can then be used as the dependent variable of a non-linear model such as a classification tree; predictor selection is implicit in the tree-building algorithm.

This is likely not implementable in SAS/STAT alone: the clustering algorithms are there but, as far as I know, tree models are not part of SAS/STAT; they are included in the SAS Enterprise Miner product. The approach can be attempted in R. Regardless of the software, there is the question of how many trajectories (clusters) to select, which is not a simple problem, and of what type of tree model to use, as there are many varieties (I am not sure which are available in SAS Enterprise Miner).

For lack of simpler alternatives, I would suggest a quick-and-dirty approach, imperfect and with a risk of bias: in GENMOD, begin by fixing the working correlation structure to exchangeable, and then try a humble manual backward selection, removing one predictor at a time based on p-values and checking at what point in the sequence the information criterion (QIC for GEE in GENMOD) is minimized. Select the set of predictors that minimizes QIC. Just an idea.
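The manual backward step suggested above could be sketched roughly as follows. This is only an illustration under assumptions: the dataset your_data, the subject identifier subject_id, the binary outcome y, and the predictors x1-x3 are hypothetical names, and GEEFitCriteria is the ODS table name I believe GENMOD uses for the QIC table, so confirm it with ODS TRACE in your release.

```sas
/* Step 0: full model with an exchangeable working correlation.      */
/* All dataset and variable names here are hypothetical.             */
ods output GEEFitCriteria=qic_full;
proc genmod data=your_data descending;
  class subject_id;
  model y = x1 x2 x3 / dist=bin link=logit;
  repeated subject=subject_id / type=exch;
run;

/* Step 1: drop the predictor with the largest p-value (say x3), refit. */
ods output GEEFitCriteria=qic_step1;
proc genmod data=your_data descending;
  class subject_id;
  model y = x1 x2 / dist=bin link=logit;
  repeated subject=subject_id / type=exch;
run;

/* Stack the fit-criteria tables to see where QIC is smallest. */
data qic_all;
  length fit $12;
  set qic_full (in=a) qic_step1 (in=b);
  if a then fit = 'full';
  else if b then fit = 'minus x3';
run;

proc print data=qic_all;
run;
```

Repeating step 1 for each further removal and appending each table to qic_all gives the whole backward sequence in one place, which makes it easier to spot the minimum.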

AA1973 · 04-26-2012

see thread

AA1973 · 04-26-2012

Oh jeez, I might get some heat for this, but I think that as long as the variables or items have some direction that lets you come up with a reasonable interpretation of the solution, it's fine to run a principal component analysis, or even a factor analysis (with an extraction method other than maximum likelihood), with the usual proc factor, because the analysis is exploratory and descriptive. You are not testing anything; all you want to know is whether there are some natural groupings among the items or variables. The factor solution should give you an indication of that.

However, the situation is more complicated if you want to do a confirmatory factor analysis. In that case a specified model for the variables is tested, and unfortunately I do not think that proc calis offers the most up-to-date methodology for "easily" testing those models with variables that are not continuous (and normally distributed), unless you have a huge sample size. You might have to use Mplus for that, yikes.

AA1973 · 12-17-2009

Proc logistic calculates the sensitivity and 1-specificity values for the whole range of cutoff points. That's how it constructs the ROC curve. You can output these values to a dataset that also lets you calculate PPV and NPV for the range of cutoff points.

Assume you have a numerical variable that you are going to use for discriminating between two groups (cases, coded as 1, and non-cases, coded as 0). The following statements will plot the ROC curve and produce a dataset with the components that let you calculate specificity, sensitivity, PPV and NPV for each cutoff:

    ods graphics on;
    proc logistic data=your_data plots(only)=roc(id=obs);
      model case (event='1')=numerical_variable;
      score outroc=data_roc;
    run;
    ods graphics off;

For each cutoff point, sorted in descending order of predicted probability, the dataset data_roc contains the following variables:

    _sensit_  (sensitivity)
    _1mspec_  (1-specificity)
    _pos_     (number of cases correctly classified as cases)
    _neg_     (number of non-cases correctly classified as non-cases)
    _falpos_  (number of non-cases wrongly classified as cases)
    _falneg_  (number of cases wrongly classified as non-cases)

With these variables you can easily calculate the specificity:

    Specificity=1-_1mspec_;

Now, to calculate PPV and NPV:

    PPV=_pos_/(_pos_+_falpos_);
    NPV=_neg_/(_neg_+_falneg_);

Unfortunately, data_roc does not contain the values of numerical_variable. So you need to sort the original dataset and merge it with data_roc to have everything together. Then you can plot each measure or conduct other analyses. The sorting and merging would be something like this:

    proc sort data=your_data;
      by descending numerical_variable;
    run;

    data all;
      merge your_data data_roc;
    run;

Note that this one-to-one merge only lines up if numerical_variable has no tied values (data_roc has one row per distinct predicted probability, not one per observation) and if higher values of the variable correspond to a higher probability of being a case; otherwise the rows will not match.

Hope this helps.
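Put together, the specificity, PPV and NPV calculations can be done in a single data step on the OUTROC= dataset. A minimal sketch, assuming data_roc was created as described; the output dataset name roc_measures and the new variable names are just placeholders:

```sas
/* Derive specificity, PPV and NPV for every cutoff in the OUTROC= dataset. */
data roc_measures;                      /* hypothetical output name */
  set data_roc;
  sensitivity = _sensit_;
  specificity = 1 - _1mspec_;
  /* Guard against division by zero at the extreme cutoffs; */
  /* ppv/npv are left missing when the denominator is zero. */
  if (_pos_ + _falpos_) > 0 then ppv = _pos_ / (_pos_ + _falpos_);
  if (_neg_ + _falneg_) > 0 then npv = _neg_ / (_neg_ + _falneg_);
run;

/* _prob_ is the predicted-probability cutoff stored in the OUTROC= dataset. */
proc print data=roc_measures;
  var _prob_ sensitivity specificity ppv npv;
run;
```

Keep in mind that PPV and NPV computed this way reflect the prevalence of cases in your sample, so they only carry over to populations with a similar prevalence.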

AA1973 · 11-06-2009

Oh yes, this is another appropriate way to test for the effect of the set of three predictors. The conclusions should be equivalent. If I'm not mistaken, the test with the CONTRAST statement is based on linear model theory, so it should produce an F statistic, while the test based on the difference in -2 log likelihood (-2LL) between the two models is asymptotic (a chi-square test). The p-values for both tests should be very similar.
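For reference, the -2LL difference test can be worked out by hand in a short data step. This is a sketch with made-up numbers: the two -2LL values below are hypothetical placeholders for what you would read off the "Fit Statistics" tables of the reduced and full models. Both models should be fitted with METHOD=ML, because REML likelihoods are not comparable between models with different fixed effects.

```sas
/* Likelihood ratio test for dropping three fixed-effect parameters. */
/* The -2LL values are invented for illustration only.               */
data lrt;
  m2ll_reduced = 1520.4;   /* -2LL without the three predictors (made up) */
  m2ll_full    = 1508.1;   /* -2LL with the three predictors (made up)    */
  df           = 3;        /* number of parameters being tested           */
  chisq        = m2ll_reduced - m2ll_full;
  p_value      = 1 - probchi(chisq, df);   /* upper-tail chi-square probability */
run;

proc print data=lrt;
run;
```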

AA1973 · 11-03-2009

Perhaps you can test the linear contrast that the three parameters of the three psychological variables (assumed to be fixed effects) are equal to the vector (0,0,0). I think you can construct this test with the CONTRAST statement, using only the three psychological variables (you do not need to include all the effects that are in the MODEL statement). It probably looks like this:

    CONTRAST '3 psycho var params= vector 0' psychovar1 1, psychovar2 1, psychovar3 1;

Hope this helps.
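In context, the statement might sit in a PROC MIXED step like the sketch below. Everything except the CONTRAST statement is an assumption made for illustration: the dataset your_data, the response outcome, the extra covariate age, and the random intercept per subject_id are hypothetical, and the three psychological variables are treated as continuous (not listed in CLASS).

```sas
proc mixed data=your_data;               /* hypothetical dataset */
  class subject_id;
  model outcome = age psychovar1 psychovar2 psychovar3 / solution;
  random intercept / subject=subject_id; /* assumed random structure */
  /* Three rows separated by commas: a joint 3-df test that all */
  /* three coefficients are zero.                                */
  contrast '3 psycho var params = vector 0'
           psychovar1 1,
           psychovar2 1,
           psychovar3 1;
run;
```

The commas matter: without them the statement would define a single 1-df contrast on the sum of the three coefficients instead of the joint test.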

AA1973 · 11-03-2009

Thank you very much. It makes sense now.

AA1973 · 10-09-2009

Hello,

I have run a few models with glimmix and there is something that has confused me and still does. I hope someone can help me figure out what is going on. Right now, I am working on some data whose outcome variable is binary. There are also repeated measurements. So I run a 'relatively' simple model, say:

    proc glimmix data=subset5;
      class wave hhid;
      model injury(event='1')= arthrit male wave wave*arthrit wave*male / s d=b;
      random _residual_/ sub=hhid type=cs;
    run;

where the outcome injury and the covariates arthrit and male are coded as 1 or 0. So I obtain my Type III tests of fixed effects. The p-values for these are:

    Effect          Num DF   Den DF   F Value   Pr > F
    arthrit              1     2974     33.73   <.0001
    male                 1     2974     16.45   <.0001
    wave                 3     1911      1.83   0.1387
    arthrit*wave         3     2974      0.64   0.5902
    male*wave            3     2974      0.35   0.7880

However, when I look at the solution, the p-values do not correspond at all. For instance, the p-values for simple effects such as arthrit and male are:

    Effect      wave   Estimate   Standard Error     DF   t Value   Pr > |t|
    Intercept           -2.0517           0.3033    905     -6.76     <.0001
    arthrit              0.4077           0.2432   2974      1.68     0.0938
    male                 0.2071           0.2719   2974      0.76     0.4464

Can someone please explain to me what is going on? Thank you in advance.

Andres


ICC (intra-class correlation) for a frailty model (proc phreg)

Re: Model selection using proc genmod

Re: PCA and dichotomous variables in PROC FACTOR

Re: How to do factor analysis on dummy variables?

Re: ROC curves

Re: PROC MIXED - Testing set of predictors

Re: PROC MIXED - Testing set of predictors

Re: p-values in Proc Glimmix

p-values in Proc Glimmix
