Hello,
I'm trying to obtain robust standard errors (empirical variance estimation) in SAS for a Poisson regression on time-to-event data.
I used a time-split macro to model time-dependent covariates for each individual (ID), generating a dataset with multiple rows per ID, one for each time stratum of the selected time-dependent covariates. Robust errors can easily be obtained in R and Stata. I'm wondering whether they can be estimated with SAS PROC GLIMMIX or PROC GENMOD when there are multiple observations per ID. (I prefer GLIMMIX for modelling spline functions of certain covariates; otherwise I can use OUTDESIGN= from PROC GLIMMIX and then run the analysis in PROC GENMOD.)
Sandwich error estimation can be implemented by using the SAS PROC GENMOD procedure (15) with the REPEATED statement. It is commonly known that this approach can be used to analyze clustered data, such as repeated measures obtained on the same subject (16) or observations arising from cluster randomization trials (17). It is less well known that the same statement with PROC GENMOD can also be used to obtain a robust error estimator when only one observation is available from each cluster. In the present context, this approach can be used to correctly estimate the standard error for the estimated relative risk.
(https://academic.oup.com/aje/article/159/7/702/71883)
This is the only reference I can find, and it states "when only one observation is available from each cluster". One possible solution would be to generate a new unique ID for every row; however, I do not understand the consequences of doing so.
Here is example code using input data with the variables ID, blah, bla, cov1, cov2, and events.
proc glimmix data=input empirical=mbn;
  class ID blah bla;
  logtime=log(pyrs);
  effect aspl=spline(cov1 / NATURALCUBIC BASIS=TPF(NOINT)
    knotmethod=PERCENTILELIST(5 27.5 50 72.5 95));
  effect pspl=spline(cov2 / NATURALCUBIC BASIS=TPF(NOINT)
    knotmethod=PERCENTILELIST(5 27.5 50 72.5 95));
  class ID blah someproperty(ref='0');
  model events=blah aspl pspl bla / dist=poisson offset=logtime s cl;
  random _residual_ / subject=ID;
run;
Note: An R-side variance component is confounded with the profiled variance.
Check that note out: it says that an R-side variance component is confounded with the profiled variance.
So go ahead and identify a row-level ID, which hopefully is applicable across observational IDs. Call it rowblah, I guess. Then try the following (no guarantees that the whole thing won't blow up completely):
proc glimmix data=input empirical=mbn;
  class ID blah bla rowblah;
  logtime=log(pyrs);
  effect aspl=spline(cov1 / NATURALCUBIC BASIS=TPF(NOINT)
    knotmethod=PERCENTILELIST(5 27.5 50 72.5 95));
  effect pspl=spline(cov2 / NATURALCUBIC BASIS=TPF(NOINT)
    knotmethod=PERCENTILELIST(5 27.5 50 72.5 95));
  *class ID blah someproperty(ref='0');
  model events=blah aspl pspl bla rowblah aspl*rowblah pspl*rowblah / dist=poisson offset=logtime s cl;
  random rowblah / residual subject=ID;
run;
I commented out the second CLASS statement, as I think it is redundant, and someproperty does not appear in either the MODEL or RANDOM statement. This approach essentially treats each level of rowblah as a repeated measure on the subject. I don't know if it will work, as convergence issues, starting value issues, and other possible problems may be lurking. However, it strikes me as the most easily implemented method.
SteveDenham
ERROR: Integer overflow on computing amount of memory required.
ERROR: The SAS System stopped processing this step because of insufficient memory.
It blew up!
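For comparison, the PROC GENMOD route described in the quoted passage (a REPEATED statement that clusters on the subject) might look roughly like the sketch below. It is untested against the actual data. The dataset and variable names are the ones used earlier in the thread, input2 is a made-up name for a copy of the data that carries the offset, and TYPE=IND (an independence working correlation) is an assumption rather than something stated in the thread.

```sas
/* Sketch only: input2 is a hypothetical name; all other names come from the thread. */
data input2;
  set input;
  logtime = log(pyrs);  /* create the offset as a data set variable */
run;

proc genmod data=input2;
  class ID blah bla;
  /* spline definitions copied from the GLIMMIX code above */
  effect aspl = spline(cov1 / naturalcubic basis=tpf(noint)
                       knotmethod=percentilelist(5 27.5 50 72.5 95));
  effect pspl = spline(cov2 / naturalcubic basis=tpf(noint)
                       knotmethod=percentilelist(5 27.5 50 72.5 95));
  model events = blah aspl pspl bla / dist=poisson link=log offset=logtime;
  /* REPEATED requests a GEE fit; the empirical (sandwich) standard errors
     it reports are clustered on ID. TYPE=IND is an assumed choice. */
  repeated subject=ID / type=ind;
run;
```

With SUBJECT=ID, all rows from one person form a single cluster, so the sandwich estimator allows those rows to be correlated. Assigning a new unique ID to every row would instead treat each time-split row as its own independent cluster. Because this model has no rowblah fixed effects or rowblah-by-spline interactions, it is much smaller than the GLIMMIX model that ran out of memory, though whether it fits in memory on the real data is not something this sketch can confirm.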