Solved: Re: Multilevel survival analysis

Malthy · Posted 09-23-2020 04:02 AM

Hi,

I am investigating whether exposure to environmental toxins in early childhood is associated with the risk of developing ADHD. It is a population-based cohort study, and I did a survival analysis using Poisson regression in PROC GENMOD. Something like this:

proc genmod data=temp order=internal;
class toxin yeargrp SES;
model case=toxin yeargrp SES / dist=poisson link=log offset=lnpyrs type3 covb lrci;
run;

(lnpyrs is the log of person-years)

However, I have been made aware that the data have a multilevel structure (subjects are nested within regions, identified below by the variable f_region), and the analyses should take account of this.

I read that piecewise exponential survival models partition the duration of follow-up into mutually exclusive intervals and fit a model that assumes the hazard function is constant within each interval. This is equivalent to a Poisson regression model, so I think I should be able to use PROC GLIMMIX, and I ran this syntax:

Proc glimmix data=temp method=quad;
class toxin yeargrp SES f_region;
model case=toxin yeargrp SES f_region/ dist=poisson link=log offset=lnpyrs covb cl solution;
random intercept / subject=f_region;
run;

But I am really not sure if this is right. Does subject=f_region mean that I actually take the region into account?

I hope to hear from some of you 🙂

Best regards,

Malene

SteveDenham · Posted 09-23-2020 07:53 AM

Yes, the RANDOM statement does mean that you are accounting for the effect of f_region. If you want to see what is going on with that statement, add SOLUTION as an option to it. It should display the empirical Bayes estimates for each of the levels of f_region.
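A minimal sketch of where that option goes, reusing the variable names from the question. Two things here are assumptions rather than part of the original code: f_region is left out of the MODEL statement so that region enters only as a random effect, and region_blups is a made-up data set name for capturing the table.

```sas
/* Capture the "Solution for Random Effects" table (region_blups is a hypothetical name) */
ods output SolutionR=region_blups;

proc glimmix data=temp method=quad;
  class toxin yeargrp SES f_region;
  /* f_region is not listed here: it is modeled through the RANDOM statement only (assumption) */
  model case = toxin yeargrp SES / dist=poisson link=log offset=lnpyrs covb cl solution;
  /* SOLUTION on the RANDOM statement prints one predicted intercept per region */
  random intercept / subject=f_region solution;
run;
```

The predicted intercepts are on the log scale, so exponentiating one gives that region's rate relative to an average region, conditional on the covariates.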

SteveDenham

Malthy · Posted 09-23-2020 08:20 AM

Thank you very much for your reply!

I have another question also 🙂

Some of the individuals in my cohort might be siblings, so I think family should be added as another level in the multilevel model, because children from the same family are not "independent" observations. I created a variable family_id, in which members of the same family share the same value, and I tried to add it to the syntax as random intercept / subject=family_id (see below), but I get an error. How can I incorporate it as another level?

Proc glimmix data=temp method=quad;
class toxin yeargrp SES f_region;
model case=toxin yeargrp SES f_region/ dist=poisson link=log offset=lnpyrs covb cl solution;
random intercept / subject=f_region;
random intercept / subject=family_id;
run;

Best regards,

Malene

SteveDenham · Posted 09-23-2020 09:06 AM

Family_id will be nested within f_region, so try this:

Proc glimmix data=temp method=quad;
class toxin yeargrp SES f_region family_id;
model case=toxin yeargrp SES/ dist=poisson link=log offset=lnpyrs covb cl solution;
random intercept / subject=f_region;
random intercept / subject=family_id(f_region);

run;

The key here is that if you have only a few observations within each family_id (say, more than 95% of families have just one), there may be convergence problems. In that case, you may want to "roll up" the number of cases so that each family_id has a single observation.
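One way to check the family sizes before fitting, as a rough sketch. It assumes family_id is unique across regions; fam_size and n_members are names made up for the illustration.

```sas
/* One row per family, with the number of cohort members in that family */
proc freq data=temp noprint;
  tables family_id / out=fam_size (rename=(count=n_members));
run;

/* Distribution of family sizes: the row for n_members=1 shows the share of singletons */
proc freq data=fam_size;
  tables n_members;
run;
```

If nearly all families show n_members=1, there is very little information for estimating the family-level variance.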

SteveDenham

Malthy · Posted 09-23-2020 09:57 AM

Thanks again!

Yes, you are right, there will probably be too few observations within each family_id.

I am not sure what you mean by "roll up" the number of cases?

Best,

Malene

SteveDenham · Posted 09-23-2020 10:06 AM

Suppose for family_id=1 you have 3 individuals. Currently, you would have each individual in a separate record. By "roll up", I mean summing those three individuals' case counts. You would then analyze cases per family, rather than per individual. This may introduce some overdispersion, so you might want to also consider count distributions that handle overdispersion, such as the negative binomial or generalized Poisson.
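A sketch of what the roll-up could look like, under some assumptions that are not in the thread: person-years are summed along with the cases so the offset still matches, and siblings are grouped only when they share the same covariate values (so a family can still end up with more than one row if siblings differ in toxin, yeargrp or SES). The names temp_py, pyrs, family_level, cases, pyrs_sum and lnpyrs_fam are hypothetical.

```sas
/* Back-transform the offset so person-years can be summed */
data temp_py;
  set temp;
  pyrs = exp(lnpyrs);
run;

/* Sum cases and person-years within family and covariate pattern.
   Records with a missing CLASS value are left out by default. */
proc summary data=temp_py nway;
  class f_region family_id toxin yeargrp SES;
  var case pyrs;
  output out=family_level (drop=_type_ _freq_) sum=cases pyrs_sum;
run;

/* Rebuild the log offset on the summed person-years */
data family_level;
  set family_level;
  lnpyrs_fam = log(pyrs_sum);
run;

/* Family-level counts with a negative binomial to allow for overdispersion */
proc glimmix data=family_level method=quad;
  class toxin yeargrp SES f_region;
  model cases = toxin yeargrp SES / dist=negbin link=log offset=lnpyrs_fam cl solution;
  random intercept / subject=f_region;
run;
```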

SteveDenham

Malthy · Posted 09-23-2020 10:17 AM

Thanks again, yes I will consider that.

Best,

Malene

Malthy · Posted 09-25-2020 08:39 AM

Hi again,

I have one last question (I hope :-)).

Some of the individuals in the cohort have missing values in the SES (socioeconomic status) variable. Do you know how missing values are handled?

I know that STATA drops all observations that have a missing value for any one of the variables used in the model. Is it the same in SAS?

Proc glimmix data=temp method=quad;
class toxin yeargrp SES f_region family_id;
model case=toxin yeargrp SES/ dist=poisson link=log offset=lnpyrs covb cl solution;
random intercept / subject=f_region;
random intercept / subject=family_id(f_region);

run;

Best,

Malene

SteveDenham · Posted 09-25-2020 08:56 AM

Hi @Malthy

Those records will not contribute to the fit of the model, just as in STATA. You could impute them using PROC MI and then model average, but that is not for anyone but an experienced user, as feeding in the correct values and variance-covariance matrices is not simple.

Note that the mixed model procedures in SAS will not delete records with missing values for the random variables or the dependent variable.
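To see how many records are affected, a quick check along these lines may help (a sketch using the data set and variable names from the question):

```sas
/* MISSING makes the missing SES category appear as its own row in the table */
proc freq data=temp;
  tables SES / missing;
run;
```

The "Number of Observations" table in the GLIMMIX output also reports observations read versus observations used, which shows directly how many records were excluded from the fit.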

SteveDenham

Malthy · Posted 09-28-2020 03:26 AM

Hi Steve,

Thank you again for your helpful answers!

So far I have treated the variable toxin as categorical (divided into deciles), and I estimated the effect of the toxin on my outcome in each decile.

However, I would like to investigate the effect on the outcome per 1-unit increase in the toxin, treating the toxin variable as continuous, and I used the same syntax as before.

Proc glimmix data=temp method=quad;
class toxin_con yeargrp SES f_region;
model case=toxin_con yeargrp SES / dist=poisson link=log offset=lnpyrs covb cl solution;
random intercept /subject=f_region;
run;

However, SAS seems to think that the variable is categorical with 1189 levels and obviously this does not work.

So my question is: how do I tell SAS that this variable should be treated as continuous?

Best,

Malene

Malthy · Posted 09-28-2020 03:28 AM

Do you have a suggestion for this problem?

Best,

Malene

SteveDenham · Posted 09-28-2020 08:49 AM

This is an easy one, @Malthy . Remove toxin_con from the CLASS statement. If you haven't already, be sure the data set is sorted by toxin_con.
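As a sketch, the same call with toxin_con dropped from CLASS would look roughly like this. The ESTIMATE statement is an optional extra that was not in the original code; it is included on the assumption that a rate ratio per 1-unit increase is what is wanted.

```sas
proc glimmix data=temp method=quad;
  /* toxin_con is not listed in CLASS, so it enters the model as a continuous covariate */
  class yeargrp SES f_region;
  model case = toxin_con yeargrp SES / dist=poisson link=log offset=lnpyrs covb cl solution;
  random intercept / subject=f_region;
  /* EXP exponentiates the log-rate slope, giving the rate ratio per 1-unit increase with limits */
  estimate 'toxin_con, per 1 unit' toxin_con 1 / exp cl;
run;
```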

SteveDenham

Malthy · Posted 09-28-2020 09:31 AM

Ah of course 🙂 Thanks again for your help, I really appreciate it!
