plf515
Lapis Lazuli | Level 10

I have a data set that has people nested in cities and the dependent variable is ordinal.

The person variable is NQ and the city variable is CITY.  There are about 4,000 people and 30 cities.

There are both individual and city level effects.

I tried

proc glimmix data = DSNAME;

class nq city;

model  DV = indlevelvar1 indlevelvar2 ....   citylevelvar1 citylevelvar2 ....

    /dist = mult link = clogit;

random intercept /subject = nq group =  city;

run;

and got "model is too large to run in a reasonable time"

but I am not completely clear on whether I should use  / subject = city or subject = nq   or something else (both ran).

Peter

SteveDenham
Jade | Level 19

Peter,

I'll try to work from the bottom up here.  What you have now would give a separate variance component due to nq for each city.  Is that a reasonable approach?  I think it would be if you have multiple observations on each individual.  Here I am not so sure.  I would be inclined to view nq as the "error", and city as an additional variance component.  I would try:

proc glimmix data=DSNAME;

class nq city;

model DV(ref='<put something in here that makes sense>') = <fixed effects vector> / dist=mult link=clogit;

random intercept/subject=city;

run;

If you have lots of data relative to the number of variables being fit, I would consider using METHOD=LAPLACE as well, thus giving a conditional response, and putting you in the position of being able to compare models on their IC values.  But with only about 130 people per city (4,000 / 30), and assuming the DV has 4 levels, you have about 33 records per city to estimate each level.  You could run into stability and quasi-separation problems.
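As a rough sketch of that suggestion (not tested against the actual data): the first step tabulates how many records fall in each city by DV level, which is one simple way to spot the sparse cells that can lead to quasi-separation, and the second step fits the random-city-intercept model with METHOD=LAPLACE. The variable names are the placeholders from the original post; the real fixed-effects list is not shown there, so the four names used here are an assumption.

```sas
/* Count records per city and DV level to spot sparse or empty cells. */
proc freq data=DSNAME;
  tables city*DV / norow nocol nopercent;
run;

/* Random intercept for city, estimated with the Laplace approximation. */
/* Fit statistics then come from an approximated marginal likelihood    */
/* rather than the default pseudo-likelihood, so IC values can be used  */
/* to compare candidate models.                                         */
/* indlevelvar1 ... citylevelvar2 are placeholder names (assumption).   */
proc glimmix data=DSNAME method=laplace;
  class city;
  model DV = indlevelvar1 indlevelvar2 citylevelvar1 citylevelvar2
        / dist=mult link=clogit solution;
  random intercept / subject=city;
run;
```

If the PROC FREQ table shows cities with zero or very few records at some DV level, collapsing DV categories before fitting may be worth considering.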

Steve Denham

plf515
Lapis Lazuli | Level 10

Thanks Steve,

Helpful as always.  That ran.  One thing that worries me is that the df is the same for the city level effects and the person level effects.  Is that correct?  I know estimating the df in these models is tricky and full of options.

Peter

SteveDenham
Jade | Level 19

Peter,

I know you are expecting a df around 30 (minus fixed effects estimated) for the city level df, so I think the key here is ddfm=bw (between-within) even though this isn't necessarily a repeated measures design.  I think the default is ddfm=contain, but since you have observations at the nq level, that may be why it ends up using the residual df for everything.  One way to check would be to estimate the BLUPs for each city by adding the solution option to the RANDOM statement and looking at both specifications.
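A minimal sketch of that check, again using the placeholder variable names from the original post (the actual fixed-effects list is an assumption). DDFM=BW is requested on the MODEL statement, and SOLUTION on the RANDOM statement prints the estimated city-level random effects:

```sas
/* Between-within denominator df, plus city-level random-effect solutions. */
/* Placeholder variable names; substitute the real fixed effects.          */
proc glimmix data=DSNAME;
  class city;
  model DV = indlevelvar1 indlevelvar2 citylevelvar1 citylevelvar2
        / dist=mult link=clogit ddfm=bw solution;
  random intercept / subject=city solution;
run;
```

Rerunning the same step with ddfm=contain and comparing the denominator df in the two "Type III Tests of Fixed Effects" tables should show whether the city-level effects are being tested against city-level or residual df.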

Steve Denham

plf515
Lapis Lazuli | Level 10

Thanks again!
