snaqvi
Calcite | Level 5

 

Hello all - I was wondering if I could pick your collective brain about approximating a subdomain analysis using proc glimmix. I am conducting multilevel modeling of a complex probability survey using proc glimmix. I am able to do an overall analysis just fine, using the weight option to specify weights on my two levels of interest. However, I am unable to figure out how to replicate my analysis for a particular subpopulation within my overall dataset. I know that I can't just subset my dataset, and that proc glimmix DOES have a BY statement (but not a DOMAIN statement), and I know that using the BY statement is not considered exactly 'proper' from a technical point of view. But I don't know WHY that is the case. Would you mind helping me better understand why using the BY statement is not appropriate for conducting domain analyses with proc glimmix? Is there anything else you'd recommend I try instead? Your advice, especially if accompanied by references I could consult, would be extremely helpful.

 

Thanks so much!

Accepted Solutions
ballardw
Super User

What about your project makes the SURVEY procs inappropriate?

 

The BY statement in proc glimmix is problematic for the same reason that BY group processing is problematic in the survey procs:

 Note that using a BY statement provides completely separate analyses of the BY groups. It does not provide a statistically valid subpopulation or domain analysis, where the total number of units in the subpopulation is not known with certainty.

(my emphasis)
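
To make the quoted point concrete, here is a minimal Python sketch (not SAS, and not taken from the documentation). It assumes a made-up single-stratum cluster sample of children in six schools, where two schools happen to contain no children from the subpopulation. It computes a weighted domain mean and a Taylor-linearized, with-replacement variance two ways: keeping every sampled school (what a DOMAIN-style analysis does) and dropping out-of-domain rows first (what BY or subsetting does). The data, the function name `domain_mean_and_variance`, and the row layout are all hypothetical, and this is a design-based illustration rather than a GLIMMIX model.

```python
from collections import defaultdict

def domain_mean_and_variance(rows, subset_first):
    """rows: (psu, weight, y, in_domain). Returns (mean, variance, n_psus)."""
    if subset_first:
        # BY-style / subsetting: out-of-domain rows are dropped up front,
        # so PSUs with no domain members vanish from the design.
        rows = [r for r in rows if r[3]]
    psus = sorted({r[0] for r in rows})
    domain_rows = [r for r in rows if r[3]]

    # Weighted domain mean (a ratio estimator).
    w_total = sum(w for _, w, _, _ in domain_rows)
    mean = sum(w * y for _, w, y, _ in domain_rows) / w_total

    # PSU-level linearized totals; PSUs without domain members contribute 0.
    z = defaultdict(float)
    for psu, w, y, _ in domain_rows:
        z[psu] += w * (y - mean) / w_total

    c = len(psus)
    zbar = sum(z[p] for p in psus) / c
    variance = c / (c - 1) * sum((z[p] - zbar) ** 2 for p in psus)
    return mean, variance, c

# Hypothetical sample: (school, weight, outcome, child is in the subpopulation)
rows = [
    (1, 10.0, 1.0, True),  (1, 10.0, 0.0, False),
    (2, 10.0, 0.0, True),  (2, 10.0, 1.0, True),
    (3, 10.0, 1.0, True),  (3, 10.0, 1.0, False),
    (4, 10.0, 0.0, True),  (4, 10.0, 0.0, False),
    (5, 10.0, 1.0, False), (5, 10.0, 0.0, False),  # no domain children
    (6, 10.0, 0.0, False), (6, 10.0, 1.0, False),  # no domain children
]

domain_result = domain_mean_and_variance(rows, subset_first=False)
subset_result = domain_mean_and_variance(rows, subset_first=True)

print("domain-style:", domain_result)
print("subset-style:", subset_result)
```

In this toy example the point estimate is the same either way, but the subset version counts only four schools instead of six, so its variance and its degrees of freedom differ. That is the sense in which the subpopulation size is "not known with certainty": the number of domain members per sampled school is itself a random outcome of the design, and subsetting throws that information away.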

 

You should share your existing code attempts and indicate which variables represent which levels of your population.

 

 

snaqvi
Calcite | Level 5
Hi there, thanks for replying. Using the survey procs isn't appropriate with my data because I am trying to conduct a multilevel analysis. Based on the ICC and design effect, I need to fit an HGLM rather than use survey logistic, since the level 2 variation is actually meaningful given my research question. My dataset is a complex probability survey and consists of children nested within schools. I'm interested in the contribution of school characteristics to child outcomes among a particular subpopulation of children. Hope this helps! And thanks for the explanation about the BY group! Please let me know if you have any other questions or suggestions for me.
snaqvi
Calcite | Level 5

I found the following paper extremely useful when trying to understand the pitfalls of using a BY statement instead of the domain statement for subpopulation analyses of complex survey data. Posting here for others' benefit:

 

 

A closer examination of subpopulation analysis of complex-sample survey data:

 

https://journals.sagepub.com/doi/pdf/10.1177/1536867X0800800404

 
