I am running a series of nested fixed effects models using proc glm with the "absorb" command. To ensure each model is analyzing the same set of observations, I am trying to create a new dataset/output of only the observations that are included in my most restricted model (most variables in the model). Apparently, "Output data set not available when absorption is used."
Are there any other ways to do this??
My most restricted model looks like:
proc glm data=long0f_b;
absorb hhidpn;
model cog = soc6 age baseage*age dummy2020 soc6*dummy2020 / solution;
run;
quit;
Accepted Solutions
You can find the list of IDs that have at least 2 records where all variables in the model are non-missing like this. In the next block of code, I am going to type
... list of variables ...
which you should replace with the actual list of model variables, separated by commas. Note that NMISS works on numeric variables; if any model variable is character, use CMISS instead.
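data is_missing;
    set have; /* your input data set */
    /* miss = 1 if any model variable is missing on this row, else 0 */
    miss = nmiss(... list of variables ...) > 0;
run;
/* count rows per ID, separately for miss=0 and miss=1 */
proc freq data=is_missing;
    tables hhidpn*miss / noprint out=_a_;
run;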
The output data set named _A_ is what you want. It has one row per combination of HHIDPN and MISS, with COUNT giving the number of records. Select all the HHIDPNs where MISS=0 and COUNT>=2.
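To carry that selection through to the models, something like the following sketch might work. It assumes the IS_MISSING and _A_ data sets created above; the data set name ANALYSIS is a made-up name for illustration, and the model shown is the simplest one mentioned in the thread. The ORDER BY is there because PROC GLM expects the input to be sorted by the ABSORB variable.

```sas
/* Keep only complete rows that belong to IDs with at least    */
/* two complete rows. ANALYSIS is a hypothetical name.         */
proc sql;
    create table analysis as
    select *
    from is_missing
    where miss = 0
      and hhidpn in (select hhidpn
                     from _a_
                     where miss = 0 and count >= 2)
    order by hhidpn;   /* ABSORB needs data sorted by hhidpn */
quit;

/* Every model in the nested series can then read the same data set */
proc glm data=analysis;
    absorb hhidpn;
    model cog = soc6 / solution;
run;
quit;
```

Running each of the nested models with data=analysis should then give the same number of observations in every fit, which can be checked against the "Number of Observations Used" line in the PROC GLM output.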
Paige Miller
@mct2181 wrote:
I am running a series of nested fixed effects models using proc glm with the "absorb" command. To ensure each model is analyzing the same set of observations, I am trying to create a new dataset/output of only the observations that are included in my most restricted model (most variables in the model). Apparently, "Output data set not available when absorption is used."
I think that message is pretty clear. But ... the documentation says the same thing: "The GLM procedure cannot produce predicted values or least squares means (LS-means) or create an output data set of diagnostic values if an ABSORB statement is used."
Are there any other ways to do this??
Do what? Why do you feel you need ABSORB? What is the end goal of all of this analysis?
Paige Miller
The end goal of this particular question is to ensure that the model cog = soc6 analyzes only the same subsample of observations that is included in the multivariate models. I had originally thought to use output just to get the list of obs that are included in the most restricted model; I don't need any of the associated statistics like predicted values.
Follow-up questions ... how many levels of the person ID?
I still don't see why you need to get "the list of obs that are included in the most restricted model" via the method you are describing. And why wouldn't an observation be included in that model?
Paige Miller
There are some 4300 different hhidpns. A given hhidpn (participant) will only be included in this proc glm model if there are at least two separate observations (two different rows) with non-missing data for each variable in the model. As I include new variables, some of the HHIDPNs will get dropped from the model, because they don't have enough non-missing data to be included. I want to make sure I'm running every model on the exact same sample of participants. I definitely don't have to use output to get that list - I just don't have an idea of another way to do it. Any suggestions are welcome and thank you so much for your help!
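The inclusion rule described here (at least two rows per hhidpn on which every model variable is non-missing) can also be written as a single query. This is only a sketch: it assumes the variables in the most restricted model (cog, soc6, age, baseage, dummy2020) are all numeric, and COMMON_SAMPLE is a made-up data set name. It relies on PROC SQL remerging the per-ID count back onto the individual rows, which SAS reports with a note in the log.

```sas
/* Rows with no missing model variables, restricted to IDs that */
/* contribute at least two such rows. Assumes numeric variables. */
proc sql;
    create table common_sample as
    select *
    from long0f_b
    where nmiss(cog, soc6, age, baseage, dummy2020) = 0
    group by hhidpn
    having count(*) >= 2
    order by hhidpn;

    /* How many participants and rows remain in the common sample */
    select count(distinct hhidpn) as n_participants,
           count(*)               as n_rows
    from common_sample;
quit;
```

The second query is just a sanity check on how many participants survive the restriction before any model is fitted.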
@mct2181 wrote:
A given hhidpn (participant) will only be included in this proc glm model if there are at least two separate observations (two different rows) with non-missing data for each variable in the model.
Should this say "... at least two separate observations (two different rows) where all model variables are non-missing"?
Paige Miller
Thank you very much!