Lkahr
Calcite | Level 5

Hi,

My sample includes 779 observations and many different variables.  

I have fitted a general linear regression model.

 

Due to missing data in some variables, the model only uses 702 observations.

 

proc glm data=myapaf;
class jobkatreg genderkat ;
model ra=durationew age bmi family_status mg_adl_total isi_total genderkat jobkatreg  /solution clparm;
run;

 

I need to describe these 702 persons (mean age etc.) 

Does anyone know how to generate a dataset of these 702 persons? Preferably within the code above.

 

Thank you!

2 REPLIES
PaigeMiller
Diamond | Level 26
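proc glm data=myapaf;
    class jobkatreg genderkat;
    model ra=durationew age bmi family_status mg_adl_total isi_total genderkat jobkatreg / solution clparm;
    /* write all input records plus the predicted values to PREDICTEDS */
    output out=predicteds p=predicted;
run;
quit;
/* keep only the records that have a predicted value */
data want;
    set predicteds(where=(not missing(predicted)));
run;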

The output data set from PROC GLM contains all of your original variables and a new variable named PREDICTED, which contains the predicted values. PREDICTED will be missing if any one of the terms in the model is missing, so those are the records you don't want. Put the other way round, you want the records where PREDICTED is not missing.

--
Paige Miller
jiltao
SAS Super FREQ

As @PaigeMiller pointed out, you can use the OUTPUT statement to get what you want. However, I would use the residuals rather than the predicted values to identify the observations that are used in the analysis. If the response variable is missing but none of the independent variables is missing, the observation is not used in the analysis, yet you still get a predicted value for it. If you use the residuals (output out=out r=residuals;), you correctly identify the observations included in the analysis: those whose residuals are not missing.
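
Putting the residual-based approach together with the original model, a minimal sketch might look like the following. The data set names OUT and WANT and the variable name RESIDUALS are just illustrative choices, and the PROC MEANS step at the end is an assumption about how you might want to describe the sample (only AGE and BMI are listed here; add whatever variables you need).

```sas
/* Fit the model and write every input record, plus its residual, to OUT */
proc glm data=myapaf;
    class jobkatreg genderkat;
    model ra = durationew age bmi family_status mg_adl_total isi_total
               genderkat jobkatreg / solution clparm;
    output out=out r=residuals;
run;
quit;

/* A residual exists only when the response and all model terms are    */
/* non-missing, so this keeps the observations used in the analysis    */
data want;
    set out(where=(not missing(residuals)));
run;

/* Assumed follow-up step: descriptive statistics for the analysis sample */
proc means data=want n mean std min max;
    var age bmi;
run;
```

If PROC GLM reports 702 observations used, the log should show the same number of observations in WANT, which is a quick way to check that the filter matches the analysis sample.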
