Heejeong
Obsidian | Level 7

Hello,

 

I have a dataset with 2266 participants. After running PROC GLM, however, the number of observations used is only 568. I want to run some basic correlations on this final sample for my manuscript. How can I run a correlation (or descriptive analysis) on just the 568 participants who were included in the final analysis?

 

Thank you in advance for your help!

 

 

 
1 ACCEPTED SOLUTION

ballardw
Super User

Generally, records are excluded because one or more of the variables used on the MODEL statement are missing for those records.

 

One way:

 

Try something like this dummy code, which keeps only the records with none of the variables missing. Replace the have data set with your own data, and replace var1, var2 and var3 with all of the variables on your MODEL statement, including the dependent variable.

data want;
   set have;
   /* keep a record only if every model variable is non-missing */
   where not missing(var1) and not missing(var2) and not missing(var3);
run;
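
Once the filtered data set exists, the correlations and descriptives can be run against it directly. The following is only a sketch: have, want and var1 to var3 are the same placeholder names as above, and the variables are assumed to be numeric so that PROC MEANS and PROC CORR can use them. It also shows CMISS as a more compact way to write the same filter.

```sas
/* Compact equivalent of the filter above: CMISS counts how many of its
   arguments (numeric or character) are missing, so 0 means a complete record. */
data want;
   set have;
   where cmiss(var1, var2, var3) = 0;
run;

/* Descriptive statistics on the complete cases only.
   var1-var3 are placeholders and are assumed to be numeric. */
proc means data=want n mean std min max;
   var var1 var2 var3;
run;

/* Pearson correlations on the same complete-case sample. */
proc corr data=want;
   var var1 var2 var3;
run;
```

The N reported by PROC MEANS should match the "observations used" count from PROC GLM if every variable from the MODEL statement is listed in the filter; if it does not, a variable is probably missing from the list.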

Reeza
Super User

Use the OUTPUT statement to create a data set that contains the model output.

According to the documentation: 

proc glm;
   class a b;
   model y=a b a*b;
   output out=new p=yhat r=resid stdr=eresid;
run;

These statements create an output data set named new. In addition to all the variables from the original data set, new contains the variable yhat, with values that are predicted values of the dependent variable y; the variable resid, with values that are the residual values of y; and the variable eresid, with values that are the standard errors of the residuals.

 

The output data set will contain the data used for the model, which you can identify by the presence of a predicted value. It may even include only those observations, but I can't recall exactly.

 

Then you can use that new data set as input to your correlation or other statistical procedures.

 


ballardw
Super User

@Reeza wrote:

The output data set will contain the data used for the model, which you can identify by the presence of a predicted value. It may even include only those observations, but I can't recall exactly.

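If the ONLY variables missing for a record are the dependent variable(s), that record will still get a predicted value, even though it was not used to fit the model. So a predicted value alone does not identify the analysis sample: you would need to pick out the records that have a predicted value but are missing the dependent variable, and exclude them.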
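
One way to combine the OUTPUT approach with this caveat is to filter on the residual rather than the predicted value. The residual is the observed value minus the predicted value, so it is missing whenever either one is missing. This is a sketch under assumptions, not tested against your data: have, final_sample, x1 and x2 are hypothetical names, and the model is the documentation example quoted earlier in the thread.

```sas
/* Fit the model and write predicted values and residuals to NEW.
   HAVE is a placeholder for the input data set. */
proc glm data=have;
   class a b;
   model y = a b a*b;
   output out=new p=yhat r=resid;
run;
quit;

/* Keep only records with a residual. A record that is missing y can
   still have yhat, but it cannot have a residual. */
data final_sample;
   set new;
   where not missing(resid);
run;

/* Correlations on the analysis sample.
   x1 and x2 are hypothetical numeric variables of interest. */
proc corr data=final_sample;
   var y x1 x2;
run;
```

Checking that the number of observations in final_sample equals the "observations used" count reported by PROC GLM is a quick way to confirm the filter picked up the right records.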

Heejeong
Obsidian | Level 7

Thank you all so much for your fast and helpful comments!

All of your comments helped me address my problem, but I had to choose one, so I chose the very first response.

Thanks again for all your help!!

 

 
