byrne48
Calcite | Level 5

I want to perform a simple regression/ANOVA. I am using complex survey data, so weight, cluster, and strata variables are required. I am also analyzing a subpopulation and need a domain statement.

 

The code I have attempted is:

proc surveyreg data = mydata;
strata sdmvstra;
cluster sdmvpsu;
weight Weightdiet12YR; 
domain mysubpopulation;
class categorical;
model continuous= categorical;
run;

The error I am receiving is: 

NOTE: In data set mydata, total 59842 observations read, 59724 observations with missing values or
non-positive weights are omitted.
NOTE: Strata with only one cluster are collapsed. Use the NOCOLLAPSE option in the STRATA statement
if you do not want to collapse these strata.
NOTE: Strata with only one cluster are collapsed. Use the NOCOLLAPSE option in the STRATA statement
if you do not want to collapse these strata.

 

I have been able to use proc surveymeans and proc surveyfreq without any message about missing values or non-positive weights.

 

Would anyone be able to help me with this code? Is there another way I should perform this test?

 

Thanks!

4 REPLIES
PaigeMiller
Diamond | Level 26

These are not ERRORs; they are NOTEs. They indicate that there may be issues in your data.

 

If your data has lots of missing values (or non-positive weights), then you need to simplify the model or fix the missing values or non-positive weights.

--
Paige Miller
byrne48
Calcite | Level 5
Thanks for the response!
The variable in my domain statement identifies a subpopulation (approx. 27,000 participants); there are no non-positive weights for this variable.
There are, however, missing values. How would I correct this?
Thanks for your help!
PaigeMiller
Diamond | Level 26

You either have to impute values to replace the missings (probably not advisable when it seems like 99% of the data is missing) or remove the variable from the model.
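
Before deciding between the two, it may help to see which variable is responsible for the omitted observations. The following is only a rough sketch using the variable names from the original post; it assumes continuous and Weightdiet12YR are numeric, and it only approximates the conditions named in the NOTE (a missing value in a model variable, or a missing or non-positive weight).

```sas
/* Missing counts for the numeric analysis variables
   (assumes both are numeric) */
proc means data=mydata n nmiss min;
  var continuous Weightdiet12YR;
run;

/* Missing counts for the classification and domain variables;
   the MISSING option shows missing values as their own level */
proc freq data=mydata;
  tables categorical mysubpopulation / missing;
run;

/* Rough count of observations that are complete and have a positive weight.
   CMISS counts missing arguments and accepts numeric and character variables. */
data _null_;
  set mydata end=last;
  if cmiss(continuous, categorical, Weightdiet12YR) = 0
     and Weightdiet12YR > 0 then n_usable + 1;
  if last then put "Usable observations: " n_usable;
run;
```

If the last count is close to the 118 observations that SURVEYREG kept (59842 read minus 59724 omitted), the PROC MEANS and PROC FREQ output should show which variable carries most of the missing values.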

--
Paige Miller
SAS_Rob
SAS Employee

The differences you are seeing in the NOTEs from the SURVEYxxx procedures have to do with how each procedure handles strata that contain a single sampling unit.

If there is only one sampling unit in a stratum, then there is no way to estimate a variance for that stratum directly. You can either remove the offending strata from the variance calculation (as SURVEYMEANS and SURVEYFREQ do) or collapse each of them into another stratum so that no single-unit strata remain (as SURVEYREG does).
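
As the NOTE in the log mentions, the collapsing can be turned off with the NOCOLLAPSE option in the STRATA statement. A minimal sketch, reusing the data set and variable names from the original post (nothing here has been run against that data):

```sas
proc surveyreg data=mydata;
  /* NOCOLLAPSE: do not merge single-cluster strata into other strata */
  strata sdmvstra / nocollapse;
  cluster sdmvpsu;
  weight Weightdiet12YR;
  domain mysubpopulation;
  class categorical;
  model continuous = categorical;
run;
```

This only changes how the single-cluster strata enter the variance estimation. It does not bring back the observations that were omitted for missing values or non-positive weights.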
