sashyd
Calcite | Level 5

Hi there,

I posted this in the SAS Programming community, but I think it is more relevant here, so I will remove the other post.

I'm trying to compare census data with survey data to see whether they follow the same distribution, that is, whether the distribution of a variable is independent of the data source (essentially, how closely the survey data match the census data). One way I'm doing this is with a chi-squared test on one of my categorical variables.

PROC SURVEYFREQ offers the Rao-Scott chi-squared test, which I can use. My survey data has a cluster variable and a strata variable, which is fine, but my census data has neither: it is not a survey but a database of the entire population, so it is already nationally representative. When I run the survey procedure to get a design-adjusted chi-squared test, how can I tell SAS that one of my samples (the census) has no strata or clusters while the other (the survey) does?

 

One thing I tried was assigning every observation in the census data to a single stratum and a single cluster, so that the survey procedure actually runs (it will not include the census records if their strata and cluster values are missing). However, I'm not sure this is statistically valid. Is it?
Here is the code (insurvey is 0 for census records and 1 for survey records):

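proc surveyfreq data=work.mydata;
    strata strata;    /* stratification variable */
    cluster psu;      /* cluster (primary sampling unit) variable */
    weight weight;    /* sampling weight */
    /* chisq requests the Rao-Scott chi-squared test */
    tables insurvey*education / chisq row cl;
run;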
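
For reference, here is a minimal sketch of how the combined dataset for the workaround described above could be built. The input dataset names work.census and work.survey are hypothetical, the design variables are assumed to be numeric, and giving each census record a weight of 1 is an assumption, not something I have confirmed is appropriate.

```sas
/* Sketch only: work.census and work.survey are hypothetical names. */
data work.mydata;
    /* in_census is 1 for rows read from work.census, 0 otherwise */
    set work.census(in=in_census) work.survey;

    insurvey = not in_census;    /* 0 = census record, 1 = survey record */

    if in_census then do;
        strata = 1;              /* single stratum for all census records  */
        psu    = 1;              /* single cluster for all census records  */
        weight = 1;              /* assumption: each census record counts once */
    end;
run;
```

Note that if the survey already uses stratum or cluster value 1, the census records would be pooled with that stratum or cluster, so a value not present in the survey would be needed.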

 
