proctice
Quartz | Level 8

I have data for two survey years. After applying weights, each year separately represents the entire US population for that year. Since they both represent the entire US population, I don't think it is appropriate to stack them and treat them as independent samples for the purpose of comparing years, for example with a chi-square test with year as one of the variables, or a regression with year as a predictor variable. Please let me know what you think.

 

ballardw
Super User

Assuming the samples are independent (the folks in the second year were selected due to something from the previous year) treating the year as an independent variable or category is done quite often. The major concern, if the sample is a complex sample (which many of the national surveys are), is that the appropriate sample design information is provided to the analysis procedure. Hint: Proc Freq is most likely not appropriate. Look at Procs SurveyFreq, SurveyMeans, SurveyLogistic, and SurveyReg.

proctice
Quartz | Level 8

I am using the Survey Procs. With the two years stacked, I end up with an estimated population size that is double the US population.

I am not sure what you mean by this “(the folks in the second year were selected due to something from the previous year)”. I assume they used the same sampling plan both years. I am guessing the samples were independent, in that they probably did not end up with the same respondents for both years. The populations they represent are not independent, though: most of the people living in the US in one year are still living there the next year (minus births, deaths, migration, etc.). Is it only the independence of the samples that’s important? Or should the populations also be independent? Thank you.
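
To make the doubled population estimate concrete, here is a minimal sketch of the arithmetic in plain Python (not SAS). Everything in it is an assumption for illustration: the years, the weights, the outcome values, and the helper names are made up, and each year's weights are chosen to sum to a toy population of 1,000. It only shows how weighted totals and proportions behave when two years are stacked; it says nothing about variance estimation, which still needs the survey design information.

```python
# Hypothetical toy data: (year, weight, outcome).
# The weights are invented so that each year's weights sum to 1,000.
records = [
    (2019, 250.0, 1), (2019, 250.0, 0), (2019, 300.0, 1), (2019, 200.0, 0),
    (2020, 400.0, 1), (2020, 100.0, 0), (2020, 300.0, 0), (2020, 200.0, 1),
]

def weighted_total(rows):
    """Sum of weights: the estimated population size for these rows."""
    return sum(w for _, w, _ in rows)

def weighted_proportion(rows):
    """Weighted share of rows with outcome == 1."""
    return sum(w * y for _, w, y in rows) / weighted_total(rows)

# Stacked file: the weights from both years are added together.
stacked_total = weighted_total(records)

# Year treated as a category: totals and proportions computed within each year.
by_year = {}
for year in sorted({r[0] for r in records}):
    rows = [r for r in records if r[0] == year]
    by_year[year] = (weighted_total(rows), weighted_proportion(rows))

print("stacked weighted total:", stacked_total)
for year, (total, prop) in by_year.items():
    print(year, "total:", total, "proportion:", prop)
```

In this toy example the stacked total is 2,000 (twice the per-year population) simply because the weights of both years are summed, while the totals and proportions computed within each year stay on that year's own scale of 1,000.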

 
