WaynokaStarr1
Calcite | Level 5

Hi all. We are analyzing survey data using PROC SURVEYFREQ. For several different analyses, we have run PROC SURVEYFREQ with multiple variables to the right of the = within table and obtained the appropriate standard errors (SEs).

 

However, we keep running into one issue whose cause we cannot identify. When we run this particular PROC SURVEYFREQ with multiple variables to the right of the = within table, the SEs are correct for every variable except one. When we delete all other variables and run the code with just that one variable, we get the correct SE. The moment we add even one other variable to the list, the SEs for the original variable become incorrect again, and they are off by 10X or more.

 

Note: This exact same variable, and the same variable list, is used in multiple other PROC SURVEYFREQ runs and works perfectly fine. We are at a loss as to the cause of the problem.

 

The data are categorical. The analysis has been double-programmed, and all SEs have been confirmed except in this one situation. We are trying to figure out why it is happening so that we don't repeat the issue in future code.

 

A snippet of the code and output is in the attached pdf. The output from the first piece (only one variable) and from the 3b piece is correct, but the output from the 3a piece (multiple variables) is not. The output shows the correct values (80,759 for the total), and the cells highlighted in yellow are examples of values that differ between runs. For one cell, the correct value is 36,612, but the multiple-variable run shows 314,577. Additionally, and just as odd, the incorrect set has a total SE that is lower than every individual SE.

 

Any guidance that you can offer would be much appreciated!

1 REPLY
ballardw
Super User

I have no clue why you included a bunch of IF statements without the entire data step, much less any actual data to use them with.

 

You should include the LOG from both versions of the PROC SURVEYFREQ code, with all the notes, messages, and warnings included.

 

 

Subsetting data with statements like "where somevar=3;" is not best practice for the survey procedures, and the documentation for those procedures has considerable information about why. You would be better off including that variable in the analysis, sending the whole result to a data set, and then selecting the subset of values of interest.
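
As a rough sketch of what that could look like, assuming hypothetical names throughout (the data set mysurvey, the design variables strat, psu, and wt, and the analysis variables q1 and q2 are placeholders, not taken from your attachment): put somevar into the TABLES request instead of a WHERE statement, capture the crosstab output with ODS OUTPUT, and subset afterwards.

```sas
/* All data set and variable names here are hypothetical placeholders. */
proc surveyfreq data=mysurvey;
   strata  strat;                 /* design strata (assumed) */
   cluster psu;                   /* primary sampling units (assumed) */
   weight  wt;                    /* sampling weight (assumed) */
   /* somevar is a table variable rather than a WHERE filter,
      so every observation stays in the analysis */
   tables somevar*(q1 q2);
   ods output CrossTabs=ct_all;   /* send the crosstab results to a data set */
run;

/* Subset the results, not the input data */
data ct_somevar3;
   set ct_all;
   where somevar = 3;
run;
```

The WHERE statement here only filters rows of the output table, so it does not change which observations the procedure used when it computed the SEs. This assumes somevar is numeric; adjust the comparison if it is character.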
