ANKH1
Pyrite | Level 9

Hello, 

 

We are working with survey data and want to know whether the proportions of categories A, B and C differ for a binary dependent variable (Y/N). Is there a way to test whether the differences in proportions between A and B, and between A and C, are statistically significant? Or can we subset the data and run a simple chi-square test between A and B and another test for A and C?

 

proc surveyfreq data = sample1;
table DV*Categories/col row nostd nowt chisq ;
weight WTS_P;
run;

 

Thank you!

SAS_Rob
SAS Employee

You will need to subset the data to get the 2x2 comparisons and then run SURVEYFREQ separately for each comparison. The trick, however, is that when subsetting you need to treat the groups as true subgroups (i.e., domains) so that the standard errors are correct. To do this, set all the groups not in the comparison to missing and then use the NOMCAR option in the PROC SURVEYFREQ statement.

 

Roughly speaking it would look like this, creating a new variable for each comparison.

 

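data new;
set old;
/* keep only the two groups being compared; blank is the missing value for a character variable */
if category in ('A', 'B') then newcatab=category;
else newcatab=' ';
run;

proc surveyfreq data=new nomcar;
tables newcatab*sex ....;
....
run;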
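
As a rough sketch of how this could extend to several comparisons, the step below builds one variable per comparison from the data set in the first post. The names sample_cmp, cat_ab and cat_ac are made up for illustration. The variables DV, Categories and WTS_P are taken from the first post, and the sketch assumes Categories is a character variable. Because each new variable is assigned only for the two groups of interest, it stays blank (character missing) on every other row.

```sas
/* Hypothetical names: sample_cmp, cat_ab, cat_ac.                 */
/* Assumes Categories is a character variable with levels A, B, C. */
data sample_cmp;
  set sample1;
  /* Rows outside each comparison are left blank, i.e. missing. */
  if Categories in ('A', 'B') then cat_ab = Categories;
  if Categories in ('A', 'C') then cat_ac = Categories;
run;

/* A versus B */
proc surveyfreq data=sample_cmp nomcar;
  tables DV*cat_ab / col row nostd nowt chisq;
  weight WTS_P;
run;

/* A versus C */
proc surveyfreq data=sample_cmp nomcar;
  tables DV*cat_ac / col row nostd nowt chisq;
  weight WTS_P;
run;
```

If the survey design has strata or clusters, the matching STRATA and CLUSTER statements would presumably be needed in each PROC SURVEYFREQ step. None appear in the code posted so far, so they are left out here.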

ANKH1
Pyrite | Level 9

Thank you very much! It worked. 

ANKH1
Pyrite | Level 9

Sorry, it ran, but I am not sure how to interpret the output. I don't think it is comparing the proportions of A and B; it seems to be comparing A with all the other groups lumped together:

This is the code:

 

data sample2;
set sample1;
if CAT in ('A', ' B') then CAT2=CAT;
else CAT2=.;
RUN;

ods graphics off;
proc surveyfreq data=sample2 nomcar;
tables CAT2*SEX/col row nostd nowt wchisq chisq;
weight WTS_P;
run;

 

This is the output:

 

Capture.PNG

 

 

I would like to compare the row percents of A with B, A with C and A with D. This is the table:

Capture2.PNG

 

Is it possible to do these comparisons while making sure, as you said before, that the standard errors are correct?

 

Thank you for your help

 

 

 

SAS_Rob
SAS Employee

There is something odd about the output that you attached. It appears to be for a different table, namely ANIMALCAT2*EAR_F_PROTEIN. Can you double-check that you are looking at the right output? The other thing I would check is that you have subsetted properly. Here is an example using the table you sent.

data test;
do animalcat='A','B','C','D';
do sex='M','F';
input count;
do i=1 to count;
output;
end;
end;
end;
datalines;
67
51
424
166
1612
155
920
30
;

data subset;
set test;
if animalcat in ('A','B') then animalcat2=animalcat;
run;

proc surveyfreq data=subset;
tables animalcat*sex;
run;

proc surveyfreq data=subset nomcar;
tables animalcat2*sex;
run;
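
One way to check the subsetting in the example above, offered as a sketch rather than a required step, is to cross-tabulate the original variable against the new one with the MISSING option of PROC FREQ. Note that the DATA step above has no ELSE branch: a new character variable that is not assigned on a given row stays blank, which SAS treats as missing, so groups C and D should appear only in the blank column.

```sas
/* Cross-tabulate the original grouping variable against the new one. */
/* MISSING makes the blank (missing) level of animalcat2 show up as   */
/* its own column, so rows for C and D should fall only in it.        */
proc freq data=subset;
  tables animalcat*animalcat2 / missing norow nocol nopercent;
run;
```

The same check works on a real data set by swapping in its name and variable names, with no need to re-enter the data through datalines.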

ANKH1
Pyrite | Level 9

Hello, 

Sorry, I didn't copy the right output. How can I check that I am subsetting the data correctly? It would be more efficient if I did not have to copy the datalines, since this is a huge data set and I have to run the tests multiple times with different dependent variables. You mentioned before that it is important to set the groups that are not being compared to missing. Why is that step missing from the last code?

This is the code that I used:

data sample2;
set sample1;
if ANIMALCAT in ('A', 'B') then ANIMALCAT2=ANIMALCAT;
RUN;

 

ods graphics off;
proc surveyfreq data=sample1;
tables ANIMALCAT*EAR_F_PROTEIN/col row nostd nowt wchisq chisq;
weight WTS_P;
run;

 

ods graphics off;
proc surveyfreq data=sample2 nomcar;
tables ANIMALCAT2*EAR_F_PROTEIN/col row nostd nowt wchisq chisq;
weight WTS_P;
run;

 

This is the output for the last proc. 

Capture.PNG

How can I compare A and B, A and C and A and D?

 

Thank you in advance

 

ANKH1
Pyrite | Level 9

Hi, 

 

I figured out what was wrong: I left a space between the quotation mark and the category B.

data sample2;
set sample1;
if ANIMALCAT in ('A', '  B') then ANIMALCAT2=ANIMALCAT;
else ANIMALCAT2='.';
RUN;

 

Now without the space I get the right output:

 

data sample2;
set sample1;
if ANIMALCAT in ('A', 'B') then ANIMALCAT2=ANIMALCAT;
else ANIMALCAT2='.';
RUN;

 

Capture.PNG

Capture.PNG

 

So this is comparing A and B taking into account the other two groups for standard errors, right?

 

Please confirm I am interpreting this correctly.
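
One detail that may be worth verifying, assuming ANIMALCAT is a character variable as the IN list suggests: in a DATA step, a quoted period ('.') is an ordinary character value, not a missing value. The missing value for a character variable is a blank. With ELSE ANIMALCAT2='.', the rows for the other groups could therefore be counted as a third level instead of being excluded and handled by NOMCAR. A minimal sketch of the blank-based version, followed by a quick check of the resulting levels:

```sas
/* Assumes ANIMALCAT is character. A blank, not '.', is the */
/* missing value for a character variable.                  */
data sample2;
  set sample1;
  if ANIMALCAT in ('A', 'B') then ANIMALCAT2 = ANIMALCAT;
  else ANIMALCAT2 = ' ';
run;

/* List the levels of ANIMALCAT2, including the missing level. */
proc freq data=sample2;
  tables ANIMALCAT2 / missing;
run;
```

If the frequency table lists only A, B and a blank level, the groups outside the comparison are being treated as missing, which is what the NOMCAR approach described earlier relies on.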

 

Thank you again

 

 

 

 
