Lysegroentblad
Obsidian | Level 7

Hi,

 

I am looking to compare the severity of deer damage between two sites, but I am confused as to how to go about it. Basically, the data looks like this:

Site              Severity
Non-intervention  0
Non-intervention  0
Non-intervention  1
Non-intervention  2
Non-intervention  0
Non-intervention  2
Non-intervention  1
Non-intervention  1
Non-intervention  2
Non-intervention  2
Non-intervention  0
Non-intervention  1
Non-intervention  2
Non-intervention  0
Planted           0
Planted           1
Planted           2
Planted           2
Planted           0
Planted           1
Planted           2

 

I want to test whether the scores of zero differ between the two sites, and likewise for the scores of 1 and 2 individually (so one p-value each for 0, 1 and 2). The number of observations in the non-intervention site is very different from that in the planted site (900 vs. 240), so I assume it is the frequency I am looking to compare.

I have tried writing a chi-square test in this manner:

 

proc freq data=Thesis;
tables severity*site / chisq nocol norow nopercent;
weight severity;
where severity='1';
run;

 

However, when I do this all it gives me is this:

Lysegroentblad_0-1653485637347.png

Which is just the number of observations that equals 1 in each site.

 

I also tried doing this:

 

proc glm data=Thesis;
class site;
model severity = site;
by severity;
run;

 

That gives me this, which is not useful at all:

Lysegroentblad_1-1653485897939.png

 

Any suggestions?

 

Best regards

 

1 ACCEPTED SOLUTION

PaigeMiller
Diamond | Level 26

First, there is a statistical problem here, in that these comparisons are not independent of one another, and so there are really NOT three different independent comparisons.

 

If you perform the overall chi-squared test for your 2x3 table, this will tell you whether or not there is any pattern in the data other than independence of the percents in the table.
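For reference, the statistic behind that overall test is the standard Pearson chi-square. The notation (O for observed counts, E for expected counts, n for the total number of observations) is added here for illustration and does not come from the SAS output:

```latex
\chi^2 = \sum_{i=1}^{2}\sum_{j=1}^{3}\frac{(O_{ij}-E_{ij})^2}{E_{ij}},
\qquad
E_{ij} = \frac{(\text{row}_i\ \text{total})\,(\text{column}_j\ \text{total})}{n},
\qquad
\text{df} = (2-1)(3-1) = 2
```

Because each expected count is scaled by that site's own total, the test compares the percentage distributions of the two sites rather than the raw counts, so unequal sample sizes (900 vs. 240) are already accounted for.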

 

proc freq data=have;
     tables site*severity/chisq;
run;
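
If the data are already summarized as one row per cell, the same test can be run with a WEIGHT statement that names the count variable (not the severity score). A minimal sketch, assuming a hypothetical data set `counts` with a variable `Count`; the counts are tallied from the 21 sample rows posted in the question, so they only illustrate the syntax:

```sas
/* Hypothetical summary data: one row per Site x Severity cell.   */
/* Counts are tallied from the 21 sample rows in the question.    */
data counts;
    length Site $16;              /* avoid truncating "Non-intervention" */
    input Site $ Severity Count;
    datalines;
Non-intervention 0 5
Non-intervention 1 4
Non-intervention 2 5
Planted 0 2
Planted 1 2
Planted 2 3
;

proc freq data=counts;
    weight Count;                 /* each row stands for Count observations */
    tables Site*Severity / chisq nocol nopercent;  /* show row percents only */
run;
```

With so few observations most expected cell counts fall below 5, so PROC FREQ will warn that the chi-square may not be a valid test; with the full data (roughly 900 and 240 rows) that should be much less of a concern.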

 

If this chi-squared test is not statistically significant, I would stop there and say there is no difference anywhere. If the chi-square is statistically significant, then you can certainly do the tests in your table (although I can't think of a quick way to do all three).

 

You can compare the first row (54.48 to 68.33) using this code and then you'd have to repeat and modify the code to do all three.

 

proc format;
    value sev 0='0' 1,2='Other';
run;
proc freq data=have;
     tables site*severity/chisq;
     format severity sev.;
run;
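
The "repeat and modify" step might look like the sketch below. The format names `sevone` and `sevtwo` are made up for illustration, and, like the code above, this assumes severity is a numeric variable:

```sas
proc format;
    value sevone 1='1' 0,2='Other';   /* severity 1 vs. everything else */
    value sevtwo 2='2' 0,1='Other';   /* severity 2 vs. everything else */
run;

/* 2x2 test for the second row of the table (severity 1) */
proc freq data=have;
    tables site*severity / chisq;
    format severity sevone.;
run;

/* 2x2 test for the third row of the table (severity 2) */
proc freq data=have;
    tables site*severity / chisq;
    format severity sevtwo.;
run;
```

Each run produces a 2x2 table with its own chi-square p-value.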

Again, I point out that doing all three of these tests isn't really statistically valid, as the tests are not independent of one another.

--
Paige Miller


7 REPLIES
PaigeMiller
Diamond | Level 26

I am looking to compare if the scores of zero are different between the two sites (p-value for 0, 1 and 2). ... The number of observations in the non-intervention compared to the planted are very different (900 vs. 240), so I assume it is the frequency I am looking to compare.

 

I assume you mean: are the percents of zero scores different between the two sites ... 

 

If so, I'm unclear on another issue. If you limit the data using

 

where severity='1';

 

as in your code, then you are comparing the four times 1 appears next to 'Non-intervention' to the two times 1 appears next to 'Planted', so that's 4 out of 6, which is 66.7% for 'Non-intervention'. Is that what you want? And then you want to test the 66.7% against the null hypothesis of 50%?

 

If no, can you please describe in more detail (using words and math, not in terms of SAS) what test you are trying to do?

--
Paige Miller
Lysegroentblad
Obsidian | Level 7

Yes, I do mean percentages, because the difference in the size of the datasets will give me a significant difference between the frequencies of the sites. The number of observations in the non-intervention site is approx. 900, while it is 250 in the planted site. So I will, as an example, have 300 observations of severity 0 in the non-intervention site but only 50 in the planted site, though the percentages are not far apart.

Is it possible to do what @PaigeMiller did (because that works), only with percentages?

 

Best regards

PaigeMiller
Diamond | Level 26

So what do you want to test? You didn't answer that question.

 

Do you want to test 300/900 compared to 50/250?

 

Or do you want to test 300/350 compared to 50/350?
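
Worked out, the two options are very different quantities (using the example counts from the post above):

```latex
\text{share of 0 scores within each site:}\quad
\frac{300}{900} \approx 33.3\% \quad\text{vs.}\quad \frac{50}{250} = 20.0\%
\\[6pt]
\text{share of all 0 scores coming from each site:}\quad
\frac{300}{350} \approx 85.7\% \quad\text{vs.}\quad \frac{50}{350} \approx 14.3\%
```

The first compares percentages within each site, so the unequal site sizes do not drive the result; the second mostly reflects how many observations each site has.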

--
Paige Miller
Lysegroentblad
Obsidian | Level 7
Severity  Non-intervention (%)  Planted (%)
0         54.48                 68.33
1         26.12                 16.25
2         19.3                  15.42

I want to compare these numbers to test if they are significantly different, meaning 1) is 54.48 % significantly different from 68.33 % 2) is 26.12 % significantly different from 16.25 % 3) is 19.3 % significantly different from 15.42 %.

It does not make sense to compare the frequencies that created these percentages, since the non-intervention site has 900 observations and the planted site has 250. I did that and it says everything is significantly different (p < 0.0001).

What I want to do may not be possible, though. I am aware of that.

 

Maja

Rick_SAS
SAS Super FREQ

If I understand your question, you are trying to do a one-way analysis on the frequency of observations for each of the levels Severity=0, 1, and 2. Try this code:

data Have;
length Site $16;   /* default length of 8 would truncate "Non-intervention" */
input Site $ Severity;
datalines;
Non-intervention 0
Non-intervention 0
Non-intervention 1
Non-intervention 2
Non-intervention 0
Non-intervention 2
Non-intervention 1
Non-intervention 1
Non-intervention 2
Non-intervention 2
Non-intervention 0
Non-intervention 1
Non-intervention 2
Non-intervention 0
Planted 0
Planted 1
Planted 2
Planted 2
Planted 0
Planted 1
Planted 2
;

proc sort data=Have;  by Severity; run;

proc freq data=Have;
by Severity;
tables Site / chisq plots=none;
run;
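
One caveat: by default, a one-way CHISQ test uses equal proportions as its null hypothesis, which can be rejected simply because one site has more observations than the other. A possible variation, offered only as a sketch, is to supply the sites' overall shares through the TESTP= option. The values below (14 of 21 and 7 of 21) come from the sample data above and are purely illustrative; they are listed in the order of the Site levels (Non-intervention, then Planted) and must sum to 1. This assumes Have is still sorted by Severity:

```sas
proc freq data=Have;
    by Severity;
    /* Null hypothesis: within each severity level, observations split  */
    /* about 2/3 vs. 1/3 across sites, matching each site's share of    */
    /* the sample (14 and 7 out of 21).                                  */
    tables Site / chisq testp=(0.6667 0.3333) plots=none;
run;
```

For the real data, the TESTP= values would need to be replaced with each site's share of the total number of observations.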
