This may have a simple answer, but I could not find what I am looking for in my internet searches, so I thought I would ask here.
I have two datasets, one for 2010 and one for 2011. Each has an up-to-date (UTD) vaccine variable.
I want to see whether the rate of individuals who were up to date in 2010 is significantly different from the rate in 2011.
Year 2011 UTD Variable
UTD   Count
0     30
1     80
72.73% were up to date (80 of 110)
Year 2010 UTD Variable
UTD   Count
0     50
1     75
60.00% were up to date (75 of 125)
I was able to test whether the difference between those who were up to date and those who were not was statistically significant by using PROC FREQ with the CHISQ option.
I am just not sure how to compare rates from two different datasets.
Thank you for any help you can provide.
Not sure what test you want. Here is how to do a CHISQ test.
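data have;
   input year utd count;   /* one row per year-by-UTD cell, with its count */
datalines;
2011 0 30
2011 1 80
2010 0 50
2010 1 75
;
run;

proc freq data=have;
   weight count;            /* treat COUNT as the cell frequency */
   tables year*utd / chisq; /* chi-square test of year by UTD status */
run;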
The FREQ Procedure

Table of year by utd

year      utd

Frequency|
Percent  |
Row Pct  |
Col Pct  |       0|       1|  Total
---------+--------+--------+
    2010 |     50 |     75 |    125
         |  21.28 |  31.91 |  53.19
         |  40.00 |  60.00 |
         |  62.50 |  48.39 |
---------+--------+--------+
    2011 |     30 |     80 |    110
         |  12.77 |  34.04 |  46.81
         |  27.27 |  72.73 |
         |  37.50 |  51.61 |
---------+--------+--------+
Total          80      155      235
            34.04    65.96   100.00

Statistics for Table of year by utd

Statistic                     DF       Value      Prob
------------------------------------------------------
Chi-Square                     1      4.2210    0.0399
Likelihood Ratio Chi-Square    1      4.2567    0.0391
Continuity Adj. Chi-Square     1      3.6732    0.0553
Mantel-Haenszel Chi-Square     1      4.2031    0.0404
Phi Coefficient                       0.1340
Contingency Coefficient               0.1328
Cramer's V                            0.1340

Fisher's Exact Test
----------------------------------
Cell (1,1) Frequency (F)        50
Left-sided Pr <= F          0.9861
Right-sided Pr >= F         0.0273

Table Probability (P)       0.0134
Two-sided Pr <= P           0.0531

Sample Size = 235
As far as I know, you can't do it while the data sit in two separate datasets.
You'll need to get them into a single dataset. There are many ways to do this, but one is to SET the two datasets together, add a year variable, and then run PROC FREQ.
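The stacking approach described above could look something like the sketch below. The dataset names dataset2010 and dataset2011 and the 0/1 variable UTD are assumptions about how the data are stored; adjust them to match the real tables.

```sas
/* Assumed inputs: dataset2010 and dataset2011, one row per individual, */
/* each with a 0/1 variable named UTD.                                  */
data both;
   set dataset2010 (in=in2010)
       dataset2011;
   /* Tag each row with the year of the dataset it came from */
   if in2010 then year = 2010;
   else year = 2011;
run;

/* No WEIGHT statement is needed here because each row is one individual */
proc freq data=both;
   tables year*utd / chisq;
run;
```

This treats the two years as independent samples, which is the same assumption the chi-square test on the summary counts makes.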
How about PROC COMPARE? This assumes each table has one row per individual and that a common key identifies the individuals. You can also configure the method used to identify differences and the size of difference that counts (the METHOD= and CRITERION= options).
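proc compare base     = dataset2010
             compare  = dataset2011
             out      = difs
             OUTNOEQUAL LISTEQUALVAR LISTCOMPVAR LISTBASEVAR
             MAXPRINT = 300;
   id individual_ID;   /* common key; both datasets should be sorted by it */
   var rate;           /* variable whose values are compared */
   where UTD = 1;      /* restrict to individuals who are up to date */
run;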
I will have to try this method and see how it works. I have not yet been able to try it on my data, but I will let you know if it works out. Thank you for the suggestion!
There is a problem: correlation.
If the two years' measurements were taken on the same patients, then the observations are correlated.
You cannot feed these data directly into PROC FREQ as if they were independent samples.
You need to work with the within-patient differences between the two years to remove this correlation.
Ksharp
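One common way to handle paired 0/1 outcomes like this is McNemar's test, which PROC FREQ reports through the AGREE option on a 2x2 table. The sketch below is only an illustration and swaps in McNemar's test rather than literal subtraction. It assumes both datasets share a key named individual_ID (borrowed from the PROC COMPARE suggestion) and a 0/1 variable UTD. The posted totals (125 versus 110) show the two years do not contain exactly the same people, so only individuals present in both years can be paired.

```sas
/* Assumed inputs: dataset2010 and dataset2011, keyed by individual_ID, */
/* each with a 0/1 variable UTD.                                        */
proc sort data=dataset2010 out=s2010; by individual_ID; run;
proc sort data=dataset2011 out=s2011; by individual_ID; run;

data paired;
   merge s2010 (in=a keep=individual_ID utd rename=(utd=utd2010))
         s2011 (in=b keep=individual_ID utd rename=(utd=utd2011));
   by individual_ID;
   if a and b;   /* keep only individuals observed in both years */
run;

/* AGREE requests McNemar's test for the paired 2x2 table */
proc freq data=paired;
   tables utd2010*utd2011 / agree;
run;
```

If the two years are actually different groups of people, the pairing step does not apply and the ordinary chi-square test is the relevant one.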
Thank you for pointing this out, Ksharp. I need to keep this in mind and take the correlated effect into account.
I've never tried this, but I just found it in a 2007 SAS-L thread. I think you could get the counts from running the two PROC FREQs and then apply them to a "proportions test".
Here is what I found in that 2007 thread:
Try the SAS built-in tool for the proportion test. Go to Solutions-->Analysis-->Analyst to open the Analyst window, then Statistics-->Hypothesis Tests-->One-Sample (or Two-Sample) Test for Proportions. You can only test the equality of proportions between two regions at a time (I am not sure whether this is really a limit or I just haven't found the right option).
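For the counts posted in the question, a two-sample test for proportions can be worked by hand as a check. The version below is the standard pooled z statistic; whether the Analyst window uses exactly this form is an assumption.

```latex
\begin{align*}
\hat p_{2010} &= \frac{75}{125} = 0.6000, \qquad
\hat p_{2011} = \frac{80}{110} \approx 0.7273 \\
\hat p &= \frac{75 + 80}{125 + 110} = \frac{155}{235} \approx 0.6596 \\
z &= \frac{\hat p_{2011} - \hat p_{2010}}
          {\sqrt{\hat p\,(1 - \hat p)\left(\frac{1}{125} + \frac{1}{110}\right)}}
   \approx \frac{0.12727}{0.06195} \approx 2.054 \\
z^2 &\approx 4.22
\end{align*}
```

The squared z statistic equals the Pearson chi-square value (4.2210) in the PROC FREQ output earlier in the thread, so the two-sided p-value is the same 0.0399. For a 2x2 table the two tests are equivalent.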