HyunJee
Fluorite | Level 6

I know this may be a simple answer, but I cannot seem to find what I am looking for via my internet searches and thought I would ask here.

I have two datasets. Each dataset has an up to date vaccine variable. There is one dataset for year 2010 and one dataset for year 2011.

I want to test whether the proportion of individuals who were up to date in 2010 differs significantly from the proportion who were up to date in 2011.

Year 2011
  UTD   Count
  0     30
  1     80
72.73% were up to date (80 of 110)

Year 2010
  UTD   Count
  0     50
  1     75
60.00% were up to date (75 of 125)

Within each year, I was able to test whether the difference between those who were up to date and those who were not was statistically significant by using proc freq with the chisq option.

I am just not sure how to compare rates from two different datasets.

Thank you for any help you can provide.

ACCEPTED SOLUTION

Tom
Super User

Not sure what test you want.  Here is how to do a CHISQ test.

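data have;
  /* one row per year-by-UTD cell; count is the cell frequency */
  input year utd count;
cards;
2011 0 30
2011 1 80
2010 0 50
2010 1 75
;
run;

proc freq data=have;
  weight count;              /* each row stands for count individuals */
  tables year*utd / chisq;   /* chi-square tests of year by UTD status */
run;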

The FREQ Procedure

Table of year by utd

year      utd

Frequency|
Percent  |
Row Pct  |
Col Pct  |       0|       1|  Total
---------+--------+--------+
    2010 |     50 |     75 |    125
         |  21.28 |  31.91 |  53.19
         |  40.00 |  60.00 |
         |  62.50 |  48.39 |
---------+--------+--------+
    2011 |     30 |     80 |    110
         |  12.77 |  34.04 |  46.81
         |  27.27 |  72.73 |
         |  37.50 |  51.61 |
---------+--------+--------+
Total          80      155      235
            34.04    65.96   100.00

Statistics for Table of year by utd

Statistic                     DF       Value      Prob
------------------------------------------------------
Chi-Square                     1      4.2210    0.0399
Likelihood Ratio Chi-Square    1      4.2567    0.0391
Continuity Adj. Chi-Square     1      3.6732    0.0553
Mantel-Haenszel Chi-Square     1      4.2031    0.0404
Phi Coefficient                       0.1340
Contingency Coefficient               0.1328
Cramer's V                            0.1340

       Fisher's Exact Test
----------------------------------
Cell (1,1) Frequency (F)        50
Left-sided Pr <= F          0.9861
Right-sided Pr >= F         0.0273
Table Probability (P)       0.0134
Two-sided Pr <= P           0.0531

Sample Size = 235
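
As a quick sanity check on the Chi-Square line above, the Pearson statistic for a 2x2 table can be recomputed by hand from the four cell counts. This is only a sketch: the dataset name chisq_check and the variables a, b, c, d are made up for illustration, and the counts are the ones posted in the question.

```sas
/* Hand check of the Pearson chi-square for the 2x2 table above.      */
/* chisq_check and a, b, c, d are illustrative names, not from the    */
/* original post.                                                     */
data chisq_check;
  a = 50; b = 75;    /* 2010: not up to date, up to date */
  c = 30; d = 80;    /* 2011: not up to date, up to date */
  n = a + b + c + d;
  /* 2x2 shortcut: n*(ad - bc)**2 divided by the product of margins */
  chisq = n * (a*d - b*c)**2 / ((a + b) * (c + d) * (a + c) * (b + d));
  p_value = 1 - probchi(chisq, 1);   /* upper tail, 1 degree of freedom */
run;

proc print data=chisq_check noobs;
  var chisq p_value;
  format chisq p_value 8.4;
run;
```

The printed values should agree with the Chi-Square row of the PROC FREQ output (4.2210 and 0.0399).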


7 REPLIES
Reeza
Super User

As far as I know, you can't do this while the data sit in two separate datasets.

You'll need to get them into a single dataset. There are many ways to do this; one is to stack the two datasets with a SET statement, add a year variable, and then run proc freq.
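
A minimal sketch of that stacking approach, assuming the two datasets are named dataset2010 and dataset2011 (names borrowed from the PROC COMPARE reply in this thread), that each has one row per individual, and that the 0/1 variable is called UTD. The output dataset name both and the flag in2010 are placeholders.

```sas
/* Stack the two yearly datasets and record which one each row came from. */
/* dataset2010 / dataset2011 / UTD are assumed names; adjust to your data. */
data both;
  set dataset2010(in=in2010) dataset2011;
  if in2010 then year = 2010;   /* row was read from the 2010 dataset */
  else year = 2011;             /* otherwise it came from 2011        */
run;

/* One row per individual, so no WEIGHT statement is needed here. */
proc freq data=both;
  tables year*UTD / chisq;
run;
```

This treats the two years as independent samples. If the same individuals appear in both years, the independence assumption behind the chi-square test does not hold.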

SASKiwi
PROC Star

How about PROC COMPARE? This assumes you have one row in each table for each individual and you have a common key identifying the individuals. You can also configure the method used to identify differences and the size of the difference.

proc compare base = dataset2010
             compare = dataset2011
             out = difs
             OUTNOEQUAL LISTEQUALVAR LISTCOMPVAR LISTBASEVAR
             MAXPRINT=300
             ;
  id individual_ID;
  var rate;
  where UTD = 1;
run;

HyunJee
Fluorite | Level 6

I will have to try this method and see how it works. I have not yet been able to try it on my data, but I will let you know if it works out. Thank you for the suggestion!

Ksharp
Super User

There is a problem: correlation.

If the two years' measurements were taken on the same patients, then the observations are correlated.

You cannot feed these data directly into proc freq as if they were independent samples.

You need to work with the within-patient differences between the two years to remove this correlated effect.

Ksharp
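
One common way to handle the paired case is McNemar's test, which PROC FREQ produces with the AGREE option on a 2x2 table of the two years' values. It is not necessarily what was meant by "subtract" above, but it uses only the individuals whose status changed between years. The sketch below assumes both datasets share a key called individual_ID (the name used in the PROC COMPARE reply) and a 0/1 variable UTD; s2010, s2011, paired, utd2010 and utd2011 are illustrative names.

```sas
/* Sort copies of each year by the shared key (originals are left untouched). */
proc sort data=dataset2010 out=s2010; by individual_ID; run;
proc sort data=dataset2011 out=s2011; by individual_ID; run;

/* Put each individual's 2010 and 2011 status side by side. */
data paired;
  merge s2010(keep=individual_ID UTD rename=(UTD=utd2010) in=a)
        s2011(keep=individual_ID UTD rename=(UTD=utd2011) in=b);
  by individual_ID;
  if a and b;   /* keep only individuals observed in both years */
run;

/* AGREE requests McNemar's test for a 2x2 table of paired responses. */
proc freq data=paired;
  tables utd2010*utd2011 / agree;
run;
```

If the two years contain different individuals, this does not apply and the independent-samples chi-square test is the appropriate comparison.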

HyunJee
Fluorite | Level 6

Thank you for pointing this out, Ksharp. I need to keep this in mind and take the correlated effect into account.

art297
Opal | Level 21

I've never tried this, but just found it in a 2007 SAS-L thread.  I think you should just get the numbers from running the two proc freqs and then apply them to a "proportions test".

Here is what I discovered in that 2007 thread:

Try the SAS built-in tool for proportion test.  It's under Solutions-->Analysis-->Analyst to open the Analyst Window; then Statistics-->Hypothesis Tests--> One-Sample (or Two-Sample) Test for Proportions.  You can only test the equality of proportions between two regions each time (I am not sure if this is true or I haven't just found the right option)
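
If the point-and-click tool is not available, a two-sample test for proportions can also be sketched in a short data step using the counts from the two proc freq runs. This is the standard pooled z test, not the Analyst tool's own code. The dataset name prop_test and the variable names are made up for illustration, and the counts are the ones posted in the question.

```sas
/* Pooled two-sample z test for proportions, computed from summary counts. */
/* prop_test and all variable names here are illustrative.                 */
data prop_test;
  x1 = 75; n1 = 125;   /* 2010: up to date, total */
  x2 = 80; n2 = 110;   /* 2011: up to date, total */
  p1 = x1 / n1;
  p2 = x2 / n2;
  p_pooled = (x1 + x2) / (n1 + n2);   /* common proportion under H0 */
  se = sqrt(p_pooled * (1 - p_pooled) * (1/n1 + 1/n2));
  z = (p1 - p2) / se;
  p_value = 2 * (1 - probnorm(abs(z)));   /* two-sided p-value */
run;

proc print data=prop_test noobs;
  var p1 p2 z p_value;
  format p1 p2 z p_value 8.4;
run;
```

With these counts z comes out at about -2.05. For a 2x2 table the square of this z statistic equals the Pearson chi-square, so the two-sided p-value should match the 0.0399 reported by proc freq with the chisq option.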
