d6k5d3
Pyrite | Level 9

I have 2 datasets. Dataset X looks like this:

 

Jmp_o   Nws_r
1       0
0       1
1       1
0       1
0       1
1       0
1       0
...    ...

From dataset X I calculate the conditional probability P(jmp_o=1|nws_r=1). There is another dataset, Y, which looks like this:

Jmp_o   Nws_r
1       0
0       0
1       0
0       0
0       0
1       0
1       0
...    ...

From dataset Y I calculate the unconditional probability P(jmp_o=1).
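To make the two quantities concrete, here is a minimal sketch in Python that estimates them as relative frequencies. It uses only the seven rows of each dataset that are visible above (the real datasets continue past the "..."), and the names `x_rows`, `y_rows`, `conditional_prob` and `unconditional_prob` are made up for illustration.

```python
# Only the rows visible in the post; the real datasets are longer ("...").
# Each tuple is (jmp_o, nws_r).
x_rows = [(1, 0), (0, 1), (1, 1), (0, 1), (0, 1), (1, 0), (1, 0)]
y_rows = [(1, 0), (0, 0), (1, 0), (0, 0), (0, 0), (1, 0), (1, 0)]


def conditional_prob(rows):
    """Estimate P(jmp_o=1 | nws_r=1): share of jmp_o=1 among rows with nws_r=1."""
    jumps = [jmp for jmp, nws in rows if nws == 1]
    return sum(jumps) / len(jumps)


def unconditional_prob(rows):
    """Estimate P(jmp_o=1): share of jmp_o=1 among all rows."""
    return sum(jmp for jmp, _ in rows) / len(rows)


p_cond = conditional_prob(x_rows)      # 1 of the 4 rows with nws_r=1
p_uncond = unconditional_prob(y_rows)  # 4 of the 7 rows

print(f"P(jmp_o=1 | nws_r=1) in X: {p_cond:.4f}")
print(f"P(jmp_o=1) in Y:           {p_uncond:.4f}")
```

These are point estimates only; on their own they say nothing about whether the difference is statistically meaningful.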

 

I want to test whether these two probabilities are statistically different (by means of a p-value).

What test should I perform?

 

Much thanks.

Accepted Solution
PGStats
Opal | Level 21

So, all you need is dataset X. Run a Fisher test (proc freq) between jmp_o and nws_r. This will tell you whether the two vars are related in your sample.
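As a rough illustration of what that test computes, here is a minimal sketch of a two-sided Fisher exact test in Python. It is not SAS output: it uses only the seven rows of dataset X visible in the question, and the helper name `fisher_exact_two_sided` is made up for this example. In SAS itself, the FISHER option on the TABLES statement of PROC FREQ (for example `tables jmp_o*nws_r / fisher;`) requests the test.

```python
from math import comb

# Only the rows of dataset X visible in the question; the real data is longer.
# Each tuple is (jmp_o, nws_r).
x_rows = [(1, 0), (0, 1), (1, 1), (0, 1), (0, 1), (1, 0), (1, 0)]

# 2x2 table: rows are jmp_o = 1 / 0, columns are nws_r = 1 / 0.
a = sum(1 for j, n in x_rows if j == 1 and n == 1)
b = sum(1 for j, n in x_rows if j == 1 and n == 0)
c = sum(1 for j, n in x_rows if j == 0 and n == 1)
d = sum(1 for j, n in x_rows if j == 0 and n == 0)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    With all margins fixed, the top-left cell follows a hypergeometric
    distribution. The p-value sums the probabilities of every table that is
    no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Probability that the top-left cell equals k given the margins.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance so floating-point ties count as "as extreme".
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


p_value = fisher_exact_two_sided(a, b, c, d)
print(f"table = [[{a}, {b}], [{c}, {d}]], two-sided p = {p_value:.4f}")
```

A small p-value would suggest that P(jmp_o=1|nws_r=1) and P(jmp_o=1|nws_r=0) differ. With only seven rows the test has very little power, so the figure here is purely illustrative.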

PG

PaigeMiller
Diamond | Level 26

This isn't a case where statistical testing is appropriate. The formulas used are different, so the results are mathematically different.

 

Statistics would be used only if sampling differences caused different results.

--
Paige Miller
d6k5d3
Pyrite | Level 9

Yes, the two datasets are 2 different samples.

d6k5d3
Pyrite | Level 9
And they are independent.
PaigeMiller
Diamond | Level 26

@d6k5d3 wrote:

Yes, the two datasets are 2 different samples.


This is not clear to me based upon your original explanation.

 

Please explain further.

--
Paige Miller
d6k5d3
Pyrite | Level 9
First I have dataset X, from which I calculate the conditional probability. Then from dataset X I create a subsample containing only the rows with Nws_r=0; this is dataset Y. From dataset Y I calculate the unconditional probability. Now I need to check whether the two probabilities are different.
PaigeMiller
Diamond | Level 26

I stick with my previous statement that this is not a case where statistical testing is appropriate.

 

If you have a sample of people, and you measure their heights in inches, and then you take an independent sample and measure their heights in centimeters, you would not do a statistical test to determine if the average height in inches differs from the average height in centimeters. You would just assume they are different because a different measurement was used.

--
Paige Miller
