I am writing my graduate paper on drug-drug interactions and need to calculate p-values. Right now I have the degrees of freedom (df) and the chi-squared test value. How can I accurately calculate p-values in the range from E-100 to E-200?
This photo clearly describes what I do right now.
Hello @kennychang and welcome to the SAS Support Communities!
I'm curious as to how the p-values in your table are defined because they don't seem to match any of those produced by PROC FREQ (with the CHISQ option of the TABLES statement), applied to the 2x2 tables the odds ratios have been calculated for.
That said, if you have the quantiles and degrees of freedom (neither of which is shown in your table), you can use the SDF function to compute the p-values. (Or do you question the accuracy of the SDF function for large quantiles?)
Example:
data have;
input q df;
cards;
923.67 1
705.61 2
548.09 3
;
data want;
set have;
p=sdf('chisq',q,df);
run;
proc print data=want;
format p e8.;
run;
Result:
Obs      q       df        p
  1    923.67     1    7.0E-203
  2    705.61     2    6.0E-154
  3    548.09     3    1.8E-118
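For readers without SAS at hand, these three values can be cross-checked independently. The following is a minimal Python sketch, not part of the original reply, and it assumes integer degrees of freedom. The function name chi2_sf is hypothetical. It uses only the standard library and the known closed forms for the chi-square upper tail: erfc(sqrt(q/2)) for df=1, exp(-q/2) for df=2, and an upward recurrence in steps of 2 for higher df.

```python
import math

def chi2_sf(q, df):
    """Upper-tail probability P(X > q) for a chi-square variable with integer df.

    Hypothetical helper for cross-checking; it sums only positive terms,
    so tiny tail probabilities are not lost to cancellation.
    """
    if q < 0 or df < 1 or df != int(df):
        raise ValueError("need q >= 0 and a positive integer df")
    if q == 0:
        return 1.0
    half = q / 2.0
    if df % 2:                       # odd df: start from df = 1
        k, sf = 1, math.erfc(math.sqrt(half))
    else:                            # even df: start from df = 2
        k, sf = 2, math.exp(-half)
    # Recurrence: sf(k + 2) = sf(k) + (q/2)^(k/2) * exp(-q/2) / Gamma(k/2 + 1)
    while k < df:
        log_term = (k / 2.0) * math.log(half) - half - math.lgamma(k / 2.0 + 1.0)
        sf += math.exp(log_term)
        k += 2
    return sf

# Same made-up (q, df) pairs as in the SAS example
for q, df in [(923.67, 1), (705.61, 2), (548.09, 3)]:
    print(f"q={q}  df={df}  p={chi2_sf(q, df):.1e}")
```

The printed p-values should agree with the SDF results to the two digits shown. Note that this sketch only works while the p-value stays within double-precision range (roughly above 1E-308).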
The purpose of the DATA step using the CARDS statement was just to create sample data to work with. Variable q contains the quantiles, i.e., the values q for which the p-value is the probability that a random variable with a chi-square distribution exceeds q. You called q "the chi-squared test value" in your initial post (if I understood correctly). Variable df contains the degrees of freedom. All these values were made up to obtain the first three p-values in your table.
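The "exceeds q" part of this definition is why an upper-tail function such as SDF is the natural tool here, rather than 1 minus a CDF. A small Python illustration (not from the original reply) uses df=2, where the chi-square upper tail has the simple closed form exp(-q/2):

```python
import math

q = 705.61          # made-up chi-square value from the example above, df = 2

# For df = 2 the chi-square CDF is 1 - exp(-q/2), so the upper tail is exp(-q/2).
cdf = 1.0 - math.exp(-q / 2.0)   # rounds to exactly 1.0 in double precision
naive_p = 1.0 - cdf              # "1 - CDF": all information about the tail is gone
direct_p = math.exp(-q / 2.0)    # upper tail computed directly

print(naive_p)                   # 0.0
print(f"{direct_p:.1e}")         # about 6.0e-154, as in the SDF result
```

Any p-value smaller than about 1E-16 is lost when it is formed as 1 minus a number close to 1, so the tail has to be computed directly.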
Normally a scientific paper contains a section describing the (statistical) methods used. There I would hope to find a hint as to exactly which "chi-squared test" was applied to the data in your table. It was not one of the usual chi-squared tests for 2x2 tables provided by PROC FREQ. Here is an example trying to replicate the results in the last row ("Gliclazide") of your table:
/* Create input data for a 2x2 table "MI(no/yes) x Gliclazide(no/yes) under Rosiglitazone" */
data test(drop=k);
input n;
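MI=mod(_n_,2);     /* rows 1 and 3 hold MI=1 counts; rows 2 and 4 hold column totals */
k=dif(n);          /* current value minus previous value: total minus MI=1 count */
if ~MI then n=k;   /* replace each column total with the MI=0 count */
Glic=_n_>2;        /* rows 1-2: Glic=0, rows 3-4: Glic=1 */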
cards;
19971
62409
34
655
;
/* Create 2x2 table, perform chi-square tests and compute odds ratio */
ods exclude FishersExact;
ods output chisq=cs;
proc freq data=test;
weight n;
tables MI*Glic / chisq or;
run;
/* Print p-values with greater precision */
proc print data=cs(obs=4);
format prob best12.;
run;
PROC FREQ output (0=no, 1=yes for variables MI and Glic):
Table of MI by Glic

MI        Glic

Frequency|
Percent  |
Row Pct  |
Col Pct  |       0|       1|  Total
---------+--------+--------+
       0 |  42438 |    621 |  43059
         |  67.29 |   0.98 |  68.28
         |  98.56 |   1.44 |
         |  68.00 |  94.81 |
---------+--------+--------+
       1 |  19971 |     34 |  20005
         |  31.67 |   0.05 |  31.72
         |  99.83 |   0.17 |
         |  32.00 |   5.19 |
---------+--------+--------+
Total       62409      655    63064
            98.96     1.04   100.00

Statistics for Table of MI by Glic

Statistic                     DF       Value      Prob
------------------------------------------------------
Chi-Square                     1    215.0999    <.0001
Likelihood Ratio Chi-Square    1    286.8605    <.0001
Continuity Adj. Chi-Square     1    213.8639    <.0001
Mantel-Haenszel Chi-Square     1    215.0965    <.0001
Phi Coefficient                      -0.0584
Contingency Coefficient               0.0583
Cramer's V                           -0.0584

Odds Ratio and Relative Risks

Statistic                       Value       95% Confidence Limits
------------------------------------------------------------------
Odds Ratio                     0.1163        0.0823        0.1644
Relative Risk (Column 1)       0.9873        0.9860        0.9885
Relative Risk (Column 2)       8.4857        6.0109       11.9793

Sample Size = 63064
PROC PRINT output:
Obs    Table              Statistic                      DF       Value            Prob
  1    Table MI * Glic    Chi-Square                      1    215.0999    1.059926E-48
  2    Table MI * Glic    Likelihood Ratio Chi-Square     1    286.8605    2.402392E-64
  3    Table MI * Glic    Continuity Adj. Chi-Square      1    213.8639    1.972021E-48
  4    Table MI * Glic    Mantel-Haenszel Chi-Square      1    215.0965    1.061743E-48
As you can see, the point and interval estimates of the odds ratio match the rounded values "0.12 (0.08-0.16)" in your table, but none of the four chi-square p-values would be rounded to 3.1E-34.
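The PROC FREQ figures can also be reproduced by hand from the four cell counts, which may help when comparing against whatever test the paper actually used. Below is a minimal Python sketch (not part of the original reply; variable names are illustrative). It applies the textbook Pearson chi-square formula for a 2x2 table and a Wald interval on the log odds ratio. The Wald interval is my assumption about how the limits were computed.

```python
import math

# Cell counts from the PROC FREQ table: rows MI = 0/1, columns Glic = 0/1
a, b = 42438, 621
c, d = 19971, 34
n = a + b + c + d

# Pearson chi-square for a 2x2 table (no continuity correction), df = 1
chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
# Upper tail for df = 1: P(X > chi2) = erfc(sqrt(chi2 / 2))
p = math.erfc(math.sqrt(chi2 / 2.0))

# Odds ratio with a 95% Wald interval on the log scale
odds_ratio = (a * d) / (b * c)
se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
z = 1.959964  # 97.5th percentile of the standard normal
lower = math.exp(math.log(odds_ratio) - z * se)
upper = math.exp(math.log(odds_ratio) + z * se)

print(f"chi2={chi2:.4f}  p={p:.6e}")
print(f"OR={odds_ratio:.4f}  95% CI=({lower:.4f}, {upper:.4f})")
```

If this runs as intended, the output should match the "Chi-Square" row and the "Odds Ratio" row of the PROC FREQ output.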
@kennychang wrote:
How can i get the SAS?
Do you mean how to get access to SAS software? I think SAS® OnDemand for Academics would be ideal for you.