p-value for proportions of variables btw included & excluded

Occasional Contributor
Posts: 13

Hi,

I am trying to calculate p-values for differences in both categorical AND continuous variables between two groups. I have included and excluded observations, and I want to test whether the proportions in each category of a variable are the same in the two groups, and also compare the mean (SD) of continuous measures between included and excluded.

To illustrate, here is my Table 1. I am trying to calculate the p-value for each row.

Thank you!

Screen shot 2013-07-10 at 9.33.06 PM.png

Trusted Advisor
Posts: 1,137

Re: p-value for proportions of variables btw included & excluded

Posted in reply to joebaker7

Hi,

It is essential to know which statistical test you want to use to get the p-values for the different categories. For example, if you want p-values from the chi-square test or Fisher's exact test, you can use the following code.

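This is for sex:

proc freq data = dataset;
  /* keep non-missing sex and the two groups being compared */
  where sex ne '' and trt in ("included", "excluded");
  /* chisq: Pearson chi-square; exact: Fisher's exact test */
  tables sex * trt / chisq exact;
  /* distinct name so later runs do not overwrite this output */
  output out = pvalue_sex chisq exact;
run;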

This is for race:

proc freq data = dataset;
  where race ne '' and trt in ("included", "excluded");
  tables race * trt / chisq exact;
  /* distinct name so later runs do not overwrite this output */
  output out = pvalue_race chisq exact;
run;


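This is for age:

proc freq data = dataset;
  /* assumes age is a character variable; for a numeric one use: age ne . */
  where age ne '' and trt in ("included", "excluded");
  tables age * trt / chisq exact;
  /* distinct name so later runs do not overwrite this output */
  output out = pvalue_age chisq exact;
run;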


You can apply the same approach to the other categorical variables to get their p-values.
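Rather than copying the step once per variable, the repetition can be wrapped in a macro. This is only a minimal sketch: the macro name freq_pvalue and the output names pvalue_<variable> are made up for illustration, and it assumes the same dataset and trt values as the code above.

```sas
/* Hypothetical helper: run the same test for one categorical variable. */
%macro freq_pvalue(var);
  proc freq data = dataset;
    /* missing() accepts both character and numeric variables */
    where not missing(&var) and trt in ("included", "excluded");
    tables &var * trt / chisq exact;
    /* one output dataset per variable, e.g. pvalue_sex */
    output out = pvalue_&var chisq exact;
  run;
%mend freq_pvalue;

%freq_pvalue(sex)
%freq_pvalue(race)
%freq_pvalue(age)
```

In each output dataset the chi-square p-value should be in the variable P_PCHI. Running PROC CONTENTS on one of the outputs is an easy way to confirm which statistics were written.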

Hope this helps you.


Thanks,

Jagadish

Occasional Contributor
Posts: 13

Re: p-value for proportions of variables btw included & excluded

Posted in reply to Jagadishkatam

Thanks much, Jagadish.

This is very helpful.

I am trying to get a chi-square test, but how do I do so for EACH CATEGORY within a variable?

For instance, if I want a chi-square for each age group (70+, 60-69, 50-59, etc.) between included and excluded, is there a way to do so? It seems the code above only gives me the chi-square for the overall difference between included and excluded for the variable age, rather than one for each age category.

Trusted Advisor
Posts: 1,137

Re: p-value for proportions of variables btw included & excluded

Posted in reply to joebaker7

Yes, it's possible. Use the BY statement with the age group as the BY variable, so that PROC FREQ gives p-values for each group within age.

proc freq data = dataset;
  where age ne '' and trt in ("included", "excluded");
  by age;
  table age*trt / exact;
  output out = pvalue exact;
run;

Thanks,

Jagadish

Occasional Contributor
Posts: 13

Re: p-value for proportions of variables btw included & excluded

Posted in reply to Jagadishkatam

Ok, now I only get a 2x2 table with no exact test.  Here is my code:

proc freq data = joe.extrainclude;
  where age10yrs ne . and include in (0,1);
  table age10yrs* include / exact;
  output out = pvalue exact;
  by age10yrs;
run;

Here the variable "include" is coded 0 = observations to include and 1 = observations to exclude, but my output doesn't make any sense.

Am I running something wrong?

Screen shot 2013-07-10 at 11.29.05 PM.png

Occasional Contributor
Posts: 13

Re: p-value for proportions of variables btw included & excluded

Posted in reply to joebaker7

And this is the error message I get in my log:

Data set JOE.EXTRAINCLUDE is not sorted in ascending sequence. The current BY group has age10yrs = 4 and the next BY group has age10yrs = 1.

Trusted Advisor
Posts: 1,137

Re: p-value for proportions of variables btw included & excluded

Posted in reply to joebaker7

This is because the dataset must be sorted by the BY variable with PROC SORT before it is passed to PROC FREQ, like this:

proc sort data = joe.extrainclude;
  by age10yrs;
run;

Try this and let me know whether it works for you.

Thanks,

Jagadish

Occasional Contributor
Posts: 13

Re: p-value for proportions of variables btw included & excluded

Posted in reply to Jagadishkatam

No, now I get a different message, and I still cannot get the exact test because each category within the age10yrs variable is computed separately.

Screen shot 2013-07-10 at 11.47.19 PM.png

Occasional Contributor
Posts: 13

Re: p-value for proportions of variables btw included & excluded

Posted in reply to joebaker7

And my output looks like this, for each category of the age10yrs variable, with no exact test ever done:

Screen shot 2013-07-10 at 11.49.03 PM.png

Respected Advisor
Posts: 2,655

Re: p-value for proportions of variables btw included & excluded

Posted in reply to joebaker7

It looks like you need binomial tests.  Try:

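proc freq data = dataset;
  where age ne '' and trt in ("included", "excluded");
  by age;                    /* data must already be sorted by age */
  tables trt / binomial;     /* one-way table of trt within each age group */
  exact binomial;            /* exact test for the binomial proportion */
  output out = pvalue binomial;
run;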
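Adapted to the dataset and variable names posted earlier in the thread (joe.extrainclude, age10yrs, include), the same idea might look like the sketch below. The dataset names work.extrainclude_sorted and pvalue_binomial are made up for illustration, and sorting into a copy with OUT= is just one way to leave the permanent dataset untouched.

```sas
/* Sort a filtered copy so the BY statement below is valid. */
proc sort data = joe.extrainclude out = work.extrainclude_sorted;
  where age10yrs ne . and include in (0, 1);
  by age10yrs;
run;

/* One binomial test of include within each age10yrs group. */
proc freq data = work.extrainclude_sorted;
  by age10yrs;
  tables include / binomial;
  exact binomial;
  output out = pvalue_binomial binomial;
run;
```

By default the BINOMIAL option tests whether the proportion in the first level of include (here 0) equals 0.5 within each age group. The P= and LEVEL= suboptions of BINOMIAL change the null proportion and the level being tested.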

Steve Denham
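The original question also asked about comparing the mean (SD) of continuous measures between included and excluded, which the replies above do not cover. One common option is a two-sample t-test with PROC TTEST. A minimal sketch, assuming the same dataset and trt values as in the earlier replies and a hypothetical numeric variable named bmi:

```sas
proc ttest data = dataset;
  where trt in ("included", "excluded");
  class trt;                        /* the two groups to compare */
  var bmi;                          /* hypothetical continuous variable */
  ods output TTests = ttest_pvalue; /* hypothetical output dataset name */
run;
```

The ttest_pvalue dataset should contain one row for the pooled test and one for the Satterthwaite (unequal variances) test, with the p-value in the Probt column. Whether a t-test is appropriate depends on the distribution of the measure.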
