Hi,
I am trying to calculate p-values for differences in categorical AND continuous variables between two groups. I have included and excluded observations, and I want to test whether the proportions in each category of a variable are the same in both groups, as well as compare the mean (SD) of continuous measures between included and excluded.
To illustrate, here is my Table 1. I am trying to calculate a p-value for each row.
Thank you!
Hi,
It is important to know which statistical test you want to use to get the p-values for the different categories. If you want p-values from a chi-square or Fisher's exact test, you can use the following code.
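This is for sex:
/* EXACT requests Fisher's exact test; the p-values are saved in pvalue_sex */
proc freq data = dataset;
  where sex ne '' and trt in ("included","excluded");
  tables sex*trt / exact;
  output out = pvalue_sex exact;
run;
This is for race:
/* A separate output data set keeps this result from overwriting the one for sex */
proc freq data = dataset;
  where race ne '' and trt in ("included","excluded");
  tables race*trt / exact;
  output out = pvalue_race exact;
run;
This is for age:
/* Assumes age is a character age-group variable; for a numeric variable use "ne ." */
proc freq data = dataset;
  where age ne '' and trt in ("included","excluded");
  tables age*trt / exact;
  output out = pvalue_age exact;
run;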
You can apply the same approach to the other categorical variables to get their p-values.
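As a rough sketch of how the same idea could cover several variables in one step, the crosstabs can be requested together and the results captured with ODS OUTPUT. The data set and variable names are the placeholders from the code above, and chisq_all and fisher_all are hypothetical output names. PROC FREQ leaves missing values out of each table by default, so no per-variable filter is used here.

```sas
/* Sketch only: dataset, trt, sex, race, and age are the placeholder names used above. */
/* chisq_all and fisher_all are hypothetical names for the captured result tables.     */
ods output ChiSq=chisq_all FishersExact=fisher_all;

proc freq data = dataset;
  where trt in ("included","excluded");
  /* One crosstab per variable against trt; CHISQ adds the chi-square tests, */
  /* EXACT adds Fisher's exact test for tables larger than 2x2.              */
  tables (sex race age)*trt / chisq exact;
run;
```

Each captured data set has a Table column that identifies the crosstab a row belongs to. Exact tests on large tables can be slow, so EXACT may be worth dropping when every cell count is large.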
Hope this helps you.
Thanks,
Jagadish
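For the continuous measures mentioned in the original question, one possible sketch uses PROC TTEST with the group as the CLASS variable. Here bmi is a hypothetical continuous variable, dataset and trt are the placeholder names used above, and the two output data set names are also hypothetical.

```sas
/* Sketch only: bmi is a hypothetical continuous variable.            */
/* ttest_pvalues and var_equality are hypothetical output data sets.  */
ods output TTests=ttest_pvalues Equality=var_equality;

proc ttest data = dataset;
  where trt in ("included","excluded");
  class trt;   /* the two groups being compared */
  var bmi;     /* the continuous measure        */
run;
```

The TTests table holds both the pooled and the Satterthwaite p-values, and the Equality table (the folded F test for equal variances) can help in choosing between them.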
Thanks much, Jagadish.
This is very helpful.
I am trying to get a chi-square test, but how do I do so for EACH CATEGORY within a variable?
For instance, if I want to get chi-square for each age group ( 70+, 60-69, 50-59, etc.), between included & excluded, is there a way to do so? It seems the code above just gets me chi-square for overall difference between included and excluded for the variable age instead of each age category.
Yes, it's possible. Please use the BY statement with the age group as the BY variable, so that it gives p-values for each age group.
proc freq data = dataset;
where age ne '' and trt in ("included","excluded");
by age;
table age*trt / exact;
output out = pvalue exact;
run;
Thanks,
Jagadish
Ok, now I only get a 2x2 table with no exact test. Here is my code:
proc freq data = joe.extrainclude;
where age10yrs ne . and include in (0,1);
table age10yrs* include / exact;
output out = pvalue exact;
by age10yrs;
run;
Here the variable "include" is coded 0 = observations to include and 1 = observations to exclude, but my output doesn't make any sense.
Am I running something wrong?
This is the error message I get in my log:
Data set JOE.EXTRAINCLUDE is not sorted in ascending sequence. The current BY group has age10yrs = 4 and the next BY group
has age10yrs = 1.
This happens because a BY statement requires the data set to be sorted by the BY variable. Sort it with PROC SORT before passing it to PROC FREQ, like this:
proc sort data = joe.extrainclude;
by age10yrs;
run;
Try this and let me know if this has worked for you.
Thanks,
Jagadish
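A variation on the sort above, as a minimal sketch: writing the sorted rows to a WORK copy leaves the permanent data set untouched. The name extrainclude_sorted is hypothetical, and the one-way table is only there to confirm that BY-group processing runs without the "not sorted" error.

```sas
/* Sort into a temporary copy; extrainclude_sorted is a hypothetical name */
proc sort data = joe.extrainclude out = work.extrainclude_sorted;
  by age10yrs;
run;

/* BY-group processing on the sorted copy: one frequency table of include per age group */
proc freq data = work.extrainclude_sorted;
  where age10yrs ne . and include in (0,1);
  by age10yrs;
  tables include;
run;
```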
No, now I get a different message, and I still cannot get the exact test because each category within age10yrs variable is computed separately.
And my output looks like this, for each category of the age10yrs variable, with no exact test ever done:
It looks like you need binomial tests. Try:
proc freq data = dataset;
where age ne '' and trt in ("included","excluded");
by age;
table trt / binomial;
exact binomial;
output out = pvalue binomial;
run;
Steve Denham
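Some background on why the BY approach gave no test: within a single BY group, age10yrs has only one value, so the age10yrs*include table has a single row and no test of association can be computed. The binomial test above instead tests, within each age group, whether the proportion in one level of the group variable equals 0.5 (the default null value). A different option, sketched below under stated assumptions, compares each age category against all other categories in a 2x2 table. The codes 1 to 4 are an assumption (only 1 and 4 appear in the log above), and age_flags, age_1 to age_4, chisq_by_age, and fisher_by_age are hypothetical names.

```sas
/* Sketch only: assumes age10yrs takes the integer codes 1 through 4; adjust to the real codes. */
data work.age_flags;
  set joe.extrainclude;
  where age10yrs ne . and include in (0,1);
  array flag{4} age_1-age_4;      /* hypothetical 0/1 indicators, one per age category */
  do i = 1 to 4;
    flag{i} = (age10yrs = i);     /* 1 if the row is in category i, otherwise 0 */
  end;
  drop i;
run;

/* One 2x2 table per category: this category vs. all others, by include */
ods output ChiSq=work.chisq_by_age FishersExact=work.fisher_by_age;
proc freq data = work.age_flags;
  tables (age_1 age_2 age_3 age_4)*include / chisq;
run;
```

For 2x2 tables the CHISQ option also reports Fisher's exact test. Because this runs one test per category, the p-values may call for a multiple-comparison adjustment.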