Greetings to everyone reading this post,
Consider a simple univariate analysis where we are looking at two variables: anemia and "treatment".
The table I would like to fill looks like the one below:
Anemia Grades | Total | 'Treatment A' | 'Treatment B' | p-value |
Grade 1 | | | | |
Grade 2 | | | | |
Grade 3 | | | | |
Grade 4 | | | | |
I am struggling to determine how to compare the counts between 'Treatment A' and 'Treatment B' within each Grade category to generate a p-value in this simple analysis. I do not need specific code (I prefer struggling through it), but some finger-pointing towards the correct way to do this would be SUPER appreciated.
Thanks!
Charles Jia
I do not need specific code (I prefer struggling through it), but some finger-pointing towards the correct way to do this would be SUPER appreciated.
I like your spirit of discovery. If you just want a univariate binomial analysis, you could try this one. You first need to state a null hypothesis H0: p = p0 (PROC FREQ tests p = 0.5 by default).
The key code of analysis is:
proc freq data=have; table sex/binomial(level='Female'); run;
Here is the full code I suggest. It builds a report with one row per group showing the total, the count in each level, and the two-sided binomial p-value. Good luck!
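/* Sample data: smoking_status plays the role of the grade, sex the role of the treatment */
data have;
  set sashelp.heart(obs=200);
  keep smoking_status sex;
run;
proc sort data=have; by smoking_status; run;

/* Suppress printed output while capturing the binomial test results */
ods select none;
ods output BinomialTest=BinomialTest;
proc freq data=have;
  by smoking_status;
  table sex / binomial(level='Female');
run;

/* Counts for each smoking_status by sex combination */
proc freq data=have noprint;
  table smoking_status*sex / out=freq list;
run;
ods select all;

/* Keep only the two-sided p-value for each group */
data BinomialTest2;
  set BinomialTest;
  if Name1='P2_BIN';
  keep smoking_status cValue1;
run;

/* Attach the p-value to the counts */
data report1;
  merge freq BinomialTest2;
  by smoking_status;
run;

/* Add the total count per group */
proc sql;
  create table report2 as
  select *, sum(count) as ntotal from report1 group by smoking_status;
quit;

/* One row per group, one column per level of sex */
proc report data=report2 nowd;
  columns smoking_status ntotal count,sex cValue1;
  define smoking_status / group;
  define ntotal / group 'Total';
  define cValue1 / group 'P-Value';
  define sex / across 'Treatment';
  define count / analysis '';
run;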
To compare levels of your treatment variable, you could use the approach shown in this note that uses multiple PROC FREQ steps and then adjusts the p-values for multiplicity. Or you could use the model-based approach with a generalized logit model followed by the NLMeans macro to do the treatment comparisons as illustrated in this note - to estimate differences instead of ratios (relative risks), drop options=ratio in the NLMeans macro call.
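As a rough illustration of the first approach (multiple PROC FREQ steps followed by a multiplicity adjustment), here is a minimal sketch. It is not the code from the note. It assumes a hypothetical data set named have with one row per patient, a numeric variable grade (1 to 4), and a treatment variable trt ('A' or 'B'); the names long, is_grade, chisq, and pvals are also made up for the example. Each grade is tested as "this grade versus all other grades" by treatment, and the four raw p-values are then adjusted with PROC MULTTEST.

```sas
/* Assumed input: HAVE with variables GRADE (1-4) and TRT ('A'/'B'). */
/* Stack four copies of the data, one per grade, each with a 0/1 flag. */
data long;
  set have;
  do g = 1 to 4;
    is_grade = (grade = g);  /* 1 if the patient is in grade g, else 0 */
    output;
  end;
run;

proc sort data=long; by g; run;

/* One 2x2 table (treatment by flag) and chi-square test per grade */
ods output ChiSq=chisq;
proc freq data=long;
  by g;
  tables trt*is_grade / chisq;
run;

/* Keep the Pearson chi-square p-value for each grade */
data pvals;
  set chisq;
  where Statistic = 'Chi-Square';
  raw_p = Prob;
  keep g raw_p;
run;

/* Adjust the four raw p-values for multiplicity */
proc multtest inpvalues(raw_p)=pvals bonferroni holm;
run;
```

With sparse cells, the Pearson chi-square p-value may not be reliable, and Fisher's exact test (which PROC FREQ also reports for 2x2 tables when CHISQ is specified) would be the more cautious choice. The choice of Bonferroni and Holm here is only an example; PROC MULTTEST offers other adjustments.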