Dear Community Members,
I am interested in the statistical significance of the difference in means of two series. Below is how my data looks:
wday  trader_id  x     y
0     1          0.23  .
0     2          0.46  0.11
0     3          .     0.17
...
0     140        0.98  0.87
1     1          0.12  0.04
1     4          0.00  0.47
...
where wday ranges from 0 to 252. I want to calculate the mean of X and Y for each wday, and then test if the difference between the mean of X and mean of Y is statistically significant.
The final output should look like the following:
wday (mean_x) (mean_y) (mean_x-mean_y) t(mean_x-mean_y) p(mean_x-mean_y)
0
1
2
.
.
252
I know how to calculate mean_x, mean_y, t-statistics and p-values via proc means, but I could not figure out how to get the same statistics (t and p) for the difference in means via proc ttest. I can code the formula in SAS, but I am wondering if there is already a built-in function to do this (a simple formula is given on page two of: http://noether.uoregon.edu/~dps/243/Notes/notes20.pdf). It is important to keep in mind that x and y have different numbers of observations for most values of wday, and theoretically I am not allowed to take the difference between X and Y if one of them is missing (i.e. I cannot use x-y or sum(x,-y)).
Thank you for your time and help in advance.
Best
I think you need to manipulate the results of Proc Ttest in order to achieve the desired output...
Proc sort data = test;
by wday;
run;
proc means data = test noprint;
var x y;
by wday;
/* AUTONAME names the output columns x_Mean and y_Mean */
output out = stat_mean mean(x y) = / autoname;
run;
/* One-sample t tests of X and Y separately (each mean is tested against 0) */
proc ttest data = test;
var x y;
by wday;
run;
/* If your samples, that is X and Y, are related (paired) */
/* Based on the output you want, you need to use this Proc Ttest with the PAIRED statement */
/* ODS OUTPUT is placed here so that the saved tables come from the paired test */
ods output ttests = t_stat;
ods output statistics = other_stat;
proc ttest data = test;
paired x * y;
by wday;
run;
ods output close;
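/* If your X and Y are related */
proc sql;
create table final_t_test as
select a.wday, x_mean, y_mean,
b.mean as mean_x_mean_y,
c.tvalue as t_mean_x_mean_y,
c.probt as p_mean_x_mean_y
from stat_mean as a,
other_stat as b,
t_stat as c
/* Join on wday explicitly; a chained a = b = c comparison does not do this */
where a.wday = b.wday and b.wday = c.wday;
quit;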
PROC TTEST has a PAIRED statement and BY statement that work for the situation you describe.
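One caveat: PAIRED only uses rows where both x and y are present, so its mean difference can differ from mean_x minus mean_y when the two series have different numbers of observations. If the series are to be treated as independent samples, a possible alternative is to stack x and y into one column and use the CLASS statement. This is only a sketch under that assumption; the names long, series, value, stat_unpaired and t_unpaired are made up for illustration and do not come from the original post.

```sas
/* Sketch only: LONG, SERIES and VALUE are hypothetical names */
/* Stack X and Y into one column, keeping every non-missing value */
data long;
  set test;
  length series $1;
  if not missing(x) then do; series = 'x'; value = x; output; end;
  if not missing(y) then do; series = 'y'; value = y; output; end;
  keep wday trader_id series value;
run;

proc sort data = long;
  by wday;
run;

/* Two-sample (unpaired) t test of mean X against mean Y within each wday */
ods output statistics = stat_unpaired ttests = t_unpaired;
proc ttest data = long;
  class series;
  var value;
  by wday;
run;
```

The TTests table should then hold one row per wday and method (Pooled and Satterthwaite) with tValue, DF and Probt, and the Statistics table should hold the mean of each series plus their difference (x minus y). A wday where one series is entirely missing has only one class level and cannot be tested this way.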
Thank you for your help. I very much appreciated it.
I should say thanks to you, as I got the opportunity to solve your query in SAS/STAT...
By the way, it was nice to have this discussion with you...
I have a dataset named chandu that contains the variables x, y, pro and v, with data such as:
2.5 4.2 dry 2
--------------
What is the output from var x-v and from var x--v?
How do I get these two variable lists?