Dear Community Members,
I am interested in the statistical significance of the difference in means of two series. Below is how my data looks:
wday trader_id x y
0 1 0.23 .
0 2 0.46 0.11
0 3 . 0.17
.
.
0 140 0.98 0.87
1 1 0.12 0.04
1 4 0.00 0.47
.
.
.
where wday ranges from 0 to 252. I want to calculate the mean of X and Y for each wday, and then test if the difference between the mean of X and mean of Y is statistically significant.
The final output should look like the following:
wday (mean_x) (mean_y) (mean_x-mean_y) t(mean_x-mean_y) p(mean_x-mean_y)
0
1
2
.
.
252
I know how to calculate mean_x, mean_y, and their t-statistics and p-values via proc means, but I could not figure out how to get the same statistics (t and p) for the difference in means via proc ttest. I can code the formula in SAS myself, but I am wondering whether there is already a built-in function to do this (a simple formula is given on page two of: http://noether.uoregon.edu/~dps/243/Notes/notes20.pdf). It is important to keep in mind that x and y have a different number of observations for most values of wday, and theoretically I am not allowed to take the difference between X and Y if one of them is missing (i.e. I cannot use x-y or sum(x,-y)).
Thank you for your time and help in advance.
Best
I think you need to manipulate the results of PROC TTEST in order to achieve the desired output...
proc sort data = test;
by wday;
run;
/* Per-day means; AUTONAME names the output variables x_Mean and y_Mean */
proc means data = test noprint;
var x y;
by wday;
output out = means mean = / autoname;
run;
/* VAR alone runs a separate one-sample t-test on X and on Y (each mean against 0); */
/* it does not compare X with Y */
proc ttest data = test;
var x y;
by wday;
run;
/* If your samples, that is X and Y, are related */
/* Based on the output you want, you need to use this PROC TTEST with the PAIRED statement */
/* PAIRED uses only the rows where both X and Y are non-missing */
/* ODS OUTPUT applies to the next procedure only, so it goes directly before this step */
ods output ttests = t_stat statistics = other_stat;
proc ttest data = test;
paired x * y;
by wday;
run;
/* If your X and Y are related: combine the means, the mean difference, t and p by wday */
proc sql;
create table final_t_test as
select a.wday, a.x_mean, a.y_mean,
b.mean as mean_x_mean_y,
c.tvalue as t_mean_x_mean_y,
c.probt as p_mean_x_mean_y
from means as a,
other_stat as b,
t_stat as c
where a.wday = b.wday and b.wday = c.wday;
quit;
PROC TTEST has a PAIRED statement and BY statement that work for the situation you describe.
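If X and Y should instead be treated as two independent samples with different numbers of observations (which seems to be what the formula in the linked notes assumes, though that is a guess), PROC TTEST needs the data stacked in one column with a CLASS variable. Below is a minimal sketch of that approach, not a definitive solution. The dataset and variable names long, series, value, t_ind and stat_ind are made up for illustration; test, wday, trader_id, x and y come from the posts above.

```sas
/* Stack X and Y into a single column with a group label. */
/* Rows are written only for non-missing values, so X and Y */
/* may have different numbers of observations per wday. */
data long;
    set test;
    length series $1;
    if not missing(x) then do;
        series = 'x';
        value = x;
        output;
    end;
    if not missing(y) then do;
        series = 'y';
        value = y;
        output;
    end;
    keep wday trader_id series value;
run;

proc sort data = long;
    by wday series;
run;

/* Two-sample t-test of X versus Y within each wday. */
/* t_ind receives the t value, DF and p-value for each method; */
/* stat_ind receives N, mean and standard deviation per group. */
ods output ttests = t_ind statistics = stat_ind;
proc ttest data = long;
    by wday;
    class series;
    var value;
run;
```

In t_ind, each wday has a Pooled row (equal variances) and a Satterthwaite row (unequal variances), with the columns tValue, DF and Probt. The Satterthwaite t value is (mean_x - mean_y) / sqrt(s_x**2/n_x + s_y**2/n_y). A wday in which one of the two series is entirely missing has only one class level, so no comparison can be computed for that BY group.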
Thank you for your help. I very much appreciate it.
I should say thanks to you, as I got the opportunity to solve your query in SAS/STAT...
By the way, it was nice to have this discussion with you...
I have a dataset named chandu that contains the variables x, y, pro and v, with data such as:
2.5 4.2 dry 2
--------------
What output do I get from var x-v and from var x--v?
How do these two forms of variable list differ?