The documentation (web-site and PDF page 360) says that PROC SQL supports STD and STDERR, and it doesn't mention an exception for self-joins.
PROC SQL gives the right answer for STD and STDERR when using in a simple query for periods 1-6 without a self-join (see my post above).
This might be a question for technical support because PROC SQL seems to be producing results that are different from what documentation says.
You can try playing with the VARDEF options to see what you get, but I agree, this is for tech support now.
Please post back the results - I'm curious....and a check is to add an N counter to the data. It's possible the merge isn't happening the way I think it is, but it doesn't affect the mean but does for the distribution. You can verify this by looking at the N, number of observations it's using to do the calculation.
I'd bet on the latter being incorrect 😞
EDIT: Yeah, it ends up being a many to many join, which is why the STD and STDERR is off, but the mean is ok. My answer is not correct. You probably need an inline query, but those are really inefficient.
Here's a macro solution with proc means.
proc sql noprint;
select min(period) into :min_period TRIMMED
from have;
select max(period) into :max_period TRIMMED
from have;
quit;
%put &min_period;
%put &max_period;
options nomprint nosymbolgen;
%macro moving_summary(datain=, window=, dataout=);
*remove previous summary table;
proc sql;
drop table &dataout;
quit;
%do i=&min_period. %to %eval(&max_period.-&window.);
proc means data=&datain noprint;
var x;
where period between &i and %eval(&i. + &window. - 1);
output out=_temp mean=mean_x std=std_x stderr=stderr_x;
run;
data _temp;
set _temp;
start = &i;
end = &i + &window - 1;
run;
proc append base=&dataout data=_temp;
run;
proc sql;
drop table _temp;
quit;
%end;
%mend;
%moving_summary(datain=have, window=6, dataout=summary_stats);
Thank you, Reeza
The macro produces the correct standard deviations and standard errors
@agoldma wrote:
opened track # 7612217075 with SAS Tech Support to look at the inaccurate calculation of STD and STDERR
The error is what I said it was, the join ends up being a many to many so it's not the correct join. How SQL calculates the STD and STDERR are correct, check the N that it returns the number of records per each period. That's the clue that the join is wrong. This is an incorrect understanding on my part, nothing wrong with how PROC SQL is working, ie user error.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.
Find more tutorials on the SAS Users YouTube channel.