There is a lot of good information here, but one piece is missing. Yes, you need the within-group information (the error sum of squares) contained in the group standard deviations, but to calculate the overall sample standard deviation you must also account for the variability of the group means about the overall mean.
You can write a lot of DATA step and SAS procedure code to answer this question, but SQL can get you there fairly succinctly. Try the example below. The initial DATA step generates sample data with 5 groups and a varying number of observations per group. The first PROC MEANS step generates the data set described by the original poster: one with just the group n, mean, and standard deviation. The second PROC MEANS step gives the mean and standard deviation for the overall sample. It is this mean and standard deviation that we wish to replicate from the information in the data set GroupStats (the output data set from the first PROC MEANS).
The PROC SQL step gives us that result. Most people know SQL for joins, but it also works well for the type of calculation we need here. Computing the overall standard deviation requires summary statistics at several different levels (per group and across all groups), and that can be done in a single PROC SQL step. The query mixes summary functions with detail columns, so PROC SQL remerges the summary statistics with the group rows and writes a NOTE to the log saying so.
There are several sources on the web that explain the derivation of this calculation, including some YouTube videos. See those for the reasoning behind the SQL step below.
data test;
  call streaminit(613254);
  do group = 1 to 5;
    do rep = 1 to (5 + group);
      y = (group*10) + rand('normal');
      output;
    end;
  end;
run;
proc means data=test;
  by group;
  var y;
  output out=GroupStats n=n mean=mean stddev=std;
run;
proc means data=test mean stddev;
  var y;
run;
proc sql;
  create table SQLFinalStats as
  select distinct
         sum_n,
         OverallMean,
         sqrt(sum(TSS)/(sum(n)-1)) as OverallStd
  from (
    select n,
           sum(n) as sum_n,
           sum(n*mean)/sum(n) as OverallMean,
           std*std*(n-1) + n*(mean-(sum(n*mean)/sum(n)))**2 as TSS
    from GroupStats
  );
quit;
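For anyone who wants the arithmetic behind the PROC SQL step spelled out, it can be written roughly as follows. The notation is my own shorthand, not taken from any of the sources mentioned above: there are k groups, and n_i, \bar{y}_i, and s_i stand for the columns n, mean, and std of group i in GroupStats.

```latex
\begin{aligned}
N &= \sum_{i=1}^{k} n_i
  && \text{(sum\_n)} \\
\bar{y} &= \frac{1}{N}\sum_{i=1}^{k} n_i\,\bar{y}_i
  && \text{(OverallMean)} \\
\mathrm{TSS} &= \sum_{i=1}^{k}\Bigl[(n_i - 1)\,s_i^{2} + n_i\,(\bar{y}_i - \bar{y})^{2}\Bigr]
  && \text{(sum of the per-group TSS column)} \\
s &= \sqrt{\frac{\mathrm{TSS}}{N - 1}}
  && \text{(OverallStd)}
\end{aligned}
```

The first term in the brackets is the within-group sum of squares recovered from each group's standard deviation. The second term is the piece that is easy to miss: the spread of the group means about the overall mean, weighted by group size.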