Is it possible to get just the single column average of a dataset output into a separate variable without using Proc Means?. I need to obtain the average of the OBP column in the attached spreadsheet( see Column G highlighted with an average of 0.337829493 in cell Q10) and then append it to the end of that same dataset. It can be done in Excel successfully but when I average it in SAS, I get the average for each individual player in the data set and just need that one combined OBP average of 0.337829493 as a variable to append it to the original data set. My code is listed below. Also what would be best to use to append it, Would I just use PROC APPEND? Thanks in advance for any help with this
PROC IMPORT DBMS=xlsx OUT=more_rate replace
DATAFILE="/folders/myfolders/sasuser.v94/baseball_more_rate.xlsx";
GETNAMES=YES;
RUN;
data more_avg2;
set work.more_rate;
TotalOBP=sum(OBP);
AverageOBP=TotalOBP/217;
drop TotalOBP;
run;
PROC IMPORT DBMS=xlsx OUT=more_rate replace
DATAFILE="/folders/myfolders/sasuser.v94/baseball_more_rate.xlsx";
GETNAMES=YES;
RUN;
data more_avg2;
set work.more_rate;
TotalOBP=sum(OBP);
AverageOBP=TotalOBP/217;
drop TotalOBP;
run;
PROC IMPORT DBMS=xlsx OUT=more_rate replace
DATAFILE="/folders/myfolders/sasuser.v94/baseball_more_rate.xlsx";
GETNAMES=YES;
RUN;
data more_avg2;
set work.more_rate;
TotalOBP=sum(OBP);
AverageOBP=TotalOBP/217;
drop TotalOBP;
run;
PROC IMPORT DBMS=xlsx OUT=more_rate replace
DATAFILE="/folders/myfolders/sasuser.v94/baseball_more_rate.xlsx";
GETNAMES=YES;
RUN;
data more_avg2;
set work.more_rate;
TotalOBP=sum(OBP);
AverageOBP=TotalOBP/217;
drop TotalOBP;
run;
To get the average appended to every observation, you will need to go through the data twice ... once to compute the average, then a second time to append the value to each observation. Here's a way to do that:
data more_avg2;
do until (done);
set more_rate end=done;
if OBP > . then do;
numerator + obp;
denominator + 1;
end;
end;
averageOBP = numerator / denominator;
drop numerator denominator;
do until (done2);
set more_rate end=done2;
output;
end;
run;
or this way:
proc sql noprint;
select name into :namelist separated by ','
from dictionary.columns
where libname='WORK' and memname='MORE_RATE'
;
quit;
data more_avg2(drop=_numerator _denominator);
length _numerator _denominator AverageOBP 8;
retain _numerator _denominator;
set work.more_rate end=last;
if not missing(OBP) then
do;
_numerator=sum(_numerator,OBP);
_denominator+1;
end;
output;
if last then
do;
call missing(&namelist);
if not missing(_denominator) then AverageOBP=_numerator/_denominator;
output;
end;
run;
What are you asking for? What do you mean by "append a variable"?
Do you want to make a new variable? Then should it be the same value on every observation? If not then what.
Typically in SAS when you talk about appending you mean adding observations to an existing dataset. If you added an observation with the mean of a variable which variable would you use to store the mean? The same one? But then want values should all of the other variables for that new observation have? And then how the heck would you use it? Now it will mess up any analysis you want to do with your current observations.
SAS datasets are not spreadsheet where you can paste values into random cells. Why not just keep the mean in its own dataset.
So let's make a new dataset called MEANS that has the mean in a variable named Q10.
proc summary data=HAVE nway ;
var OBP;
output out=means mean=Q10 ;
run;
Then if you wanted to do calculations that used OBP and Q10 you could just combine
data analysis;
set have ;
if _n_=1 then set means;
run;
@philjones820 wrote:
Is it possible to get just the single column average of a dataset output into a separate variable without using Proc Means?
PROC IMPORT DBMS=xlsx OUT=more_rate replace
DATAFILE="/folders/myfolders/sasuser.v94/baseball_more_rate.xlsx";
GETNAMES=YES;
RUN;
data more_avg2;
set work.more_rate;
TotalOBP=sum(OBP);
AverageOBP=TotalOBP/217;
drop TotalOBP;
run;
Certainly, an invalid restriction. Why not use PROC MEANS? It is one of the fundamental SAS procedures that everyone using SAS should know.
proc means data=more_avg noprint;
var obp;
output out=_stats_ mean=averageobp;
run;
data want;
if _n_=1 then set _stats_;
set more_avg;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.