DATA Step, Macro, Functions and more

dividing summary statistic by noobs in proc summary

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 110
Accepted Solution

dividing summary statistic by noobs in proc summary

Using this code:

 

 

DATA WORK.TestDataSet;
   INPUT Uid Col1 Col2 Target $;
   DATALINES;
	1 5.1 3.5 13.12345
	2 4.9 3.15 2.1345
;

DATA WORK.TestDataSet;
    SET WORK.TestDataSet;
    MSE = (Target - ((Col1 + Col2) + exp(2.345)))**2;
RUN;

proc summary data = WORK.TestDataSet;
 var MSE;
 output out = WORK.ProcSumOut sum=;
run;

proc print data = WORK.ProcSumOut;
run; 

 

I would like to divide the 'computed column' by noobs (the number of rows in WORK.TestDataSet). Is this possible? The result should be in WORK.ProcSumOut.


Accepted Solutions
Solution
‎12-19-2016 09:32 AM
Super User
Posts: 19,822

Re: dividing summary statistic by noobs in proc summary

Posted in reply to csetzkorn

 Unless you have missing values the average is the sum divided by the number of observations so you can calculate your statistics by specifying the mean option

View solution in original post


All Replies
Trusted Advisor
Posts: 1,137

Re: dividing summary statistic by noobs in proc summary

Posted in reply to csetzkorn
I think we cannot divide the computed column in the proc summary. However in a new data step it is possible.

First we need to get the n from proc summary into the WORK.ProcSumOut. Then in the new data step we need to divide the sum with n.

proc summary data = WORK.TestDataSet ;
var MSE;
output out = WORK.ProcSumOut n= sum=;
run;

data WORK.ProcSumOut2;
set WORK.ProcSumOut ;
sum2=sum/n;
run;

Hope this is what you are expecting.
Thanks,
Jag
Frequent Contributor
Posts: 110

Re: dividing summary statistic by noobs in proc summary

Posted in reply to Jagadishkatam

Sorry but your suggestion does not produce the desired output of 151.104. However, the following code, which makes use of _FREQ_. does the job:

 

DATA WORK.TestDataSet;
   INPUT Uid Col1 Col2 Target $;
   DATALINES;
	1 5.1 3.5 13.12345
	2 4.9 3.15 2.1345
;

DATA WORK.TestDataSet;
    SET WORK.TestDataSet;
    MSE = (Target - ((Col1 + Col2) + exp(2.345)))**2;
RUN;


proc summary data = WORK.TestDataSet ;
	var MSE;
	output out = WORK.ProcSumOut sum=;
run;

proc print data = WORK.ProcSumOut;
run;

data WORK.ProcSumOut2;
	set WORK.ProcSumOut ;
	MSE=MSE/_FREQ_;
run;

proc print data = WORK.ProcSumOut2;
run; 
Super User
Posts: 19,822

Re: dividing summary statistic by noobs in proc summary

Posted in reply to csetzkorn


proc means data = WORK.TestDataSet ;
var MSE;
output out = WORK.ProcSumOut sum= Mean= / autoname;
run;

Solution
‎12-19-2016 09:32 AM
Super User
Posts: 19,822

Re: dividing summary statistic by noobs in proc summary

Posted in reply to csetzkorn

 Unless you have missing values the average is the sum divided by the number of observations so you can calculate your statistics by specifying the mean option

Super User
Posts: 5,511

Re: dividing summary statistic by noobs in proc summary

Posted in reply to csetzkorn

If for some reason you want what you are asking for, it's already there.  The output data set automatically contains _FREQ_.  It is the total number of observations (including those that have missing values, so using it in calculations may not give you the mean exactly).

 

You are specifying that the output data set should contain the SUM.  You can also ask for other statistics, such as MEAN (as Reeza mentioned), N (number of observations with a non-missing value), and NMISS (number of observations with a missing value).

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 5 replies
  • 293 views
  • 2 likes
  • 4 in conversation