Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- Base SAS Programming
- /
- dividing summary statistic by noobs in proc summar...

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

12-18-2016 03:21 AM

Using this code:

DATA WORK.TestDataSet; INPUT Uid Col1 Col2 Target $; DATALINES; 1 5.1 3.5 13.12345 2 4.9 3.15 2.1345 ; DATA WORK.TestDataSet; SET WORK.TestDataSet; MSE = (Target - ((Col1 + Col2) + exp(2.345)))**2; RUN; proc summary data = WORK.TestDataSet; var MSE; output out = WORK.ProcSumOut sum=; run; proc print data = WORK.ProcSumOut; run;

I would like to divide the 'computed column' by noobs (the number of rows in WORK.TestDataSet). Is this possible? The result should be in WORK.ProcSumOut.

Accepted Solutions

Solution

12-19-2016
09:32 AM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to csetzkorn

12-18-2016 05:41 AM

All Replies

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to csetzkorn

12-18-2016 03:47 AM

I think we cannot divide the computed column in the proc summary. However in a new data step it is possible.

First we need to get the n from proc summary into the WORK.ProcSumOut. Then in the new data step we need to divide the sum with n.

proc summary data = WORK.TestDataSet ;

var MSE;

output out = WORK.ProcSumOut n= sum=;

run;

data WORK.ProcSumOut2;

set WORK.ProcSumOut ;

sum2=sum/n;

run;

Hope this is what you are expecting.

First we need to get the n from proc summary into the WORK.ProcSumOut. Then in the new data step we need to divide the sum with n.

proc summary data = WORK.TestDataSet ;

var MSE;

output out = WORK.ProcSumOut n= sum=;

run;

data WORK.ProcSumOut2;

set WORK.ProcSumOut ;

sum2=sum/n;

run;

Hope this is what you are expecting.

Thanks,

Jag

Jag

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Jagadishkatam

12-18-2016 10:08 AM

Sorry but your suggestion does not produce the desired output of 151.104. However, the following code, which makes use of _FREQ_. does the job:

DATA WORK.TestDataSet; INPUT Uid Col1 Col2 Target $; DATALINES; 1 5.1 3.5 13.12345 2 4.9 3.15 2.1345 ; DATA WORK.TestDataSet; SET WORK.TestDataSet; MSE = (Target - ((Col1 + Col2) + exp(2.345)))**2; RUN; proc summary data = WORK.TestDataSet ; var MSE; output out = WORK.ProcSumOut sum=; run; proc print data = WORK.ProcSumOut; run; data WORK.ProcSumOut2; set WORK.ProcSumOut ; MSE=MSE/_FREQ_; run; proc print data = WORK.ProcSumOut2; run;

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to csetzkorn

12-18-2016 01:20 PM

proc means data = WORK.TestDataSet ;

var MSE;

output out = WORK.ProcSumOut sum= Mean= / autoname;

run;

Solution

12-19-2016
09:32 AM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to csetzkorn

12-18-2016 05:41 AM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to csetzkorn

12-18-2016 09:34 AM

If for some reason you want what you are asking for, it's already there. The output data set automatically contains _FREQ_. It is the total number of observations (including those that have missing values, so using it in calculations may not give you the mean exactly).

You are specifying that the output data set should contain the SUM. You can also ask for other statistics, such as MEAN (as Reeza mentioned), N (number of observations with a non-missing value), and NMISS (number of observations with a missing value).