Means Sum Statistic

Contributor
Posts: 73

Hello

I am trying to include the SUM statistic along with N, MIN, MAX, STD and MEAN.
Unfortunately, whatever I try appears in the Output window and not in the output dataset.


How can I modify the code below so that the sum appears in the score.TVMean dataset?

proc means data = score.TV;
output out = score.TVMean;
run;

Thanks

Fred
Respected Advisor
Posts: 3,777

Re: Means Sum Statistic

The default output dataset has only N, MEAN, STD, MIN and MAX, and as far as I know that cannot be changed. When I want output like what I think you are after, I just do this...

It works well enough.

[pre]
proc means noprint data = sashelp.class;
output out=stats;
output out=sum sum=;
run;
data stats;
set stats sum(in=in2);
if in2 then _STAT_ = 'SUM';
run;
proc print data=stats;
run;
[/pre]
Contributor
Posts: 73

Re: Means Sum Statistic

Thanks very much
SAS Super FREQ
Posts: 8,743

Re: Means Sum Statistic

Hi:
You need to explicitly name the statistics in your OUTPUT statement. You can list the statistics on the PROC statement, and then refer to them in the OUTPUT statement. Like this:
[pre]
proc means noprint data = sashelp.class n sum mean;
var age height;
class sex;
output out=stats n=acnt hcnt
sum=asum hsum
mean=aavg havg;
run;

proc print data=stats;
run;
[/pre]

and what you will get in the output is:
[pre]
Obs    Sex    _TYPE_    _FREQ_    acnt    hcnt    asum     hsum      aavg       havg

 1               0        19       19      19     253     1184.4    13.3158    62.3368
 2      F        1         9        9       9     119      545.3    13.2222    60.5889
 3      M        1        10       10      10     134      639.1    13.4000    63.9100

[/pre]
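
Applied to the original question, a minimal sketch might look like the following. It assumes score.TV holds only the numeric variables of interest (there is no VAR statement, so every numeric variable is analyzed). It also uses the AUTONAME option, which has not come up in this thread, so that each output variable gets the statistic appended to its name (for example a _Mean or _Sum suffix) instead of the names clashing.
[pre]
proc means data=score.TV noprint;
   /* Request each statistic explicitly; AUTONAME builds a distinct
      output name per variable and statistic to avoid name clashes */
   output out=score.TVMean
          n= min= max= std= mean= sum= / autoname;
run;

proc print data=score.TVMean;
run;
[/pre]
This gives one wide row per _TYPE_ level rather than the stacked _STAT_ layout of the default output dataset.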

cynthia
Contributor
Posts: 73

Re: Means Sum Statistic

Thanks Cynthia
Contributor
Posts: 73

Re: Means Sum Statistic

Hi Cynthia

What part does class sex; play?

Fred
Contributor
Posts: 73

Re: Means Sum Statistic

Hi Cynthia

The output becomes very confusing due to the loss of variable names; I will have 15 variables.
How can I retain the column headings and control whether I get the mean or the sum for each variable?


_NAME_    _TYPE_    _FREQ_    aavg         haavg       asum          hsum
            0         6       922660.25    779695.5    11535961.5    4678173
Apr_01      1         1       923775       743218      1923775       743218
Feb_01      1         1       922003.5     764843      1922003.5     764843
Jan_01      1         1       916867.5     817058      1916867.5     817058
Jun_01      1         1       925262       758343      1925262       758343
Mar_01      1         1       923923       814465      1923923       814465
May_01      1         1       924130.5     780246      1924130.5     780246


Thanks

Fred
Valued Guide
Posts: 2,175

Re: Means Sum Statistic

Hi Fred
Myra and I packaged this process into a macro and presented it a few years ago as a solution to your problem (http://www2.sas.com/proceedings/sugi31/059-31.pdf).

The output would be something like
column name, _freq_, statistic1 .... statisticN
where the names of the columns to be analysed are passed into the macro as a list, or
use "varlst = _ALL_" to analyse all numeric variables.
The list of statistics required is passed as another parameter, like stts = "n min p1 p5 p10 p25 p50 mean p75 p90 p95 p99 max std "
or simplify things with "stts = _ALL_", which reports all available statistics as documented (at that time) in the online help for the OUTPUT statement of PROC MEANS.
A hint of the power, flexibility and simplicity of the macro might be given by its header:
%macro better_means(
    data    = &syslast ,
    out     = ,
    print   = Y,
    sort    = VARNUM,
    stts    = _ALL_,
    varlst  = _ALL_,
    clss    = ,
    wghts   = ,
    testing = no , /* any other value will preserve the _better_: data sets */
    /* default list of statistics when _all_ requested */
    _stts   = N MEAN STD MIN MAX CSS CV LCLM NMISS
              P1 P5 P10 P25 P50 P75 P90 P95 P99 QRANGE RANGE
              PROBT STDERR SUM SUMWGT KURT SKEW T UCLM USS VAR
);

You might notice the CLSS= parameter. This allows you to supply the list of variables to appear on a CLASS statement. Similarly, the WGHTS= parameter defines the WEIGHT variable.
Unfortunately, the macro in the PDF referred to above suffered some publishing problems; a corrected version can be found on the sasCommunity.org site at http://www.sascommunity.org/wiki/PROC_MEANS_-_Improve_on_the_default
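
For anyone who only needs the one-row-per-variable layout and not the full macro, a rough sketch using the STACKODSOUTPUT option is shown below. This is not how the macro works internally. It assumes SAS 9.3 or later, and the dataset name stats_by_var is made up for illustration.
[pre]
/* ODS OUTPUT captures the printed Summary table, so NOPRINT is not used */
ods output summary=stats_by_var;
proc means data=sashelp.class stackodsoutput n mean sum min max;
   var age height weight;
run;

/* One row per analysis variable, one column per requested statistic */
proc print data=stats_by_var;
run;
[/pre]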

peterC
Respected Advisor
Posts: 3,777

Re: Means Sum Statistic

I believe you have a bug with regard to the CLM statistics. Whether the confidence limits are one-tailed or two-tailed is determined by the way you "ask" for them: request just one of UCLM or LCLM for a one-tailed limit, or request both for two-tailed limits.

I believe you coded something similar to this in your macro.
[pre]
output out=LCLM LCLM=;
output out=UCLM UCLM=;
[/pre]

With separate requests like these, you get one-tailed confidence limits.

But if you ask for both, you really want two-tailed confidence limits.

I use dummy variables to achieve the desired result.

[pre]
* Two-tail CLM;
proc means data=sashelp.class lclm uclm;
run;

The MEANS Procedure

                  Lower 95%       Upper 95%
Variable        CL for Mean     CL for Mean
--------------------------------------------
Age              12.5963445      14.0352344
Height           59.8656709      64.8080133
Weight           89.0496312     111.0030004
--------------------------------------------



* Incorrect one-tailed but should be two-tailed as both are requested;
%inc 'better_means.sas';
%better_means(stts=lclm uclm,data=sashelp.class);

Obs    VARNUM    NAME       LCLM       UCLM

 1        3      Age       12.7220     13.910
 2        4      Height    60.2972     64.377
 3        5      Weight    90.9664    109.086


* Correct two-tailed CLM;
proc means noprint data=sashelp.class;
output out=LCLM(drop=_dummy:) LCLM= UCLM=_dummy1-_dummy3;
output out=UCLM(drop=_dummy:) UCLM= LCLM=_dummy1-_dummy3;
run;
data CLM;
set LCLM(in=in1) UCLM(in=in2);
length _STAT_ $8;
if in1 then _STAT_='LCLM';
else if in2 then _STAT_='UCLM';
run;
proc print;
run;

Obs    _TYPE_    _FREQ_      Age       Height     Weight    _STAT_

 1        0        19      12.5963    59.8657     89.050     LCLM
 2        0        19      14.0352    64.8080    111.003     UCLM

[/pre]
SAS Super FREQ
Posts: 8,743

Re: Means Sum Statistic

Hi:
I -think- I understand what you mean when you say "mean for var or sum for var" -- I think you're asking what to do if you want the MEAN for AGE and the SUM for HEIGHT. If so, then try the following amendment to the first code I posted. You can control the specific name that's assigned, and get a statistic for ONLY one variable instead of for all the variables in the VAR statement, by coding a slightly different construction on the OUTPUT statement. In general: stat(varname)=newvarname. So in the example below, I am getting the N for AGE, the SUM for HEIGHT, the MEAN for AGE, and the MEDIAN for both variables, with the two MEDIAN columns keeping the default names (Age and Height).

Consider this code:
[pre]
proc means noprint data = sashelp.class n sum mean median;
var age height;
class sex;
output out=stats2 n(age)=agecnt
sum(height)=htsum
mean(age)=ageavg
median=;
run;

proc print data=stats2;
title '2) Control names selectively and only get some statistics';
title2 'N for AGE, only; SUM for HEIGHT only; MEAN for AGE, only; default names for MEDIAN';
run;
[/pre]

Produces this output:
[pre]
2) Control names selectively and only get some statistics
N for AGE, only; SUM for HEIGHT only; MEAN for AGE, only; default nam

Obs    Sex    _TYPE_    _FREQ_    agecnt     htsum      ageavg     Age     Height

 1               0        19        19      1184.4     13.3158    13.0     62.80
 2      F        1         9         9       545.3     13.2222    13.0     62.50
 3      M        1        10        10       639.1     13.4000    13.5     64.15

[/pre]
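
The same stat(varname)= form also accepts a list of variables, which may help when there are many analysis variables. Here is a small sketch; the output names (age_mean, height_mean, weight_sum) and the dataset name stats3 are chosen just for illustration.
[pre]
proc means noprint data=sashelp.class;
   var age height weight;
   class sex;
   /* MEAN for two variables, SUM for one;
      new names are matched to the listed variables by position */
   output out=stats3 mean(age height)=age_mean height_mean
                     sum(weight)=weight_sum;
run;

proc print data=stats3;
run;
[/pre]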

There's another example in the doc here:
http://support.sas.com/documentation/cdl/en/proc/61895/HTML/default/viewer.htm#/documentation/cdl/en...

Also, you asked what the CLASS statement was doing. The CLASS statement tells PROC MEANS that I want the statistics for all the rows in my DATA= file (_TYPE_=0), plus a separate row of statistics for SEX=F (_TYPE_=1) and a separate row of statistics for SEX=M (_TYPE_=1).

Using a CLASS statement allows me to set categories, in much the same way that the CLASS statement is used for PROC TABULATE: it sets groups or categories so that the statistics can be calculated for the groups separately from the statistics for the whole dataset. For example, you can see that the average age for the whole group of 19 observations is 13.3158, but that the 9 females have an average age of 13.2222, while the average age for the 10 males is 13.40. That particular feature may not be useful to you right now, but it is handy to know about. (Without the CLASS statement, you would only get the _TYPE_=0 row in the output dataset.)
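
If only the per-group rows are wanted, the overall _TYPE_=0 row can be left out. A minimal sketch using the NWAY option (the dataset name stats_nway is just for illustration):
[pre]
proc means noprint data=sashelp.class nway;
   var age height;
   class sex;
   /* NWAY keeps only the rows with the highest _TYPE_ value,
      which here are the two SEX groups */
   output out=stats_nway mean=aavg havg;
run;

proc print data=stats_nway;
run;
[/pre]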

cynthia