## Column totals

Solved
Frequent Contributor
Posts: 124

# Column totals

I am attempting to create column totals for a few variables in a dataset. I want to store these totals in separate variables so I can use them in a new data step. I understand proc print has a sum statement that computes the total, but it doesn't store that total in a variable.

Accepted Solutions
Solution
‎03-08-2016 01:46 PM
Posts: 1,243

## Re: Column totals

[ Edited ]

PROC SUMMARY is more suitable for this purpose than PROC PRINT.

Edit:

Example:

proc summary data=sashelp.class;
var age weight height;
output out=totals sum=;
run;

The above step computes the totals of variables AGE, WEIGHT and HEIGHT (as listed in the VAR statement) across all observations in dataset SASHELP.CLASS and stores the results in variables of the same names in a dataset which I named TOTALS (see OUTPUT statement). The totals are requested by the SUM= option of the OUTPUT statement. (After the equals sign you could specify different variable names for the totals, if desired, e.g., sum=tot_age tot_wgt tot_hgt.)

All Replies
Solution
‎03-08-2016 01:46 PM
Posts: 1,243

## Re: Column totals

[ Edited ]

PROC SUMMARY is more suitable for this purpose than PROC PRINT.

Edit:

Example:

proc summary data=sashelp.class;
var age weight height;
output out=totals sum=;
run;

The above step computes the totals of variables AGE, WEIGHT and HEIGHT (as listed in the VAR statement) across all observations in dataset SASHELP.CLASS and stores the results in variables of the same names in a dataset which I named TOTALS (see OUTPUT statement). The totals are requested by the SUM= option of the OUTPUT statement. (After the equals sign you could specify different variable names for the totals, if desired, e.g., sum=tot_age tot_wgt tot_hgt.)

Frequent Contributor
Posts: 124

## Re: Column totals

Thank you, this was very helpful. I have one more question going off of this. Take a look at my code:

data Total_Savings_2015_prep;
if SPECIALTY = 'N' and BRAND_GENERIC = 'Brand' and ApprovedPriceType = 'AWP';
run;

proc summary data=Total_Savings_2015_prep;
output out=totals_sum_2015 sum=ING DF RX;
run;

proc summary data=Total_Savings_2015_prep;
var AWP_Rate ApprovedDispensingFeeAmount;
run;

data Total_Savings_2015;
set totals_sum_2015;
set totals_mean_2015;
run;

You'll notice in the first data step, I am subsetting for certain values of different variables. I would like these variables to show up in the final dataset so that my output would look something like this:

SPECIALTY      BRAND_GENERIC      ApprovedPriceType      ING        DF        RX      Avg_AWP_Rate      Avg_ADFA

N                        Brand                              AWP             \$48,923     \$2.00      32               18%                     \$2.00

The data in blue is what I already have in that last dataset.

Posts: 1,243

## Re: Column totals

You're welcome.

You could create Total_Savings_2015 (incl. the desired additional variables) in one step from EGTASK.FL_INSTATE_SAVINGS_2015:

where SPECIALTY = 'N' and BRAND_GENERIC = 'Brand' and ApprovedPriceType = 'AWP';
id SPECIALTY BRAND_GENERIC ApprovedPriceType;
var SUM_OF_APPROVEDINGREDIENTCOST SUM_OF_ADFA SUM_OF_RX AWP_Rate ApprovedDispensingFeeAmount;
output out=Total_Savings_2015(drop=_:)
run;

The WHERE statement restricts the summary statistics to the subset of observations which you called Total_Savings_2015_prep. The ID statement brings in the additional variables (actually the "maximum values" of these, but this is necessarily the single combination selected by the WHERE statement). The VAR statement lists all analysis variables. They are assigned to the two different statistics in the OUTPUT statement. Finally, the DROP= dataset option drops all variables whose names start with an underscore, i.e. the two default variables _TYPE_ and _FREQ_, from the output dataset (assuming you don't need them).

Frequent Contributor
Posts: 124

## Re: Column totals

Say I have multiple proc summaries with different criteria in the where statement to compute, like this:

where Business_Line = 'Commercial' and Channel = 'Retail' and CLIENT = 'FL' and NETWORK = 'FL INSTATE' and Specialty_Flag = 'Non-Specialty' and BRAND_GENERIC = 'Brand' and ApprovedPriceType = 'AWP';
id Business_Line Channel CLIENT NETWORK Specialty_Flag BRAND_GENERIC ApprovedPriceType;
var ApprovedIngredientCost SUM_OF_ADFA SUM_OF_RX AWP_Rate ApprovedDispensingFeeAmount;
output out=Total_Savings_FL_2015(drop=_:)
run;

where Business_Line = 'Commercial' and Channel = 'Retail' and CLIENT = 'FL' and NETWORK = 'FL INSTATE' and Specialty_Flag = 'Non-Specialty' and BRAND_GENERIC = 'Generic' and ApprovedPriceType = 'AWP';
id Business_Line Channel CLIENT NETWORK Specialty_Flag BRAND_GENERIC ApprovedPriceType;
var ApprovedIngredientCost SUM_OF_ADFA SUM_OF_RX AWP_Rate ApprovedDispensingFeeAmount;
output out=Total_Savings_FL_2015_2(drop=_:)
run;

where Business_Line = 'Commercial' and Channel = 'Retail' and CLIENT = 'FL' and NETWORK = 'FL INSTATE' and Specialty_Flag = 'Non-Specialty' and BRAND_GENERIC = 'Generic' and index(ApprovedPriceType,'MAC') > 0;
id Business_Line Channel CLIENT NETWORK Specialty_Flag BRAND_GENERIC ApprovedPriceType;
var ApprovedIngredientCost SUM_OF_ADFA SUM_OF_RX AWP_Rate ApprovedDispensingFeeAmount;
output out=Total_Savings_FL_2015_3(drop=_:)
run;

Would there be a more efficient way to cycle through the different cases (such as using a macro or a loop)?

Posts: 1,243

## Re: Column totals

Yes, sure. You could use a macro to make the varying parts of the program code flexible. I'd suggest to make also the year flexible (so you can reuse the code until 2099 :-)).

%macro summ(yr=, crit=, suffix=);
where Business_Line = 'Commercial' and Channel = 'Retail' and CLIENT = 'FL' and NETWORK = 'FL INSTATE' and Specialty_Flag = 'Non-Specialty' and &crit;
id Business_Line Channel CLIENT NETWORK Specialty_Flag BRAND_GENERIC ApprovedPriceType;
var ApprovedIngredientCost SUM_OF_ADFA SUM_OF_RX AWP_Rate ApprovedDispensingFeeAmount;
output out=Total_Savings_FL_20&yr.&suffix(drop=_:)
run;
%mend summ;

To execute the original three PROC SUMMARY steps, you just call the macro with the appropriate parameters:

%summ(yr=15, crit=BRAND_GENERIC = 'Brand' and ApprovedPriceType = 'AWP')
%summ(yr=15, crit=BRAND_GENERIC = 'Generic' and ApprovedPriceType = 'AWP', suffix=_2)
%summ(yr=15, crit=BRAND_GENERIC = 'Generic' and index(ApprovedPriceType,'MAC'), suffix=_3)

The first macro call doesn't specify parameter SUFFIX. As no default value for this parameter has been defined, nothing is appended to the output dataset name (as desired).

In the third macro call I have removed the "> 0" just for abbreviation, because it is redundant.

For debugging purposes you can use system option MPRINT to see the program code generated by a macro call in the log.

Frequent Contributor
Posts: 124

## Re: Column totals

Thank you, much appreciated.

Frequent Contributor
Posts: 124

## Re: Column totals

Okay. so I have a new variation of things, with slight differences:

where Business_Line = 'Commercial' and Channel = 'Retail' and CLIENT = 'FL' and NETWORK = 'FL INSTATE' and Specialty_Flag = 'Non-Specialty' and BRAND_GENERIC = 'Generic' and index(ApprovedPriceType,'MAC') > 0;
id Business_Line Channel CLIENT NETWORK Specialty_Flag BRAND_GENERIC ApprovedPriceType;
var ApprovedIngredientCost SUM_OF_ADFA SUM_OF_RX AWP_Rate ApprovedDispensingFeeAmount;
output out=Total_Savings_FLv5_2015_3(drop=_:)
run;

where Business_Line = 'Commercial' and Channel = 'Retail' and CLIENT = 'FL' and NETWORK = 'FL INSTATE' and Specialty_Flag = 'Specialty' and BRAND_GENERIC = 'Generic' and index(ApprovedPriceType,'MAC') > 0;
id Business_Line Channel CLIENT NETWORK Specialty_Flag BRAND_GENERIC ApprovedPriceType;
var ApprovedIngredientCost SUM_OF_ADFA SUM_OF_RX AWP_Rate ApprovedDispensingFeeAmount;
output out=Total_Savings_FLv5_2015_6(drop=_:)
run;

I've attempted to write a macro for these two and it looks like this (the main new change is that the output dataset name now has an "infix", i.e. "v5"), so I added a macro variable called infix:

%macro summ(yr=, crit=, infix=, suffix=);
where Business_Line = 'Commercial' and Channel = 'Retail' and CLIENT = 'FL' and NETWORK = 'FL INSTATE' and BRAND_GENERIC = 'Generic' and index(ApprovedPriceType,'MAC') > 0 and &crit;
id Business_Line Channel CLIENT NETWORK Specialty_Flag BRAND_GENERIC ApprovedPriceType;
var ApprovedIngredientCost SUM_OF_ADFA SUM_OF_RX AWP_Rate ApprovedDispensingFeeAmount;
output out=Total_Savings_FL&infix_20&yr.&suffix(drop=_:)
run;
%mend summ;

%summ(yr=15, crit=Specialty_Flag = 'Non-Specialty', infix=v5)
%summ(yr=15, crit=Specialty_Flag = 'Specialty', infix=v5, suffix=_2)

However, when I run it, I get an error on the &infix statement:

Business_Line Channel CLIENT NETWORK Specialty_Flag BRAND_GENERIC ApprovedPriceType;   var ApprovedIngredientCost
38       ! SUM_OF_ADFA SUM_OF_RX AWP_Rate ApprovedDispensingFeeAmount;   output out=Total_Savings_FL&infix_20&yr.&suffix(drop=_:)
_
22
38       !        sum
ERROR 22-322: Syntax error, expecting one of the following: ;, (, /, CSS, CV, IDGROUP, IDGRP, KURTOSIS, LCLM, MAX, MAXID, MEAN,
MEDIAN, MIN, MINID, MODE, N, NMISS, OUT, P1, P10, P20, P25, P30, P40, P5, P50, P60, P70, P75, P80, P90, P95, P99,
PROBT, Q1, Q3, QRANGE, RANGE, SKEWNESS, STDDEV, STDERR, SUM, SUMWGT, T, UCLM, USS, VAR.

NOTE: Line generated by the invoked macro "SUMM".
38          Business_Line Channel CLIENT NETWORK Specialty_Flag BRAND_GENERIC ApprovedPriceType;   var ApprovedIngredientCost
38       ! SUM_OF_ADFA SUM_OF_RX AWP_Rate ApprovedDispensingFeeAmount;   output out=Total_Savings_FL&infix_20&yr.&suffix(drop=_:)
_
76
38       !        sum
ERROR 76-322: Syntax error, statement will be ignored.

My guess is that I'm missing something simple here.

Thanks,

JediApprentice

Posts: 1,243