Help using Base SAS procedures

PROC TABULATE Sum statistic

Reply
N/A
Posts: 0

PROC TABULATE Sum statistic

Hello,

I have just noticed that I was doing something wrong with my PROC TABULATE.

Basically, I would like the newly created variable "Rate" to calculate predict /actual column by column (INCLUDING the TOTAL column). It does it fine until the TOTAL column, where SAS sums up all the Rates instead of dividing predict/actual. If you run the code, you'll find out that the rate for Canada 1997 = 5.31 but I woul like it to read 1.06 (1,889,464 / 1,781,102).

Is there another statistic that could help me with this? See my code below :

[pre]
PROC TABULATE
DATA=SASHELP.PRDSAL3
OUT=WORK.SALE1;

TITLE;
TITLE1 "Sale #1";

VAR predict actual;
CLASS country / ORDER=UNFORMATTED MISSING;
CLASS year / ORDER=UNFORMATTED MISSING;
CLASS product / ORDER=UNFORMATTED MISSING;
TABLE

/* Row Dimension */
year={LABEL=''}*(product={LABEL=''} ALL={LABEL='TOTAL'}),

/* Column Dimension */
country*(predict*sum={label=''} actual*Sum={LABEL=''})


/ BOX={LABEL=_PAGE_} PRINTMISS MISSTEXT='0' INDENT=0 ;
;
RUN;



DATA WORK.SALE2;
SET WORK.SALE1;

Rate=predict_sum/actual_sum;

RUN;



PROC TABULATE
DATA=WORK.SALE2;

TITLE1 "Sale #3";

VAR predict_sum actual_sum rate;
CLASS country / ORDER=UNFORMATTED MISSING;
CLASS year / ORDER=UNFORMATTED MISSING;
CLASS product / ORDER=UNFORMATTED MISSING;
TABLE

/* Row Dimension */
year={LABEL=''}*(product={LABEL=''} ALL={LABEL='TOTAL'}),

/* Column Dimension */
country*(predict_sum*sum={label=''} actual_sum*Sum={LABEL=''} rate*sum={LABEL=''})


/ BOX={LABEL=_PAGE_} PRINTMISS MISSTEXT='0' INDENT=0 ;
;
RUN;
[/pre]

Edit : I'm sorry code looks funny...I took the time to format it all in SAS, but apparently the [ pre ] option does not like it...hope it is still readable. Message was edited by: SamuelG.
SAS Super FREQ
Posts: 8,742

Re: PROC TABULATE Sum statistic

Hi:
Even though you calculate RATE in a DATA step, by the time PROC TABULATE gets RATE, there is no longer any visibility or awareness of RATE's underlying division when TABULATE builds your report #3. You ask for RATE*SUM in your report #3 and that's what it sounds like TABULATE is giving you.

I don't believe that TABULATE is doing anything wrong -- it's giving you the statistics you asked for.

Although you can use PCTSUM with a custom denominator in TABULATE, I doubt that, in this circumstance, it will give you the number you want -- on just the one row. You've already calculate the RATE for the other rows and there's not a good way to get only the Total row to be the result of the division. You're using 2 TABULATES and a DATA step -- essentially reading and processing the same data 3 times. You could generate your report in 1 PROC REPORT step with COUNTRY as an ACROSS variable and using a COMPUTED report item for a calculate RATE item.

The program below shows two different versions of a PROC REPORT -- one which shows the year total at the bottom of the products (1) and another which shows the year total at the top of the products (2).

cynthia

[pre]
ods html file='c:\temp\tworep.html' style=egdefault;

proc report data=sashelp.prdsal3 nowd nocenter;
column year product country,(predict actual rate);
TITLE1 "1) Sale and Rate with PROC REPORT and Compute AFTER";
define year / group noprint;
define product / group;
define country / across;
define predict / sum f=comma15.2;
define actual / sum f=comma15.2;
define rate / computed f=7.2;
compute rate;
_c5_ = _c3_ / _c4_;
_c8_ = _c6_ / _c7_;
_c11_ = _c9_ / _c10_;
endcomp;
break after year / summarize;
compute after year;
product = put(year,4.)||' '||'Total';
line ' ';
endcomp;
run;

proc report data=sashelp.prdsal3 nowd nocenter;
column year product country,(predict actual rate);
TITLE1 "2) Sale and Rate with PROC REPORT and Compute BEFORE";
define year / group noprint;
define product / group;
define country / across;
define predict / sum f=comma15.2;
define actual / sum f=comma15.2;
define rate / computed f=7.2;
compute rate;
_c5_ = _c3_ / _c4_;
_c8_ = _c6_ / _c7_;
_c11_ = _c9_ / _c10_;
endcomp;
break before year / summarize;
compute before year;
product = put(year,4.)||' '||'Total';
endcomp;
compute after year;
line ' ' ;
endcomp;
run;
ods html close;

[/pre]
N/A
Posts: 0

Re: PROC TABULATE Sum statistic

Cynthia,

Sadly, it' even more complicated than that...Maybe there's a better way to achieve what I want to do, but I went with things I was comfortable with.The reason I'm using DATA statements is that I want to create conditional variables.

Let me give you a more complete example of what I want...It's the actual code I'm using, but adapted for PRDSAL3 dataset. As you'll see, I'm creating another variable (FINAL_INDEX), which is dependant on the country.

Here's the code:

[pre]

ODS _ALL_ close;
PROC TABULATE
DATA=SASHELP.PRDSAL3
OUT=WORK.Summarytable1;

TITLE;
TITLE1 "Summary table 1";

VAR predict actual;
CLASS country / ORDER=UNFORMATTED MISSING;
CLASS year / ORDER=UNFORMATTED MISSING;
CLASS product / ORDER=UNFORMATTED MISSING;
TABLE

/* Row Dimension */
year={LABEL=''}*(product={LABEL=''}),

/* Column Dimension */
country*(predict*sum={label=''} actual*Sum={LABEL=''})

/ BOX={LABEL=_PAGE_} PRINTMISS MISSTEXT='0' INDENT=0 ;
RUN;
DATA WORK.SALE2;
SET WORK.Summarytable1;

IF country = 'Canada' THEN DO;
IF actual_Sum=0 THEN Rate_Can=.;
IF actual_Sum>0 THEN Rate_Can=predict_sum/actual_sum;
END;
IF country = 'Mexico' THEN DO;
IF actual_Sum=0 THEN Rate_Mex=.;
IF actual_Sum>0 THEN Rate_Mex=predict_sum/actual_sum;
END;
IF country = 'U.S.A.' THEN DO;
IF actual_Sum=0 THEN Rate_USA=.;
IF actual_Sum>0 THEN Rate_USA=predict_sum/actual_sum;
END;
RUN;
PROC TABULATE
DATA=WORK.SALE2
OUT=WORK.Summarytable2;

TITLE1 "Summary table 2";

VAR predict_sum actual_sum rate_Can Rate_Mex Rate_USA;
CLASS country / ORDER=UNFORMATTED MISSING;
CLASS year / ORDER=UNFORMATTED MISSING;
CLASS product / ORDER=UNFORMATTED MISSING;
TABLE

/* Row Dimension */
year={LABEL=''}*(product={LABEL=''}),

/* Column Dimension */
country*
(predict_sum*sum={label=''}
actual_sum*Sum={LABEL=''}
rate_Can*sum={LABEL=''}
rate_mex*sum={LABEL=''}
rate_USA*sum={LABEL=''})

/ BOX={LABEL=_PAGE_} PRINTMISS MISSTEXT='0' INDENT=0 ;
RUN;
DATA WORK.SALE4;
SET WORK.Summarytable2;

IF country='Canada' THEN
key = year!!product;

/***** INDEX *****/

INDEX_CAN = Rate_Can_Sum;
RUN;
Proc Sort Data=WORK.SALE4;
by key;
RUN;
DATA WORK.SALE5;
SET WORK.Summarytable2;
key = year!!product;
/***** CALCULATION OF RATE *****/
FINAL_RATE=Sum(Rate_can_Sum, Rate_Mex_Sum, Rate_USA_Sum);
RUN;
PROC SORT Data=WORK.SALE5;
by key;
RUN;

DATA WORK.SALE6;
MERGE WORK.SALE4 (IN=A)
WORK.SALE5 (IN=B);
BY key;
IF B;
RUN;
DATA WORK.SALE7;
SET WORK.SALE6;
IF INDEX_CAN = 0 THEN FINAL_INDEX=.;
ELSE IF INDEX_CAN >0 THEN DO;
IF FINAL_RATE=>0 THEN FINAL_INDEX=FINAL_RATE/INDEX_CAN;
END;
RUN;
ODS html file='c:\temp\thisismyfinaltable.html' style=egdefault;
PROC TABULATE
DATA = WORK.SALE7;

TITLE1 "FINAL Summary Table";

VAR predict_sum_sum actual_sum_sum FINAL_RATE FINAL_INDEX;

CLASS country / ORDER=UNFORMATTED MISSING;
CLASS year / ORDER=UNFORMATTED MISSING;
CLASS product / ORDER=UNFORMATTED MISSING;

TABLE

/* Row Dimension */
year={LABEL=''}*(product={LABEL=''} ALL={LABEL='TOTAL'}),

/* Column Dimension */
country*
(predict_sum_sum*sum={label=''}
actual_sum_sum*Sum={LABEL=''}
FINAL_RATE*sum={LABEL=''}
final_index*sum={LABEL=''})

/ BOX={LABEL=_PAGE_} PRINTMISS MISSTEXT='0' INDENT=0 ;
RUN;

[/pre]

Thank you for your help Smiley Happy
Super User
Posts: 10,483

Re: PROC TABULATE Sum statistic

SamuelG

If I understand what you are attempting, I would be very tempted to summarize the data using proc summary with the class variables to get the sums for each group and then run the resulting data set through a data step to calculate the rates. Then the _type_ variable can be selected and/or formatted to show the various groups you need. The country/year/product combinations that don't actually have any data to sum could be handled in Tabulate with a preloadfmt option to display blank rows or columns as needed.

proc summary data=sashelp.prdsal3;
class country year product;
var predict actual;
output out=prodsum sum=;
run;

data prodrate;
set prodsum;
rate=predict/actual;
if _type_=6 and product='' then product='All'; /* add a useful description to display below*/
run;

/* this is using tabulate just arrange precalculated values*/
/* shows year and individual products or all products */
proc tabulate data=prodrate;
where _type_ in (6,7);
class country year product/order=unformatted missing;
var predict actual rate;
table year=' '*product=' ',
country*max=' '*(predict actual rate)
/printmiss misstext='0';
run;
Ask a Question
Discussion stats
  • 3 replies
  • 150 views
  • 0 likes
  • 3 in conversation