Help using Base SAS procedures

trying to convert a TABULATE statement to REPORT

Super Contributor
Posts: 286

trying to convert a TABULATE statement to REPORT

just now trying to teach myself the basics of PROC REPORT.

proc tabulate data=m_match;
class letter crstitle level;
table crstitle='COURSE'*letter='COURSE GRADE' all='TOTAL'*(f=6.0), level='EOCEP GRADE'*(n*f=6.0 pctn='PCT'*f=6.0)
all='TOTAL'*(f=6.0) / rts=25 box=' Scores 2008-2009';
format schnum $schfmt.;
Posts: 9,370

Re: trying to convert a TABULATE statement to REPORT

Wow -- big question. Basically, in PROC TABULATE, everything happens, very succinctly and elegantly in the TABLE statement, based on the usage (CLASS or VAR) that you specify for a variable. Table operators, such as ',' (comma) determine which variables will be in the ROW dimension and which variables will be in the COLUMN dimension. And, other table operators, like '*' (asterisk) or ' ' (blank) control nesting or stacking of rows and columns in the table. Further, the use of the universal class variable ALL give you the opportunity to get summary information in any dimension, page, row or column. Many people like PROC TABULATE for it's very elegant and sparse, almost mathemetical syntax for describing table dimensions, variables and statistics.

PROC REPORT, on the other hand, is very verbose. That's partly due to the features that it has, which cannot be achieved with any other procedure. PROC REPORT can do detail (or lising reports) like PROC PRINT. PROC REPORT can do summary reports, like PROC MEANS or PROC TABULATE. PROC REPORT can do cross-tabular reports, like PROC TABULATE. In addition, PROC REPORT has COMPUTE block capability, which allows you to compute report items as the report is being built, without making a separate pass through the data to create a new variable.

So, rather than take your table and try to replicate it entirely, let's take a simple example and show the different syntaxes for 1 simple report. You will notice some stylistic differences...colors and fonts, but those would be easily fixed using STYLE= syntax.

ods html file='c:\temp\compare.html' style=sasweb;
proc tabulate data=sashelp.prdsale f=comma12.;
title 'Proc Tabulate';
class region country division;
var actual;
table division*region all='Total',
country*actual=' ' all='Actual Total'*actual=' ';
keylabel sum=' ';

proc report data=sashelp.prdsale nowd;
title 'Proc Report';
column division region actual,country ('Actual' actual=grandtot) ('Projected at' projected);
define division / group;
define region / group;
define country / across;
define actual / sum ' ' f=comma12.;
define grandtot / 'Total' f=comma12.;
define projected / computed '5% Growth' f=comma12.;
compute projected;
projected=grandtot * 1.05;
rbreak after /summarize;
compute after;
division = 'Total';
ods html close;

The default assumption of PROC TABULATE is that you are going to build a SUMMARY report. So all that is needed to get summary rows for region and division are their usage in a CLASS statement in TABULATE. Whereas, for PROC REPORT, you have to explicitly DEFINE that those 2 variables will be used as GROUP items on the report.

PROC TABULATE uses the country*actual in the TABLE statement to make a unique column for every country. PROC REPORT on the other hand needs 2 things to make the crosstabular report work: 1) in the COLUMN statement actual,country alerts PROC REPORT that you will be using COUNTRY as an ACROSS item and using ACTUAL for the values of the cells underneath each country. And, 2) in the DEFINE statement for COUNTRY, it is still necessary to explicitly list ACROSS as the usage for COUNTRY -- that's because an ACROSS usage is not one of the default usages and the COLUMN statement and the DEFINE statement work together to make a unique column for every country and to use the sum of ACTUAL in every cell.

Next, TABULATE uses ALL in the ROW dimension and in the COL dimension to get the TOTAL row and the TOTAL column. PROC REPORT uses the RBREAK AFTER statement to get the TOTAL row at the bottom of the report. But REPORT uses the aliased report item: actual=grandtot to make a TOTAL column on the far right side of the report row. And, the GRANDTOT report item is being labelled and formatted in the DEFINE statement. This allows you to use no label for the first usage of ACTUAL on the report (streamlining the report headers) and use the label 'Total' for the second usage of ACTUAL on the report (at the far right). TABULATE allows you to label ALL as 'Total' in the TABLE statement for both the ROW and the COL totals (or you could have used a KEYLABEL statement).

However, for PROC REPORT, you label the far right column as 'Total' on the DEFINE statement. (The spanning header 'Actual' is put on the report in the COLUMN statement) To label the summary line at the bottom of the report (the RBREAK AFTER line), you need a COMPUTE block that will assign the value 'Total' to the DIVISION cell on that report row.

Other differences that you will note...TABULATE has a big fat empty box on the upper left of the result table and the row headers for DIVISION and REGION are the same color as the column headers for COUNTRY. PROC REPORT doesn't treat the values for DIVISION and REGION the same way, by default. This is changeable, but not shown here.

You can't get rid of the BOX area in a TABULATE report...all you can do is put something there. Notice that PROC REPORT does not have a BOX AREA -- you are laying your report row out from LEFT to RIGHT according to the position of the items in the COLUMN statement -- thus there is no box area to worry about.

The BIG thing that PROC REPORT can do is what you see in the PROJECTED column on the report. The data does NOT have a variable named PROJECTED. PROJECTED is defined with a usage of COMPUTED -- which means that for every report row, PROJECTED will be calculated by multiplying the GRANDTOT item by 1.05. TABULATE cannot do this. PRINT cannot do this. MEANS cannot do this.

That's just the tip of the PROC REPORT iceberg. It is unfair to PROC REPORT (in my opinion) to compare it to TABULATE or vice versa. PROC TABULATE is the "Acme Data Slicer and Dicer Extraordinaire". PROC REPORT is like using LEGO(r) interlocking blocks -- lots of different possibilities to what you can build, and how and not just all summary and not just all crosstabular. Sure, you can make the same looking tables from either one, but the same results are achieved by different methods.

The documentation is the best place to start with PROC REPORT. I've listed some other papers and articles that might be of interest: (originally written for SAS 6, but still one of the best references around. Before this PDF was available, I used to make folks sign a sheet when they "borrowed" my copy of P-258 out of my office.)

Ask a Question
Discussion stats
  • 1 reply
  • 2 in conversation