Help using Base SAS procedures

Using Proc Tabulate

Reply
N/A
Posts: 0

Using Proc Tabulate

I inherited the following program but want to modify it so that I can have
an output line listing per school with the following information on each line:

coname type distname schname total #PBE %PBE.

How can I do that?

title2 'TABLE 1: TOTAL ENROLLMENT AND STATUS';
proc tabulate data=kasch_update noseps;
class coname type distname schname;
var total pbe;
format type $type.;
table coname='COUNTY:'*type='SCHOOL TYPE:'*distname='SCHOOL DISTRICT:',
schname='SCHOOL NAME',
total*f=comma9.
pbe*f=comma9.*(sum pctsum*f=comma9.2)
/ row=float rtspace=50;
label
total = 'TOTAL PEOPLE'
pbe = 'ENTRANTS WITH PBE';
keylabel sum = 'NUMBER' pctsum = 'PERCENT';
run;

Message was edited by: Teresa
SAS Super FREQ
Posts: 8,868

Re: Using Proc Tabulate

Posted in reply to deleted_user
Hi, Teresa:
I'm not sure what you mean by an "output line listing" -- do you mean you do NOT want a PROC TABULATE table?

Currently, it looks like you have a 3 dimensional table:
[pre]
table
coname='COUNTY:'*type='SCHOOL TYPE:'*distname='SCHOOL DISTRICT:', /* PAGE dimension */
schname='SCHOOL NAME', /* ROW dimension */
total*f=comma9. pbe*f=comma9.*(sum pctsum*f=comma9.2) /* COL dimension */
/ row=float rtspace=50;

[/pre]

Or do you mean that you want a different kind of report from Proc Tabulate, perhaps a report without the PAGE dimension. right now, it looks like you have a separate logical page for every unique combination of county, school type, and district. Then you should have school name going down the rows with total and the sum of PBE and PCTSUM of PBE going across the columns.

Proc Tabulate uses table operators to arrange the structure of the table. So, every comma (,) means start a new dimension. If you have 2 commas in your table statement, then you would have 3 dimensions -- page, row and column. If you have 1 comma in your table statement, then you would have 2 dimensions, row and column. And if you have no commas in your table statement, then you would have only the column dimension in your table. The asterisk operator (*) tells Proc Tabulate to "nest" or "stack" columns...

So, if you wanted to simply eliminate the PAGE dimension, you could change the comma after SCHOOL DISTRICT to an asterisk:
[pre]
table coname='COUNTY:'*type='SCHOOL TYPE:'*distname='SCHOOL DISTRICT:' * schname='SCHOOL NAME',
total*f=comma9. pbe*f=comma9.*(sum pctsum*f=comma9.2)
/ row=float rtspace=50;

[/pre]

However, that would leave school nested within district nested within type nested within county. If you mean you want a simple proc print listing, then you
could just run a PROC PRINT with the appropriate VAR statement -- but PROC PRINT will not calculate summaries and/or percents for you.

If you still want a summary report -- only "flattened", then you might try PROC REPORT with the GROUP usage for county, type, district and school name. However, since you are using a special denominator in TABULATE, you may need to get into issues of using COMPUTE blocks with PROC REPORT. For that, you might be better off contacting Tech Support for help.

Yet another option is to have PROC TABULATE create an output data set for you by modifying the PROC TABULATE statement:
[pre]
proc tabulate data=kasch_update noseps out=work.tabout;

[/pre]

And in this case, a simple PROC PRINT on WORK.TABOUT may be what you want. You will probably have to rename some of the calculated variables in the WORK.TABOUT dataset because PROC TAB does not use your LABEL statement to name the variables.TABULATE also adds some automatic variables to the output dataset _TYPE_, _PAGE_ and _TABLE_ that are explained in the TABULATE documentation, but you can pretty much figure out what they mean by comparing your TABLE statement to the output dataset.

cynthia
N/A
Posts: 0

Re: Using Proc Tabulate

Posted in reply to Cynthia_sas
Hi Cynthia:

Thanks so very much! I appreciate your time. You are a great help!

teresa
SAS Super FREQ
Posts: 8,868

Re: Using Proc Tabulate

Posted in reply to deleted_user
Hi, Teresa:
PROC TABULATE is very cool. I find that it does help to work backwards from a picture of the table I want to the table statement syntax.

cynthia
N/A
Posts: 0

Re: Using Proc Tabulate

Posted in reply to Cynthia_sas
Hi Cynthia:

Thanks so much!!

teresa
N/A
Posts: 0

Re: Using Proc Tabulate

Posted in reply to deleted_user
How I can determine which observation is missing after I run Proc tabulate? The original dataset has one more observation than the dataset that Proc tabulate outputs. Thanks,
SAS Super FREQ
Posts: 8,868

Re: Using Proc Tabulate

Posted in reply to deleted_user
Hi:
Remember that PROC TABULATE is a SUMMARY procedure. It summarizes your data according to the class variables you give it. What if I have this file:
[pre]
grp subgrp amt
aa ww 1
aa ww 2
bb xx 3
cc yy 4
dd zz 5
[/pre]

It has 5 observations. The way PROC TABULATE would work is that if I ran this PROC TABULATE program:
[pre]
proc tabulate data=sales out=work.tabout;
title 'Proc Tabulate';
class grp subgrp;
var amt;
table grp*subgrp,
amt*(n sum pctsum);
run;

proc print data=work.tabout;
title 'output dataset from proc tabulate has 4 observations';
run;

[/pre]

Then PROC TABULATE would "collapse" or summarize the data based on the unique combinations of grp*subgrp that I had. The output from the PROC PRINT would be:
[pre]
output dataset from proc tabulate has 4 observations

amt_
PctSum_
Obs grp subgrp _TYPE_ _PAGE_ _TABLE_ amt_N amt_Sum 00

1 aa ww 11 1 1 2 3 20.0000
2 bb xx 11 1 1 1 3 20.0000
3 cc yy 11 1 1 1 4 26.6667
4 dd zz 11 1 1 1 5 33.3333

[/pre]

This does not indicate that an observation is missing, just that, given my table statement and my combination of CLASS Variables, PROC TABULATE performed some summarizing.

Look at the amt_N column ... 3 of the observations in the output dataset only had 1 observation contributing to the statistics, but GRP=AA and SUBGRP=WW statistics were derived from 2 observations.

Perhaps I misunderstood your original question. Since you had working PROC TABULATE code, I thought you wanted the summarizing that TABULATE does and just wanted an output table that was the equivalent of the TABULATE summary.

If you do not want ANY summarization at all, then you should not use PROC TABULATE, you should use either PROC PRINT or PROC REPORT to create a "detail" or "listing" report.

You could get this kind of report from PROC REPORT using the above 5 lines of data.
[pre]
detail listing with pctsum

Sum of Percent
grp subgrp Amt Sum
aa ww 1 6.67%
aa ww 2 13.33%
bb xx 3 20.00%
cc yy 4 26.67%
dd zz 5 33.33%
========= =========
15 100.00%
========= =========

[/pre]

If you do not want PROC TABULATE to summarize your data, then you might wish to review what other procedures can do for your in regards to reports by reading the SAS documentation or contacting Tech Support for more help.

cynthia

Here's the program that I used to generate the above reports. In addition, there's an extra example using data called DIFFSALES -- which collapses down from 5 observations to 2 observations in the output dataset:
[pre]
data sales;
infile datalines;
input grp $ subgrp $ amt;
datalines;
aa ww 1
aa ww 2
bb xx 3
cc yy 4
dd zz 5
;
run;

ods listing;
option nocenter;

proc print data=sales;
title 'original data set has 5 observations';
run;

proc report data=sales nowd;
title 'detail listing with pctsum';
column grp subgrp amt amt=pamt;
define grp / display;
define subgrp / display;
define amt / sum 'Sum of Amt';
define pamt / pctsum 'Percent Sum' f=percent9.2;
rbreak after / summarize dol dul;
run;

proc tabulate data=sales out=work.tabout;
title 'Proc Tabulate';
class grp subgrp;
var amt;
table grp*subgrp,
amt*(n sum pctsum);
run;

proc print data=work.tabout;
title 'output dataset from proc tabulate has 4 observations';
run;

data diffsales;
infile datalines;
input grp $ subgrp $ amt;
datalines;
aa ww 1
aa ww 2
bb xx 3
bb xx 4
bb xx 5
;
run;

proc tabulate data=diffsales out=work.diffout;
title 'Proc Tabulate Using DIFFSALES data';
class grp subgrp;
var amt;
table grp*subgrp,
amt*(n sum pctsum);
run;

proc print data=work.diffout;
title 'DIFFOUT output dataset from proc tabulate has 2 observations';
run;
[/pre]
N/A
Posts: 0

Re: Using Proc Tabulate

Posted in reply to Cynthia_sas
Hi Cynthia:

Thanks so much for your helpful explanations! I appreciate your time!!

I see what you mean if there are two 'aa's in the dataset. You helped me figure out the problem! There was a duplicate observation in my dataset!

Thanks so much!!
teresa
Ask a Question
Discussion stats
  • 7 replies
  • 305 views
  • 0 likes
  • 2 in conversation