Proc Tabulate Question

deleted_user · Posted 08-02-2007 11:30 AM

I'm trying to use Proc Tabulate to create a report that will appear as follwos:

I need to have college names over gender types, and then have that over ethnicity types (these will be my row lines), and have all of that listed out by department names going down. I want the report to look like this:

..........................................College Name...................................
..................Male.......................................................Female......
..........Ethnic1 Ethnic2 Ethnic3 .................Ethnic1 Ethnic2 Ethnic3
Dept1.....3......... 4.......... 7..........................0...........10...........11
Dept2.....2..........3...........1.........................99...........2..............5
Dept3.....0.........200........0..........................3............7..............6

I'm trying to wrap my head around what might go on the CLASS line and what goes on the TABLE line. Several things I have tried end up giving me ALL the data straight across as opposed to showing the data in row and column format.

If someone might offer a little advice it would be welcome!! Thanks!

Cynthia_sas · Posted 08-02-2007 12:08 PM

Hi:
Using PROC TABULATE was one of the topics/tutorials that I gave at PharmaSUG a few years ago. If you go to this web site:
http://www.lexjansen.com/cgi-bin/psug.asp?x=ttab&s=psug_t
And look for the presentation, entitled "Good Table Manners with PROC TABULATE", you will find a link to download the code for the presentation. That zip file also contains a PDF file of the slides.

The bottom line with TABULATE is that the CLASS statement sets categories or buckets for collecting observations. The VAR statement sets which variables should be used for analysis in the buckets. If you do not have any VAR statements, then the default statistic which you would get would be the COUNT or N statistic.

The only other thing you have to know about TABULATE is that the TABLE statement, then arranges the CLASS and/or VAR variables in a table structure. In addition, the comma operator is used in the TABLE statement to build dimensions:
[pre]
table region, subsidiary, product;
[/pre]
would have region in the PAGE dimension, subsidiary in the ROW dimension and product in the COLUMN dimension. Then:
[pre]
table region, product;
[/pre]
with ONE comma would have region in the ROW dimension, product in the COLUMN dimension. Another operator (the *) is used to cross or next variables, so for example:
[pre]
table region, subsidiary*product;
[/pre]
would have region in the ROW diimension and subsidiary crossed with product (or you could think of it as product values nested underneath subsidiary values) in the COLUMN dimension. A TABLE statement without ANY commas will give you all your variables in the COLUMN dimension, so at the very least, you have to introduce a COMMA and probably a couple of ASTERISKS in your TABLE statement to get the table structure that you want.

For your table sketch, even though you said that you consider COLLEGE, GENDER and ETHNIC to be "row lines" -- they would be considered COLUMN headings in terms of PROC TABULATE -- therefore, all 3 of those variables would be your COLUMN dimension. The only variable that comprises rows in your sketch are the DEPT variable levels (Dept1, Dept2, etc), therefore, DEPT would be in your ROW dimension:
[pre]
proc tabulate data=mydata format=comma6.;
class college gender ethnic dept;
table dept,
college*gender*ethnic;
run;
[/pre]
or if you had more than 1 college, you might want to make college a PAGE dimension:
[pre]
proc tabulate data=mydata format=comma6.;
class college gender ethnic dept;
table college,
dept,
gender*ethnic /
box=_page_;
run;
[/pre]

BTW, EG has an excellent table designer GUI interface for PROC TABULATE. To get to it you would select DESCRIBE --> SUMMARY TABLES. Once you set your Task Roles (Classification and Analysis variables), then the drag and drop table designer is under the Summary Tables choice in the navigation pane.

If you would prefer to read more about PROC TABULATE, these papers might prove helpful:
http://www2.sas.com/proceedings/sugi25/25/iv/25p159.pdf
http://www.laurenhaworth.com/publications/71-28.pdf
http://www2.sas.com/proceedings/sugi30/179-30.pdf

You'll also have to read in the TABULATE documentation about how missing values are treated and make sure you have the right MISSING option set with TABULATE, if you do have CLASS variables with possible missing values and you want those to show on the table.

cynthia

deleted_user · Posted 08-02-2007 12:51 PM

Thank you so very much for replying as quickly as you did! There certainly is a lot to learn, isn't there? I'll let you know as soon as I have a test report made from Proc Tabulate.

Thanks again!

Cynthia_sas · Posted 08-02-2007 01:57 PM

Yes, there -IS- a lot to learn. I call it "learning to think like SAS". But really, it boils down to learning one procedure at a time or one kind of functionality at a time. There are many resources to help you learn what you need to know (SUGI papers, user-group papers, books by users). We also have classes ... our beginning programming class (SAS Programming I: Essentials) is offered at various training center sites AND as a Live Web class, which you can take from your own computer without needing to travel. The same is true of our EG Query and Reporting class and the SAS Macro class.
http://support.sas.com/training/us/crs/prog1.html
http://support.sas.com/training/us/crs/macr.html
http://support.sas.com/training/us/crs/egqr.html

Pretty cool, huh? Sorry, I couldn't resist a plug for the classes, you gave me the perfect opening!

cynthia

deleted_user · Posted 08-02-2007 03:53 PM

Hi, Cynthia.....

I got the PROC TABULATE working in my stored process!! Thanks for your assistance! I have a question, though..... my report shows field names and the letter N, and I need to know how to remove that type of data. Here is a sketch of my report:

CAMPUS NAME...|........................................Gender_Desc.............................
...........................|...............Female................................................Male
...........................|....Primary_Ethnicity..................|..................Primary Ethnicity
...........................| Indian...Asian...Black................|....Indian...Asian...Black.......
...........................|.....N.........N..........N.................|.......N.........N.........N.
......DeptA............|.....23.......1...........15................|........6..........3.........5
......DeptB............|.....7..........4...........3.................|........0........32.........4

Within that, I want to remove the lines that show Gender_Desc and Primary_Ethnicty. And, will removing those names also remove the areas where those lines were? How do I remove those lines, while leaving the rest of my code intact?

deleted_user · Posted 08-02-2007 04:38 PM

Please disregard my last question....I figured it out and the report is now working!

Proc Tabulate Question

Re: Proc Tabulate Question

Re: Proc Tabulate Question

Re: Proc Tabulate Question

Re: Proc Tabulate Question

Re: Proc Tabulate Question

Proc Tabulate Question

Re: Proc Tabulate Question

Re: Proc Tabulate Question

Re: Proc Tabulate Question

Re: Proc Tabulate Question

Re: Proc Tabulate Question

Registration is open

SAS Training: Just a Click Away