DATA Step, Macro, Functions and more

Proc Tabulate and subsets of a variable

Reply
Contributor
Posts: 24

Proc Tabulate and subsets of a variable

A client has requested a table comparing national data to two different levels of subsetting--a single state, and a single city--all in one table. Is there a way to do this in a single Proc Tabulate step, or do I need to run it multiple times and collect the data together another way (either manually in Excel or through ODS)?

Example desired output:

        | National | State | City

Var1|                |          |

Var2|                |          |

Var3|                |          |

etc...

Super User
Posts: 11,343

Re: Proc Tabulate and subsets of a variable

You don't mention any variable names or requested statistics but somethhing like this might get you started:

proc tabulate data=have;

     class national;

     class state;

     class city;

     var var1 var2 var3 ;

     table (var1 var2 var3)* (sum mean stddev),  /* pick your statistics*/

           National state city;

run;

This will create a separate column under state for each level of the state variable in the data set and similar for city. I suspect that may be a bit cumbersome. Unless your variables outnumber the total number of states and cities you might suggest that the row variables should be state and city with the other variables as columns.

If  "National" could be the summary of all of the data in your set, it doesn't need to be a different variable the ALL class would suffice as it will summarize all records, just add a label to reflect what it means.

instead of National state city the syntax would be:

     all state city;

If your location variables are a single variable that contains sufficient information to identify state and city you may be able to use a multilabel format.

You can also create multiple tables in a single call to proc tabulate, just add an additional Table statement. It may make more sense to have separate tables for national , state and city.

With out more details of your data structure, coding and actual desired layout that's about all I can provide at this time.

Contributor
Posts: 24

Re: Proc Tabulate and subsets of a variable

My code as it stands now:

proc tabulate data=booksales format=comma8.1;
    by year;
    class books;
    table all='All Sales' books='', all="National"*(n colpctn='%') /printmiss;
    format books books.;
run;
proc tabulate data=booksales format=comma8.1;
    by year;
    class books state;
    table all='All Sales' books='', state=""*(n colpctn='%') /printmiss;
    format books books. state state.;
    where state=10;
run;
proc tabulate data=booksales format=comma8.1;
    by year;
    class books city;
    table all='All Sales' books='', city=""*(n colpctn='%') /printmiss;
    format books books. city city.;
    where city=83;
run;

So, this gives me all of the data I need, but it puts it into separate outputs, which makes it harder to efficiently import into Excel or a .pdf. For this particular project it's doable (they only want one city in one state for comparison), but for anything larger it would quickly become time consuming. I know enough to write out this statement:

proc tabulate data=booksales format=comma8.1;

    by year;

    class books state city;

    table all="All Sales" books="", (all="National" state="" city="")*(n colpctn="%")/printmiss;

    format books books. state state. city city.;

run;

But I don't know how (or if it's even possible) to selectively limit the analyses. Unless I've missed something, I know just putting in a "where city=83" would limit the all and state levels in addition to the city, which would make it kind of useless.

Super User
Posts: 19,869

Re: Proc Tabulate and subsets of a variable

I can't quite visualize your requirements, sample have/want are always helpful.

I *think* because you're aggregating all  your totals in one place and want selective output you'll need to pre-process the data and then output it.  Since you're calculations include N and colpctn that shouldn't be too difficult, via proc freq.

Super User
Posts: 11,343

Re: Proc Tabulate and subsets of a variable

Separate where clauses will require separate tabulate calls unless you do a lot of preprocessing.

If the issue is getting data into Excel or PDF then an ODS 'sandwich' will send all of the output to a file.

ods pdf file="c:\myfolder\report.pdf" ;

<tabulate calls>

ods pdf close;

Excel has a couple of different ways either ODS Tagsets.excelxp or the MS office tagset.

Ask a Question
Discussion stats
  • 4 replies
  • 663 views
  • 0 likes
  • 3 in conversation