Proc means

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 131
Accepted Solution

Proc means

Hi:

Would anyone have suggestions for how I could write code to combine proc freq and proc means into one output dataset which would give me tables organized like the one below?

I'm not sure if it's possible to switch the proc freq so the N is in a row (instead of a column) and then show the mean of Covar1 for all the observations which have the valid values of a variable I'd like to cross tab each covariate with.

Var:ValidValue1Var:ValidValue2Var:ValidValue3Var:ValidValue4
N
Covar1
Covar2
Covar3
Covarn

I appreciate questions and any guidance.


Thanks!


Accepted Solutions
Solution
‎04-06-2015 12:42 PM
Grand Advisor
Posts: 10,210

Re: Proc means

A variable can be either CLASS or VAR but not both. You have that variable as both and the following isn't legal syntax:

var Firm_employee_n*(mean std);

The statistics requested appear in the table statement.

Yes Proc Tabulate generates output data sets add and OUT= clause on the Proc statement.

Proc tabulate data=have out=lib.outset ;

Generate one and look at the resultant set. It isn't always the easiest format to understand because of nesting and the fact that multiple tables can be generated by one proc tabulate call.

View solution in original post


All Replies
Grand Advisor
Posts: 17,338

Re: Proc means

I think you need to post a more detailed example of what you're looking for, sample input and output data in the format you'd like would be helpful.

In general, proc tabulate is a good way to combine something like that OR proc report can also do a lot of summary/counts OR you may need to generate the tables and customize the output.

Grand Advisor
Posts: 10,210

Re: Proc means

I THINK you might actually be looking for a report procedure such as proc tabulate or proc report.

I'm not sure exactly what role N plays in your example. Is it the number of records that have values of the VAR?

I am not going to use as simple VAR as a variable name as it gets confusing in syntax, I am going to call it VAR1.

Proc tabulate data=have;

     class VAR1;

     var Covar: ; /* this shortcut list would have to be replaced with your actual variable names here and in the table below*/

     table n  (covar: *mean=''),

               Var1

      ;

run;

Will give a row with the count of records for each value of Var1 and mean of the Covar variables, one per line.

NOTE: If you have many values for VAR1 and sent the output to RTF or PDF you'll likely encounter table wrapping.

Frequent Contributor
Posts: 131

Re: Proc means

Thanks ballardw! proc tabulate is looking like the better option. Do you know if proc tabulate can create an output dataset?

Also, when I use the following code

proc tabulate data=sample;

class Firm_employee_n ; /*edit covars here*/

var Firm_employee_n*(mean std);

table Firm_employee_n, estabgroup ;

run;

The log error says :

202  var Firm_employee_n*(mean std);

                        -

                        22

                        200

ERROR 22-322: Syntax error, expecting one of the following: a name, ;, -, /, :, _ALL_, _CHARACTER_,

              _CHAR_, _NUMERIC_.

ERROR 200-322: The symbol is not recognized and will be ignored.

Do we know why this is ?

Solution
‎04-06-2015 12:42 PM
Grand Advisor
Posts: 10,210

Re: Proc means

A variable can be either CLASS or VAR but not both. You have that variable as both and the following isn't legal syntax:

var Firm_employee_n*(mean std);

The statistics requested appear in the table statement.

Yes Proc Tabulate generates output data sets add and OUT= clause on the Proc statement.

Proc tabulate data=have out=lib.outset ;

Generate one and look at the resultant set. It isn't always the easiest format to understand because of nesting and the fact that multiple tables can be generated by one proc tabulate call.

Frequent Contributor
Posts: 131

Re: Proc means

Thanks ballardw!

The proc tabulate code you guided me to use is what I was looking for.

In my SAS Ouput window, the table dimensions appear as expected but not in ODS tagsets excel--would you have an idea why this is?

code

proc tabulate data=sample  out=tab_estabgroup (drop=_TYPE_    _PAGE_    _TABLE_);

class  estabgroup ; /*edit covars here*/

var Firm_employee_n;

table Firm_employee_n*(mean std), estabgroup ;

run;

ods ods tagsets.excelxp file='S:\Projects\DOL OSHA EVAL\Output\Pilot Study\Baseline_descriptives_b.xls' style=journal;

ods tagsets.excelxp options(sheet_name="by_estabgroup");

proc print data=tab_estabgroup noobs; run;

ods tagsets.excelxp close;

Output window

---------------------------------------------------------------------

            |                       |                    estabgroup                     |

            |                       |---------------------------------------------------|

            |                       |       vv1|    vv2|    vv3|    vv4|

            |-----------------------+------------+------------+------------+------------|

            |firm_emplo-|   Mean       |      11|       11|       11|       11|

            |yee_n      |-----------+------------+------------+------------+------------|

            |                      |Std        |               11|       11|      11|       11|

            -----------------------------------------------------------------------------

But in the Excel output the estabgroup and firm_emp... var are switched.  I'm not sure why--would you know?

estabgroupfirm_employee_n_Meanfirm_employee_n_Std
vv11111
vv21111
vv31111
vv41111
Grand Advisor
Posts: 17,338

Re: Proc means

The table via the out statement does not have the same structure as the table displayed in the output.

But why not run the proc tablet between the ODS statements?

ods ods tagsets.excelxp file='S:\Projects\DOL OSHA EVAL\Output\Pilot Study\Baseline_descriptives_b.xls' style=journal;

ods tagsets.excelxp options(sheet_name="by_estabgroup");

proc tabulate data=sample ;

class  estabgroup ; /*edit covars here*/

var Firm_employee_n;

table Firm_employee_n*(mean std), estabgroup ;

run;

ods tagsets.excelxp close;

Frequent Contributor
Posts: 131

Re: Proc means

Thanks for problem solving, Reeza!  I got the output window table to export this way

Grand Advisor
Posts: 10,210

Re: Proc means

Reeza is absolutely correct.

The dataset created by Proc Tabulate has one row for each combination of values of CLASS variables and then the statistics for each of the VAR variables for that combination. If you have a single class variable the output will actually look a lot like Proc Means/Summary with autoname option used for the variable statistics.

As I mentioned, proc tabulated generated datasets are somewhat complex and I've never seen a good example of a simple use for them. I use them but do a lot of post processing to get the results I want. The reports I use them for generally involve things that could be done with Proc Summary but sorting through the various combinations of _type_ I would needs gets to be tedious and the tabulate simplifies that somewhat.

Frequent Contributor
Posts: 131

Re: Proc means

Thanks for elaborating on how you use proc tabulate, ballardw--it's helpful advice.  Question:   My understanding was I should be able to combine frequencies and statistics in proc tabulate.  However, my current proc tab step doesn't produce a row of Ns (frequencies) for the estabgroup variable.  Would you know why?

proc tabulate data=sample  out=tab_estabgroup (drop=_TYPE_    _PAGE_    _TABLE_);

class  estabgroup ; /*edit covars here*/

var Firm_employee_n;

table Firm_employee_n*(mean std), estabgroup ;

run;

Grand Advisor
Posts: 17,338

Re: Proc means

FREQ is N in most summary tables. You haven't requested that calculation. I'm not exactly sure where you need it, so you can play around with the location of the N statistic.

proc tabulate data=sample  out=tab_estabgroup (drop=_TYPE_    _PAGE_    _TABLE_);

class  estabgroup ; /*edit covars here*/

var Firm_employee_n;

table Firm_employee_n*(mean std), estabgroup*(N) ;

run;

Frequent Contributor
Posts: 131

Re: Proc means

Estabgroup is a character variable and I'm not allowed to it in the var statement then request a statistic apparently...  Is that how you'd interpret the following LOG messages?:

374  proc tabulate data=sample  out=tab_estabgroup (drop=_TYPE_  _PAGE_  _TABLE_);

375

376  var Firm_employee_n estabgroup;

ERROR: Variable estabgroup in list does not match type prescribed for this list.

377  table Firm_employee_n*(mean std), estabgroup*N ;

378  run;

Alternative code

369  proc tabulate data=sample  out=tab_estabgroup (drop=_TYPE_  _PAGE_  _TABLE_);

370  class  estabgroup ; /*edit covars here*/

371  var Firm_employee_n;

372  table Firm_employee_n*(mean std), estabgroup*N ;

373  run;

ERROR: There are multiple statistics associated with a single table cell in the following nesting :

       firm_employee_n * Mean * estabgroup * N.

ERROR: There are multiple statistics associated with a single table cell in the following nesting :

       firm_employee_n * Std * estabgroup * N.

Grand Advisor
Posts: 17,338

Re: Proc means

Like I mentioned I'm not exactly sure where you need it.

Try:

proc tabulate data=sample  out=tab_estabgroup (drop=_TYPE_    _PAGE_    _TABLE_);

class  estabgroup ; /*edit covars here*/

var Firm_employee_n;

table Firm_employee_n*(N mean std), estabgroup;

run;

Grand Advisor
Posts: 10,210

Re: Proc means

If you go back to my first code example you will notice a little "n" appearing all by itself on the row level of the table. That is what would request the frequencies for each level of the column variable(s).

table n Firm_employee_n*( mean std), estabgroup;

Frequent Contributor
Posts: 131

Re: Proc means

Thanks ballardw!  This was what I was looking for.

I'm looking to combine the mean and standard deviation statistics into one cell:  if I wasn't using proc tab, I may have used the concatenation operator to combine a mean var and a std deviation var.  With proc tab, however, I'm not sure:  I see proc tab has an option to remove the horizantal lines in a table.  Would you know how to display mean and standard deviation in one cell, i.e.

mean

(std dev)   ?

Thank you very much.

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 15 replies
  • 542 views
  • 6 likes
  • 3 in conversation