Hi:
Would anyone have suggestions for how I could write code to combine proc freq and proc means into one output dataset which would give me tables organized like the one below?
I'm not sure if it's possible to switch the proc freq so the N is in a row (instead of a column) and then show the mean of Covar1 for all the observations which have the valid values of a variable I'd like to cross tab each covariate with.
Var:ValidValue1 | Var:ValidValue2 | Var:ValidValue3 | Var:ValidValue4 | |
N | ||||
Covar1 | ||||
Covar2 | ||||
Covar3 | ||||
Covarn |
I appreciate questions and any guidance.
Thanks!
A variable can be either CLASS or VAR but not both. You have that variable as both and the following isn't legal syntax:
var Firm_employee_n*(mean std);
The statistics requested appear in the table statement.
Yes Proc Tabulate generates output data sets add and OUT= clause on the Proc statement.
Proc tabulate data=have out=lib.outset ;
Generate one and look at the resultant set. It isn't always the easiest format to understand because of nesting and the fact that multiple tables can be generated by one proc tabulate call.
I think you need to post a more detailed example of what you're looking for, sample input and output data in the format you'd like would be helpful.
In general, proc tabulate is a good way to combine something like that OR proc report can also do a lot of summary/counts OR you may need to generate the tables and customize the output.
I THINK you might actually be looking for a report procedure such as proc tabulate or proc report.
I'm not sure exactly what role N plays in your example. Is it the number of records that have values of the VAR?
I am not going to use as simple VAR as a variable name as it gets confusing in syntax, I am going to call it VAR1.
Proc tabulate data=have;
class VAR1;
var Covar: ; /* this shortcut list would have to be replaced with your actual variable names here and in the table below*/
table n (covar: *mean=''),
Var1
;
run;
Will give a row with the count of records for each value of Var1 and mean of the Covar variables, one per line.
NOTE: If you have many values for VAR1 and sent the output to RTF or PDF you'll likely encounter table wrapping.
Thanks ballardw! proc tabulate is looking like the better option. Do you know if proc tabulate can create an output dataset?
Also, when I use the following code
proc tabulate data=sample;
class Firm_employee_n ; /*edit covars here*/
var Firm_employee_n*(mean std);
table Firm_employee_n, estabgroup ;
run;
The log error says :
202 var Firm_employee_n*(mean std);
-
22
200
ERROR 22-322: Syntax error, expecting one of the following: a name, ;, -, /, :, _ALL_, _CHARACTER_,
_CHAR_, _NUMERIC_.
ERROR 200-322: The symbol is not recognized and will be ignored.
Do we know why this is ?
A variable can be either CLASS or VAR but not both. You have that variable as both and the following isn't legal syntax:
var Firm_employee_n*(mean std);
The statistics requested appear in the table statement.
Yes Proc Tabulate generates output data sets add and OUT= clause on the Proc statement.
Proc tabulate data=have out=lib.outset ;
Generate one and look at the resultant set. It isn't always the easiest format to understand because of nesting and the fact that multiple tables can be generated by one proc tabulate call.
Thanks ballardw!
The proc tabulate code you guided me to use is what I was looking for.
In my SAS Ouput window, the table dimensions appear as expected but not in ODS tagsets excel--would you have an idea why this is?
code
proc tabulate data=sample out=tab_estabgroup (drop=_TYPE_ _PAGE_ _TABLE_);
class estabgroup ; /*edit covars here*/
var Firm_employee_n;
table Firm_employee_n*(mean std), estabgroup ;
run;
ods ods tagsets.excelxp file='S:\Projects\DOL OSHA EVAL\Output\Pilot Study\Baseline_descriptives_b.xls' style=journal;
ods tagsets.excelxp options(sheet_name="by_estabgroup");
proc print data=tab_estabgroup noobs; run;
ods tagsets.excelxp close;
Output window
---------------------------------------------------------------------
| | estabgroup |
| |---------------------------------------------------|
| | vv1| vv2| vv3| vv4|
|-----------------------+------------+------------+------------+------------|
|firm_emplo-| Mean | 11| 11| 11| 11|
|yee_n |-----------+------------+------------+------------+------------|
| |Std | 11| 11| 11| 11|
-----------------------------------------------------------------------------
But in the Excel output the estabgroup and firm_emp... var are switched. I'm not sure why--would you know?
estabgroup | firm_employee_n_Mean | firm_employee_n_Std |
vv1 | 11 | 11 |
vv2 | 11 | 11 |
vv3 | 11 | 11 |
vv4 | 11 | 11 |
The table via the out statement does not have the same structure as the table displayed in the output.
But why not run the proc tablet between the ODS statements?
ods ods tagsets.excelxp file='S:\Projects\DOL OSHA EVAL\Output\Pilot Study\Baseline_descriptives_b.xls' style=journal;
ods tagsets.excelxp options(sheet_name="by_estabgroup");
proc tabulate data=sample ;
class estabgroup ; /*edit covars here*/
var Firm_employee_n;
table Firm_employee_n*(mean std), estabgroup ;
run;
ods tagsets.excelxp close;
Thanks for problem solving, Reeza! I got the output window table to export this way
Reeza is absolutely correct.
The dataset created by Proc Tabulate has one row for each combination of values of CLASS variables and then the statistics for each of the VAR variables for that combination. If you have a single class variable the output will actually look a lot like Proc Means/Summary with autoname option used for the variable statistics.
As I mentioned, proc tabulated generated datasets are somewhat complex and I've never seen a good example of a simple use for them. I use them but do a lot of post processing to get the results I want. The reports I use them for generally involve things that could be done with Proc Summary but sorting through the various combinations of _type_ I would needs gets to be tedious and the tabulate simplifies that somewhat.
Thanks for elaborating on how you use proc tabulate, ballardw--it's helpful advice. Question: My understanding was I should be able to combine frequencies and statistics in proc tabulate. However, my current proc tab step doesn't produce a row of Ns (frequencies) for the estabgroup variable. Would you know why?
proc tabulate data=sample out=tab_estabgroup (drop=_TYPE_ _PAGE_ _TABLE_);
class estabgroup ; /*edit covars here*/
var Firm_employee_n;
table Firm_employee_n*(mean std), estabgroup ;
run;
FREQ is N in most summary tables. You haven't requested that calculation. I'm not exactly sure where you need it, so you can play around with the location of the N statistic.
proc tabulate data=sample out=tab_estabgroup (drop=_TYPE_ _PAGE_ _TABLE_);
class estabgroup ; /*edit covars here*/
var Firm_employee_n;
table Firm_employee_n*(mean std), estabgroup*(N) ;
run;
Estabgroup is a character variable and I'm not allowed to it in the var statement then request a statistic apparently... Is that how you'd interpret the following LOG messages?:
374 proc tabulate data=sample out=tab_estabgroup (drop=_TYPE_ _PAGE_ _TABLE_);
375
376 var Firm_employee_n estabgroup;
ERROR: Variable estabgroup in list does not match type prescribed for this list.
377 table Firm_employee_n*(mean std), estabgroup*N ;
378 run;
Alternative code
369 proc tabulate data=sample out=tab_estabgroup (drop=_TYPE_ _PAGE_ _TABLE_);
370 class estabgroup ; /*edit covars here*/
371 var Firm_employee_n;
372 table Firm_employee_n*(mean std), estabgroup*N ;
373 run;
ERROR: There are multiple statistics associated with a single table cell in the following nesting :
firm_employee_n * Mean * estabgroup * N.
ERROR: There are multiple statistics associated with a single table cell in the following nesting :
firm_employee_n * Std * estabgroup * N.
Like I mentioned I'm not exactly sure where you need it.
Try:
proc tabulate data=sample out=tab_estabgroup (drop=_TYPE_ _PAGE_ _TABLE_);
class estabgroup ; /*edit covars here*/
var Firm_employee_n;
table Firm_employee_n*(N mean std), estabgroup;
run;
If you go back to my first code example you will notice a little "n" appearing all by itself on the row level of the table. That is what would request the frequencies for each level of the column variable(s).
table n Firm_employee_n*( mean std), estabgroup;
Thanks ballardw! This was what I was looking for.
I'm looking to combine the mean and standard deviation statistics into one cell: if I wasn't using proc tab, I may have used the concatenation operator to combine a mean var and a std deviation var. With proc tab, however, I'm not sure: I see proc tab has an option to remove the horizantal lines in a table. Would you know how to display mean and standard deviation in one cell, i.e.
mean
(std dev) ?
Thank you very much.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.