Hi,
I have a requirement to produce a series of multi-way contingency tables (2-3-4 maybe even 5 way if it is interpretable) to introduce a possible logistic model with many more than 5 variables. Since this will eventually take the form of a document to our clients, it will likely be done through ods pdf, rtf or html so I am somewhat flexible to that extent although I would like it to be as automated as possible so that any change in the data would require only to rerun the program and maybe to very minimal manual work beyond this. I'm basically looking at the following:
data have;
input var1 $1. var2 $1. outcome $1.;
datalines;
0,0,0
0,0,1
0,1,0
0,1,1
1,0,0
1,0,1
1,1,0
1,1,1
;
run; /* Where each record would repeat a different frequency number of times and could possibly have more than just 2 classes but that is irrelevant to explaining what I wish to achieve */
ods pdf file="c:\temp.pdf";
proc freq data=have;
table var1*var2*outcome / LIST;
run;
ods pdf close;
The output generated is of a form similar to the table below.
var1 | var2 | outcome=0 | outcome=1 |
var1=0 | var2=0 | 1 | 1 |
var1=0 | var2=1 | 1 | 1 |
var1=1 | var2=0 | 1 | 1 |
var1=1 | var2=1 | 1 | 1 |
However, in litterature and which is also what I want to achieve, the outer variables (left most columns) values are normally only repeated when they change and not for every possible row. So my desired output would be something similar to
var1 | var2 | outcome=0 | outcome=1 |
var1=0 | var2=0 | 1 | 1 |
var2=1 | 1 | 1 | |
var1=1 | var2=0 | 1 | 1 |
var2=1 | 1 | 1 |
I am looking for a way to achieve something similar in SAS. I'm absolutely not familiar with proc template and don't know if any other procedure (proc surveyfreq or proc tabulate maybe?) could achieve my desired output without playing with proc template. Nontheless, I am looking for the community help at the bare minimum to know if it is worth investing the time to learn proc tabulate for crosstabs (aka is it possible to achieve this output via proc template on crosstabs?). I believe I could achieve the desired output with heavy dataset manipulations and a proc print but it would be atrocious to double transpose the output from proc freq and blank out fields and whatnot for a job that may very well produce about a dozen 2-way tables, 66 3-way tables and (4 in 12) 4-way tables etc.
Is it possible to do the above manipulation to the basic crosstabs output with proc template?
Are there alternative procedures that could achieve the desired output?
Are there additionnal options or suboptions on the table / LIST statement that could result in the same formating?
Anything else that does not involve parsing the tables and using DDE to blank the appropriate fields?
Thanks
Vince
Your example output, if I understand your question, may be generated with proc tabulate depending on how important it is to display "var1=" with the value. Tabulate will allow multiple levels of nesting if you need more.
proc tabulate data=have;
class var1 var2 outcome;
table var1*var2, outcome*n=''*f=f6.0;
run;
If you actually need to display the text var1= with the value then custom picture formats might come into play.
I forgot to mention, if it is possible to achieve with proc template on crosstabs, if anyone could suggest good documentation with a detailed tutorial, that would be much appreciated. I can have statscan library order the book if we don't have it on site especially since ultimately, a template code that achieves this result could very well be shared on our network since "sadly" contingency tables are extremely frequent in our clients' requirements.
Apparently I can't edit the original post. Remove the , , delimiters for the data have; step, I changed my mind in syntax half way through. Sorry.
Your example output, if I understand your question, may be generated with proc tabulate depending on how important it is to display "var1=" with the value. Tabulate will allow multiple levels of nesting if you need more.
proc tabulate data=have;
class var1 var2 outcome;
table var1*var2, outcome*n=''*f=f6.0;
run;
If you actually need to display the text var1= with the value then custom picture formats might come into play.
Hi Ballardw,
Indeed it did what I wanted to achieve.
May I ask what exactly the n=''*f=f6.0 does? I mean I understand that it turns into the frequency count but I'm trying to figure out how/what/when/where this syntax comes from and I haven't found it yet reading through proc tabulate documentation seeking for it.
Many thanks,
Vince
N is, as you noted, the count statistic, ='text' can be used to customize the label from the default of "n". With no text it suppress the statistic column heading. The *f= is used to assign a display format for the preceding statistic. By default tabulate will show 2 decimals in formatted statistics and since n is asking for a count I generally override that behavior to show no decimals. In this case the format F6.0 was a guess for enough columns to show an expected range of counts.
Excellent thank you very much, very good explanations.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.