- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi,
I am new to formatting in SAS and I am struggling to generate some cross tables using PROC REPORT and macro.
The data looks like this:
ID | var1 | var2 |
1 | 1 | 0 |
2 | 0 | 1 |
3 | 1 | 1 |
4 | 1 | 1 |
5 | 1 | 1 |
6 | 0 | 0 |
7 | 1 | 1 |
8 | 1 | 1 |
9 | 0 | 1 |
10 | 1 | 1 |
The cross table I want is something like:
| Var1 |
| |
Var2 | 1 [N (%)] | 0 [N %] | Total |
1 | xx (a%) | xx (b%) | n (100%) |
0 | xx (c%) | xx (d%) | n (100%) |
Total | n (%) | n (%) | N |
I tried to mimic some code as follow but can not get the percentage in the parentheses, and can not get the percentage by row (a%+b%=100%; c%+d%=100%):
%MACRO report2(dataset, var1, var2);
data mydat;
set &dataset;
keep id &var1 &var2;
run;
ods excel options(sheet_name="&var1.|&var2.");
proc report data = mydat missing nowd;
column &var2 &var1, (n pctn) ('Total' n);
define &var2 /group "&var2";
define &var1 /across order = internal "&var1";
define n/format =8. "N";
define pctn / "%" format = percent8.1;
rbreak after / summarize;
compute after;
&var2 = "Total";
endcomp;
run;
%MEND report2;
There are some trickier problem, but I am starting with this simpler one. Hope someone can help. Thanks a lot!
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi:
I assume you want something like THIS:
Where the number in each cell is the percent of the ROW total, not the percent of the Grand Total. PROC TABULATE has special percent keywords for this (ROWPCTN and ROWPCTSUM, along with COLPCTN and COLPCTSUM) but PROC REPORT does not have those keywords available.
Instead, if you want to calculate a ROW percent (so that each row adds up to 100% than you need to calculate it yourself. I used an ACROSS item to show a separate column for F vs M.
The COMPUTE block in this case, is more complex, since I could not use the simple PCTN for the calculation. But first, I had to make some fake data since I only wanted to show 2 rows plus the total row on the report:
data fakedata;
length grpvar $10;
set sashelp.class;
if age le 13 then grpvar = 'Group 1';
else grpvar = 'Group 2';
run;
Then I had to create some calculated variables to hold the totals, but, due to the use of ACROSS items, the variables needed to be referred to as their absolute column numbers:
So when you see references to _c3_, _c4_ and _c5_, etc, those are needed any time you use an ACROSS item and you need to calculate a value that is also nested under an ACROSS item. In this case, the SEX variable for F and M was defined as an ACROSS item. That means the STOT value and the PCT value for F needs to be distinguished from the STOT and the PCT value for M in order to calculate a separate CALCCOL value for each unique group.
Here's the PROC REPORT code to generate the above report and the report that I initially showed:
proc report data=fakedata;
title '3a) Only showing the calculated column and total';
column ('Group' grpvar) n=rtot sex,(n=stot calcpct calccol) ('Total' totcol);
define grpvar / group 'Grpvar';
define rtot / 'rtot' /* noprint */;
define sex / across "With computed % and Row %";
define stot / 'stot' /* noprint */;
define calcpct / computed 'Pct' f=percent7.1 /* noprint */;
define calccol / computed 'CALCCOL'
style(column)={just=r};
define totcol / computed 'TOTCOL'
style(column)={just=r};
compute calcpct;
_c4_ = divide(_c3_ , rtot);
_c7_ = divide(_c6_ , rtot);
endcomp;
compute calccol / character length = 15;
if _c3_ gt . then
_c5_ = put(_c3_,2.0)||' ('||right(put(_c4_,percent7.1))||')';
if _c6_ gt . then
_c8_ = put(_c6_,2.0)||' ('||right(put(_c7_,percent7.1))||')';
endcomp;
compute totcol / character length = 15;
tmp = divide(rtot, rtot);
totcol = put(rtot,2.0)||' ('||right(put(tmp,percent7.1))||')';
endcomp;
rbreak after / summarize;
compute after;
grpvar = 'Total';
endcomp;
run;
proc report data=fakedata;
title '3b) Only showing the calculated column and total';
column ('Group' grpvar) n=rtot sex,(n=stot calcpct calccol) ('Total' totcol);
define grpvar / group ' ';
define rtot / 'rtot' noprint ;
define sex / across "With computed % and Row %";
define stot / 'stot' noprint ;
define calcpct / computed 'Pct' f=percent7.1 noprint ;
define calccol / computed 'CALCCOL'
style(column)={just=r};
define totcol / computed ' '
style(column)={just=r};
compute calcpct;
_c4_ = divide(_c3_ , rtot);
_c7_ = divide(_c6_ , rtot);
endcomp;
compute calccol / character length = 15;
if _c3_ gt . then
_c5_ = put(_c3_,2.0)||' ('||right(put(_c4_,percent7.1))||')';
if _c6_ gt . then
_c8_ = put(_c6_,2.0)||' ('||right(put(_c7_,percent7.1))||')';
endcomp;
compute totcol / character length = 15;
tmp = divide(rtot, rtot);
totcol = put(rtot,2.0)||' ('||right(put(tmp,percent7.1))||')';
endcomp;
rbreak after / summarize;
compute after;
grpvar = 'Total';
endcomp;
run;
Hope this helps,
Cynthia
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi:
I do not see how the data you show can generate the report that you want. PROC REPORT is able to produce this type of demographic report. I have 3 examples of a report like this in my 2008 paper on Creating Complex Reports. Look at this previous forum posting https://communities.sas.com/t5/ODS-and-Base-Reporting/Proc-Report-Create-a-custom-table/td-p/486398 for the paper links.
Typically with this kind of demographic report, you get each statistic, such as N and PCTN or MIN-MAX in separate cells and then you create a character string that does things like adding parentheses or punctuation to make the string you want to show on your report.
Hope this helps,
Cynthia
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thanks for your reply. I have updated the report table, now it directly linked to the data.
Thanks for the paper too, I think I read your paper before and got a bit lost. I am going to look into it again.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi:
Here's an simpler example than the one in the paper:
Hope this helps,
Cynthia
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thanks Cynthia, I find the compute statement you showed really helps putting the calculated percentage in the parentheses.
But I am still struggling with how to get the percentage by row (a%+b%=100% in the example of my post), I can only get the percentage by column using the code I have above. Could you please help me with that?
Many thanks!
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi:
I assume you want something like THIS:
Where the number in each cell is the percent of the ROW total, not the percent of the Grand Total. PROC TABULATE has special percent keywords for this (ROWPCTN and ROWPCTSUM, along with COLPCTN and COLPCTSUM) but PROC REPORT does not have those keywords available.
Instead, if you want to calculate a ROW percent (so that each row adds up to 100% than you need to calculate it yourself. I used an ACROSS item to show a separate column for F vs M.
The COMPUTE block in this case, is more complex, since I could not use the simple PCTN for the calculation. But first, I had to make some fake data since I only wanted to show 2 rows plus the total row on the report:
data fakedata;
length grpvar $10;
set sashelp.class;
if age le 13 then grpvar = 'Group 1';
else grpvar = 'Group 2';
run;
Then I had to create some calculated variables to hold the totals, but, due to the use of ACROSS items, the variables needed to be referred to as their absolute column numbers:
So when you see references to _c3_, _c4_ and _c5_, etc, those are needed any time you use an ACROSS item and you need to calculate a value that is also nested under an ACROSS item. In this case, the SEX variable for F and M was defined as an ACROSS item. That means the STOT value and the PCT value for F needs to be distinguished from the STOT and the PCT value for M in order to calculate a separate CALCCOL value for each unique group.
Here's the PROC REPORT code to generate the above report and the report that I initially showed:
proc report data=fakedata;
title '3a) Only showing the calculated column and total';
column ('Group' grpvar) n=rtot sex,(n=stot calcpct calccol) ('Total' totcol);
define grpvar / group 'Grpvar';
define rtot / 'rtot' /* noprint */;
define sex / across "With computed % and Row %";
define stot / 'stot' /* noprint */;
define calcpct / computed 'Pct' f=percent7.1 /* noprint */;
define calccol / computed 'CALCCOL'
style(column)={just=r};
define totcol / computed 'TOTCOL'
style(column)={just=r};
compute calcpct;
_c4_ = divide(_c3_ , rtot);
_c7_ = divide(_c6_ , rtot);
endcomp;
compute calccol / character length = 15;
if _c3_ gt . then
_c5_ = put(_c3_,2.0)||' ('||right(put(_c4_,percent7.1))||')';
if _c6_ gt . then
_c8_ = put(_c6_,2.0)||' ('||right(put(_c7_,percent7.1))||')';
endcomp;
compute totcol / character length = 15;
tmp = divide(rtot, rtot);
totcol = put(rtot,2.0)||' ('||right(put(tmp,percent7.1))||')';
endcomp;
rbreak after / summarize;
compute after;
grpvar = 'Total';
endcomp;
run;
proc report data=fakedata;
title '3b) Only showing the calculated column and total';
column ('Group' grpvar) n=rtot sex,(n=stot calcpct calccol) ('Total' totcol);
define grpvar / group ' ';
define rtot / 'rtot' noprint ;
define sex / across "With computed % and Row %";
define stot / 'stot' noprint ;
define calcpct / computed 'Pct' f=percent7.1 noprint ;
define calccol / computed 'CALCCOL'
style(column)={just=r};
define totcol / computed ' '
style(column)={just=r};
compute calcpct;
_c4_ = divide(_c3_ , rtot);
_c7_ = divide(_c6_ , rtot);
endcomp;
compute calccol / character length = 15;
if _c3_ gt . then
_c5_ = put(_c3_,2.0)||' ('||right(put(_c4_,percent7.1))||')';
if _c6_ gt . then
_c8_ = put(_c6_,2.0)||' ('||right(put(_c7_,percent7.1))||')';
endcomp;
compute totcol / character length = 15;
tmp = divide(rtot, rtot);
totcol = put(rtot,2.0)||' ('||right(put(tmp,percent7.1))||')';
endcomp;
rbreak after / summarize;
compute after;
grpvar = 'Total';
endcomp;
run;
Hope this helps,
Cynthia