Dear All,
I want the below output table with count and % in separate rows instead of columns by row & columns variables. Percent (%) below count (n)
Data Have
Id | Area | Gender | Race |
1 | 1 | 1 | 2 |
2 | 2 | 2 | 3 |
3 | 3 | 1 | 2 |
4 | 2 | 2 | . |
5 | 1 | 2 | 3 |
6 | 3 | 1 | 1 |
7 | 2 | 2 | 2 |
8 | 3 | 1 | 2 |
9 | 1 | 1 | 1 |
10 | 3 | 2 | 3 |
Output want
Gender | Race | |||||
Total | 1 | 2 | 1 | 2 | 3 | |
Area total | 10 | 5 | 5 | 2 | 4 | 3 |
Area1 | 3 | 2 | 1 | 1 | 1 | 1 |
30.0 | 40.0 | 20.0 | 50.0 | 25.0 | 33.3 | |
Area2 | 3 | 0 | 3 | 0 | 1 | 1 |
30.0 | 0.0 | 60.0 | 0.0 | 25.0 | 33.3 | |
Area3 | 4 | 3 | 1 | 1 | 2 | 1 |
40.0 | 60.00 | 20.00 | 50.00 | 50.00 | 33.33 |
Thank you in advance! Akter
@Akter wrote:
Hi ballardw,
Thank you so much😊. We are very close, only thing is that I see the variable name 'Area' appears in the output in 1st column immediate below the "All Areas" but i want '1' (merged cell) with number and percent next to it, exactly same as we have for Area 2 and 3 in column 1. (I don't want 'Area' again below the "Area Total' i want only '1' like 2 and 3 under "All Areas").
In addition, if i want to add another categorical variable (i.e age group) in column what should i do. Sorry I'm still trying to learn and have not used proc tabulate yet that much.
This is very efficient way to doing without writing lots of code. I really appreciate your help. Again thank you for your patience!
Second bit first:
Tabulate Table statement has dimensions which are separated by columns.
Table <some variables/statistics> , <= one comma the first set has the row
<some other variables or statistics> <= after the comma is the columns
;
Or a "3 dimension" report:
Table <some variables> , <= one comma the first set has the "page" or separate table
<some other variables or statistics> , <= after the comma is the Row
<some other variables or statistics> <= after the comma is the colums
;
So adding another variable after Race for example, would make that another category with associated columns.
You use the * to "nest" items: Area*(n colpctn) means that each level of Area gets nested statistics.
You could do Area*Race*(n colpctn) to get the results nested in a row.
Items, variables or statistics, in parentheses are "grouped".
Experiment.
Warning: Tabulate does not allow having a statistic cross another statistic.
There are enough options that SAS has published a book, not one of the largest, on Proc Tabulate.
The documentation is your friend.
The syntax in the table of: variable=' ' is how to suppress the label of a variable or statistic, or place text in the quotes to override the default. So if you want the values of Area to appear but not the variable name/label:
Area=' '*(n colpctn)
Because of the original way your "table" for output displayed it wasn't really clean.
@Akter wrote:
Dear All,
I want the below output table with count and % in separate rows instead of columns by row & columns variables. Percent (%) below count (n)
Data Have
Id Area Gender Race 1 1 1 2 2 2 2 3 3 3 1 2 4 2 2 . 5 1 2 3 6 3 1 1 7 2 2 2 8 3 1 2 9 1 1 1 10 3 2 3
Output want
Gender Race Total 1 2 1 2 3 Area total 10 5 5 2 4 3 Area1 3 2 1 1 1 1 30.0 40.0 20.0 50.0 25.0 33.3 Area2 3 0 3 0 1 1 30.0 0.0 60.0 0.0 25.0 33.3 Area3 4 3 1 1 2 1 40.0 60.00 20.00 50.00 50.00 33.33 Thank you in advance! Akter
Many users here don't want to download Excel files because of virus potential, others have such things blocked by security software. Also if you give us Excel we have to create a SAS data set and due to the non-existent constraints on Excel data cells the result we end up with may not have variables of the same type (numeric or character) and even values.
Please pay attention to the appearance of your desired output. The message windows on this forum will reformat stuff and the "as shown" version is almost certainly not what you "want".
Without a LOT of data manipulation this is the closest I can get to your "want" as I think it was meant:
Data Have; input Id Area Gender Race; datalines; 1 1 1 2 2 2 2 3 3 3 1 2 4 2 2 . 5 1 2 3 6 3 1 1 7 2 2 2 8 3 1 2 9 1 1 1 10 3 2 3 ; proc tabulate data=have; class area gender race/missing; table all='All areas' *n=' ' area*(n=' ' colpctn=' '), (all='Total' gender) race /row=float misstext='0' ; run;
Proc tabulate, while very flexible for somethings, will either not include any data that has missing for a CLASS variable or use the missing option to include it but a row/column heading with the missing value will appear.
Hi ballardw,
Thank you so much😊. We are very close, only thing is that I see the variable name 'Area' appears in the output in 1st column immediate below the "All Areas" but i want '1' (merged cell) with number and percent next to it, exactly same as we have for Area 2 and 3 in column 1. (I don't want 'Area' again below the "Area Total' i want only '1' like 2 and 3 under "All Areas").
In addition, if i want to add another categorical variable (i.e age group) in column what should i do. Sorry I'm still trying to learn and have not used proc tabulate yet that much.
This is very efficient way to doing without writing lots of code. I really appreciate your help. Again thank you for your patience!
@Akter wrote:
Hi ballardw,
Thank you so much😊. We are very close, only thing is that I see the variable name 'Area' appears in the output in 1st column immediate below the "All Areas" but i want '1' (merged cell) with number and percent next to it, exactly same as we have for Area 2 and 3 in column 1. (I don't want 'Area' again below the "Area Total' i want only '1' like 2 and 3 under "All Areas").
In addition, if i want to add another categorical variable (i.e age group) in column what should i do. Sorry I'm still trying to learn and have not used proc tabulate yet that much.
This is very efficient way to doing without writing lots of code. I really appreciate your help. Again thank you for your patience!
Second bit first:
Tabulate Table statement has dimensions which are separated by columns.
Table <some variables/statistics> , <= one comma the first set has the row
<some other variables or statistics> <= after the comma is the columns
;
Or a "3 dimension" report:
Table <some variables> , <= one comma the first set has the "page" or separate table
<some other variables or statistics> , <= after the comma is the Row
<some other variables or statistics> <= after the comma is the colums
;
So adding another variable after Race for example, would make that another category with associated columns.
You use the * to "nest" items: Area*(n colpctn) means that each level of Area gets nested statistics.
You could do Area*Race*(n colpctn) to get the results nested in a row.
Items, variables or statistics, in parentheses are "grouped".
Experiment.
Warning: Tabulate does not allow having a statistic cross another statistic.
There are enough options that SAS has published a book, not one of the largest, on Proc Tabulate.
The documentation is your friend.
The syntax in the table of: variable=' ' is how to suppress the label of a variable or statistic, or place text in the quotes to override the default. So if you want the values of Area to appear but not the variable name/label:
Area=' '*(n colpctn)
Because of the original way your "table" for output displayed it wasn't really clean.
Thank you ballarw for explaining in details, very helpful. I really appreciate your time and effort. 😊
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.