Hello all,
Is there a way to generate means and frequency procedure for age group stratified by gender at the same time. I am using the program below. Its only generating the 'n' for the age category. I tried to include 'colpctn' but still could'nt get the required output.
data have;
input id sex$ race$ age agecat$;
datalines;
01 F W 66 60-70
02 F B 87 80-90
03 M W 63 60-70
04 M A 79 70-80
05 M W 75 70-80
06 M U 72 70-80
07 F W 66 60-70
08 F B 87 80-90
09 M W 63 60-70
10 M A 79 70-80
11 M W 75 70-80
12 M U 72 70-80
;
proc format;
value age 18-54= '18-54 years'
55-69= '55-69 years'
70-high= '70+ years';
value $ sex
M = 'Males'
F = 'Females';
run;
proc tabulate data=have;
class sex agecat;
var age;
format age age.;
table
(agecat= 'Age Category' all) *[style=[just=C cellwidth=95 ] ]
age= 'Age'*
(n = 'N' * [style=[just=C cellwidth=95 ] ]
Mean = 'Mean' * [style=[just=C cellwidth=95 ] f=8.1]
Std = 'SD' * [style=[just=C cellwidth=95 ] ]
Median = 'Med' * [style=[just=C cellwidth=95 ] f=8.1]
Min = 'Min' * [style=[just=C cellwidth=95 ] ]
Max = 'Max' * [style=[just=C cellwidth=95 ] ]),
(sex= '' all= 'All');
format sex $sex. age age.;
run;
I am looking for the output as below:
Females | Males | All | |||||
age cat | N | Column percent | N | Column percent | N | Column percent | |
60-70 | 2 | 50 | 2 | 25 | 4 | 33.3 | |
70-80 | . | 6 | 75 | 6 | 50.0 | ||
80-90 | 2 | 50 | . | 2 | 16.7 | ||
All | 4 | 100 | 8 | 100 | 12 | 100.0 | |
Age | N | 4 | 8 | 12 | |||
Mean | 76.5 | 72.3 | 73.7 | ||||
SD | 12.12 | 6.3 | 8.35 | ||||
Med | 76.5 | 73.5 | 73.5 | ||||
Min | 66 | 63 | 63 | ||||
Max | 87 | 79 | 87 |
Proc Tabulate is not going to allow crossing statistics. In your example the "colpctn" would be attempting to cross with the statistics mean, std etc.
You can get close with two table statements.
proc tabulate data=have; class sex agecat; var age; format age age.; table (agecat= 'Age Category' all), (sex= '' all= 'All')*(n colpctn) /misstext=' ' ; table age= 'Age'* (n = 'N' * [style=[just=C cellwidth=95 ] ] Mean = 'Mean' * [style=[just=C cellwidth=95 ] f=8.1] Std = 'SD' * [style=[just=C cellwidth=95 ] ] Median = 'Med' * [style=[just=C cellwidth=95 ] f=8.1] Min = 'Min' * [style=[just=C cellwidth=95 ] ] Max = 'Max' * [style=[just=C cellwidth=95 ] ]), (sex= '' all= 'All') ; format sex $sex. age age.; run;
If you absolutely must have a single table with that appearance I am afraid that you will 1) have to presummarize all the data and 2) look into either a data step and severe recoding and Proc Print or Report or the Report Writing Interface (a different use of the data step).
Note that the above proc tabulate code would create the summarized data with the use of an OUT=<datasetname> option on the proc tabulate statement. Though it takes awhile to work through the logic of the dataset structure to identify the bits you want. You would be creating your own row heading variables conditionally based on the table (a variable in the output).
Proc Tabulate is not going to allow crossing statistics. In your example the "colpctn" would be attempting to cross with the statistics mean, std etc.
You can get close with two table statements.
proc tabulate data=have; class sex agecat; var age; format age age.; table (agecat= 'Age Category' all), (sex= '' all= 'All')*(n colpctn) /misstext=' ' ; table age= 'Age'* (n = 'N' * [style=[just=C cellwidth=95 ] ] Mean = 'Mean' * [style=[just=C cellwidth=95 ] f=8.1] Std = 'SD' * [style=[just=C cellwidth=95 ] ] Median = 'Med' * [style=[just=C cellwidth=95 ] f=8.1] Min = 'Min' * [style=[just=C cellwidth=95 ] ] Max = 'Max' * [style=[just=C cellwidth=95 ] ]), (sex= '' all= 'All') ; format sex $sex. age age.; run;
If you absolutely must have a single table with that appearance I am afraid that you will 1) have to presummarize all the data and 2) look into either a data step and severe recoding and Proc Print or Report or the Report Writing Interface (a different use of the data step).
Note that the above proc tabulate code would create the summarized data with the use of an OUT=<datasetname> option on the proc tabulate statement. Though it takes awhile to work through the logic of the dataset structure to identify the bits you want. You would be creating your own row heading variables conditionally based on the table (a variable in the output).
Thank you, Ballardw. Your program was helpful. I wonder if there is a way to generate cross statistics without using proc tabulate.
data have;
input id sex$ race$ age agecat$;
datalines;
01 F W 66 60-70
02 F B 87 80-90
03 M W 63 60-70
04 M A 79 70-80
05 M W 75 70-80
06 M U 72 70-80
07 F W 66 60-70
08 F B 87 80-90
09 M W 63 60-70
10 M A 79 70-80
11 M W 75 70-80
12 M U 72 70-80
;
proc format;
value age 18-54= '18-54 years'
55-69= '55-69 years'
70-high= '70+ years';
value $ sex
M = 'Males'
F = 'Females';
run;
The output I would like to have:
Descriptives | Overall Cohort | Males | Females | |||
Age | ||||||
Mean | Standard Deviation | Mean | Standard Deviation | Mean | Standard Deviation | |
Minimum | Maximum | Minimum | Maximum | Minimum | Maximum | |
25% IQR | 75% IQR | 25% IQR | 75% IQR | 25% IQR | 75% IQR | |
Age Group | ||||||
60-70 years | N | % | N | % | N | % |
70-80 years | N | % | N | % | N | % |
80-90 years | N | % | N | % | N | % |
Yes, I've done that. And generated printer markup code to make "table" appearances that weren't available in procedures (circa 1987)
The Report Writing Interface for the data step likely could do this into a single table. You would have to presummarize all of the data and calculate the rates you want then use the interface to display the values in the order you want.
Considering the amount of work I know that would be I can use the proc tabulate approach and remove a blank line unless there are MANY of these tables to generate.
Sometimes it is worth talking to the consumer of the report about how fanatical the results have to match a proposed layout.
If I show someone the two table proc tabulate approach and demonstrate we can make multiple sets of these easily (program time per set) and then document the approaches needed to exactly match a layout and discuss programmer time*pay and response time it might be amazing how quickly "pretty close" gets excepted. If the user is willing to pay the extra time and programming costs for "exact" then happily bill them for the time.
I worked in one shop that had a customer that wanted some report tables in a specific layout. They had developed a process to create them users features of their then current reporting software. The next "upgrade" removed some of the features used to create the customer's table layout. They maintained one computer with the old version of the software just to create that report appearance. Then a Microsoft operating system upgrade came along an the old software no longer worked. After a discussion with the client a new table layout that closely resembled the old one was accepted. Which could have saved LOTS of billable hours over the course of several years if the proposal had been discussed earlier. (Actual the bills wouldn't have gone down so the effect would have been more profit on our side. 🙂
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.