Hi everyone,
I explain what I want to do using the simplified table below. I would like to total the number of patients for each age group across all sites for each category of region. Taking table B from table A as an example. Any guidance on how to do it would be greatly appreciated. Thank you.
A
A: | |||
SiteID | Age_group | Region | Num_Patients |
Site01 | 18_25 | NA | 500 |
Site01 | 18_25 | MW | 703 |
Site01 | 18_25 | Miss | 500 |
Site01 | 18_25 | NE | 507 |
Site01 | 18_25 | Other | 500 |
Site01 | 18_25 | S | 1239 |
Site01 | 18_25 | W | 541 |
Site01 | 26_44 | NA | 502 |
Site01 | 26_44 | MW | 1087 |
Site01 | 26_44 | Miss | 500 |
Site01 | 26_44 | NE | 604 |
Site01 | 26_44 | Other | 500 |
Site01 | 26_44 | S | 2763 |
Site01 | 26_44 | W | 685 |
Site02 | 18_25 | NA | 533 |
Site02 | 18_25 | MW | 4278 |
Site02 | 18_25 | Miss | 500 |
Site02 | 18_25 | NE | 1964 |
Site02 | 18_25 | Other | 500 |
Site02 | 18_25 | S | 7340 |
Site02 | 18_25 | W | 3242 |
Site02 | 26_44 | NA | 566 |
Site02 | 26_44 | MW | 5920 |
Site02 | 26_44 | Miss | 500 |
Site02 | 26_44 | NE | 2512 |
Site02 | 26_44 | Other | 501 |
Site02 | 26_44 | S | 11502 |
Site02 | 26_44 | W | 5065 |
B
B: | |||
SiteID | Age_group | Region | Num_Patients |
ALL | 18_25 | NA | 1033 |
ALL | 18_25 | MW | 4981 |
ALL | 18_25 | Miss | 1000 |
ALL | 18_25 | NE | 2471 |
ALL | 18_25 | Other | 1000 |
ALL | 18_25 | S | 8579 |
ALL | 18_25 | W | 3783 |
ALL | 26_44 | NA | 1068 |
ALL | 26_44 | MW | 7007 |
ALL | 26_44 | Miss | 1000 |
ALL | 26_44 | NE | 3116 |
ALL | 26_44 | Other | 1001 |
ALL | 26_44 | S | 14265 |
ALL | 26_44 | W | 5750 |
data have;
infile datalines dlm=',';
input SiteID $ Age_group $ Region $ Num_Patients;
datalines;
Site01,18_25,NA,500
Site01,18_25,MW,703
Site01,18_25,Miss,500
Site01,18_25,NE,507
Site01,18_25,Other,500
Site01,18_25,S,1239
Site01,18_25,W,541
Site01,26_44,NA,502
Site01,26_44,MW,1087
Site01,26_44,Miss,500
Site01,26_44,NE,604
Site01,26_44,Other,500
Site01,26_44,S,2763
Site01,26_44,W,685
Site02,18_25,NA,533
Site02,18_25,MW,4278
Site02,18_25,Miss,500
Site02,18_25,NE,1964
Site02,18_25,Other,500
Site02,18_25,S,7340
Site02,18_25,W,3242
Site02,26_44,NA,566
Site02,26_44,MW,5920
Site02,26_44,Miss,500
Site02,26_44,NE,2512
Site02,26_44,Other,501
Site02,26_44,S,11502
Site02,26_44,W,5065
;
run;
proc sql;
create table want as
select "ALL" as siteid
,age_group
,region
,sum(Num_Patients) as Num_Patients
from have
group by age_group,region
order by age_group;
quit;
Can you try this?
Try PROC SUMMARY, PROC FREQ, PROC REPORT and PROC TABULATE. All of those should be able to make this report.
An example of a report with the desired elements though different appearance using Proc Tabulate:
proc tabulate data=have out=tabout; class SiteID Age_group Region ; var Num_Patients; table (All='All Sites' SiteId)*age_group*region, Num_patients*sum='' ; run;
Tabulate doesn't repeat row or column heading that are groups of related values.
The ALL instruction is the way to get a group total across other values of the variable. I tend to use something a bit more than just "All" for labels so I can tell all of what...
Note that proc tabulate can produce multiple tables in one procedure call such as:
proc tabulate data=have ; class SiteID Age_group Region ; var Num_Patients; table (All='All Sites' SiteId)*age_group*region, Num_patients*sum='' ; table (All='All Ages' age_group )*SiteId*region, Num_patients*sum='' ; run;
which is one reason I'm a bit more verbose with the ALL label.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.