BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
bec1
Calcite | Level 5

I'm working in SAS 9.4, and the following code is exactly what I'm trying to do, with the exception of the rate calculation because proc tabulate doesn't support mathematical operations (i.e., it runs without the inclusion of rate=num_deaths/pop*100000=''*f=8.1 which is not kosher code). My input dataset contains all combinations of demographic categories and their associated value for num_deaths and population. I need columns for years, under which would each have num_deaths, population, colpctn, and rate (num_deaths/pop*100000). The rows would contain each level of the demographic variables age, sex, and race_eth.

 

proc tabulate data=death_summary;
class year sex race_eth age;
var num_deaths pop rate;
table sex race_eth='Race/Ethnicity' age,
year=''*(num_deaths='Count'*sum=''*f=8. 
num_deaths='%'*colpctsum=''*f=8.1
pop='Population'*sum=''*f=8. rate=num_deaths/pop*100000=''*f=8.1) / misstext='0' nocellmerge; run;


I have tried proc report, but I'm failing at nesting columns, and depending on how I've structured it, it seems to prefer to summarize observations rather than num_deaths and population. I cannot calculate rate in death_summary, because I need the calculation to be dynamic after the proc tabulate groupings. I also intend to, once this runs, to add a "where" statement to filter by state, so that I can have a macro that runs it for each of the 50 US states.
Does anyone have any suggestions on how I could calculate rates while maintaining the groupings both by column years and row demographics?



1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User
data death_summary;
input year:$4. state_name:$21. SEX:$6. race_eth:$50. age:$2. num_deaths:32. pop:32. 
cause1:32. cause2:32. cause3:32. cause4:32. cause5:32. cause6:32. cause7:32. cause8:32.;
datalines;

2019 PA FEMALE A 01 2 10 2 0 0 0 0 0 0 0
2019 PA FEMALE A 02 4 9 4 0 0 0 0 0 0 0
2019 PA FEMALE A 03 0 11 0 0 0 0 0 0 0 0
2019 PA FEMALE A 04 2 6 2 0 0 0 0 0 0 0
2019 PA FEMALE A 05 0 7 0 0 0 0 0 0 0 0
2019 PA FEMALE B 01 5 5 4 0 0 0 0 0 0 0
2019 PA FEMALE B 02 11 8 0 0 11 0 0 0 0 0
2019 PA FEMALE B 03 6 12 10 0 0 0 0 6 0 0
2019 PA FEMALE B 04 2 33 1 0 0 0 0 1 0 0
2019 PA FEMALE B 05 0 5 0 0 0 0 0 0 0 0
2019 PA FEMALE W 01 2 45 2 0 0 0 0 0 0 0
2019 PA FEMALE W 02 4 6 0 0 0 0 3 0 0 0
2019 PA FEMALE W 03 6 9 0 0 0 0 6 0 0 0
2019 PA FEMALE W 04 7 12 0 0 0 0 0 7 0 0
2019 PA FEMALE W 05 1 22 1 0 0 0 0 0 0 0
2019 PA MALE A 01 6 4 0 0 0 0 6 0 0 0
2019 PA MALE A 02 10 14 6 0 0 5 0 0 0 0
2019 PA MALE A 03 7 11 0 0 0 0 7 0 0 0
2019 PA MALE A 04 4 5 0 0 0 3 0 0 0 1
2019 PA MALE A 05 0 6 0 0 0 0 0 0 0 0
2019 PA MALE B 01 10 5 4 0 0 0 0 0 0 0
2019 PA MALE B 02 16 9 0 0 1 10 0 0 0 0
2019 PA MALE B 03 12 12 7 0 0 0 0 6 0 8
2019 PA MALE B 04 1 16 0 0 0 0 0 1 0 0
2019 PA MALE B 05 3 8 0 0 0 3 0 0 0 0
2019 PA MALE W 01 0 7 0 0 0 0 0 0 0 0
2019 PA MALE W 02 5 12 0 0 0 0 1 0 0 0
2019 PA MALE W 03 0 5 0 0 0 0 0 0 0 0
2019 PA MALE W 04 0 44 0 10 0 0 0 0 0 0
2019 PA MALE W 05 9 23 9 0 0 0 0 0 0 0
;run;


proc sql;
create table report as
select 1 as id1,'SEX' as row1 length=80,1 as id2,sex as row2 length=80,year,
       sum(num_deaths) as Count,
	   100*calculated Count/(select sum(num_deaths) from death_summary where sex is not missing and year=a.year) as per label='%' format=8.1,
       sum(pop) as Population,
	   100*calculated Count/calculated Population as rate label='rate(%)' format=8.1
 from death_summary as a
  where sex is not missing and year is not missing
   group by sex,year
union all

select 2,'Race/Ethnicity',2,race_eth,year,
       sum(num_deaths) as Count,
	   100*calculated Count/(select sum(num_deaths) from death_summary where race_eth is not missing and year=a.year) as per label='%' format=8.1,
       sum(pop) as Population,
	   100*calculated Count/calculated Population as rate label='rate(%)' format=8.1
 from death_summary as a
  where race_eth is not missing and year is not missing
   group by race_eth,year
union all

select 3,'Age',3,age,year,
       sum(num_deaths) as Count,
	   100*calculated Count/(select sum(num_deaths) from death_summary where age is not missing and year=a.year) as per label='%' format=8.1,
       sum(pop) as Population,
	   100*calculated Count/calculated Population as rate label='rate(%)' format=8.1
 from death_summary as a
  where age is not missing and year is not missing
   group by age,year
;
quit;


proc report data=report nowd;
columns id1 row1 id2 row2 year,(count per population rate);
define id1/group noprint;
define row1/group noprint;
define id2/group noprint;
define row2/group '';
define year/across '';
define count/analysis sum;
define per/analysis sum;
define population/analysis sum;
define rate/analysis sum;
compute before row1/style={just=l};
line row1 $80.;
endcomp;
run;

Ksharp_0-1732758012901.png

 

View solution in original post

9 REPLIES 9
PaigeMiller
Diamond | Level 26

Show us a portion of your data. (If the data is confidential, change it so it is fake data).

 

Please provide the data as working SAS data step code (instructions and examples). Do not provide the data as screen captures or as Excel.

--
Paige Miller
bec1
Calcite | Level 5
data death_summary;
input year:$4. state_name:$21. SEX:$6. race_eth:$50. age:$2. num_deaths:32. pop:32. 
cause1:32. cause2:32. cause3:32. cause4:32. cause5:32. cause6:32. cause7:32. cause8:32.;
datalines;

2019 PA FEMALE A 01 2 10 2 0 0 0 0 0 0 0
2019 PA FEMALE A 02 4 9 4 0 0 0 0 0 0 0
2019 PA FEMALE A 03 0 11 0 0 0 0 0 0 0 0
2019 PA FEMALE A 04 2 6 2 0 0 0 0 0 0 0
2019 PA FEMALE A 05 0 7 0 0 0 0 0 0 0 0
2019 PA FEMALE B 01 5 5 4 0 0 0 0 0 0 0
2019 PA FEMALE B 02 11 8 0 0 11 0 0 0 0 0
2019 PA FEMALE B 03 6 12 10 0 0 0 0 6 0 0
2019 PA FEMALE B 04 2 33 1 0 0 0 0 1 0 0
2019 PA FEMALE B 05 0 5 0 0 0 0 0 0 0 0
2019 PA FEMALE W 01 2 45 2 0 0 0 0 0 0 0
2019 PA FEMALE W 02 4 6 0 0 0 0 3 0 0 0
2019 PA FEMALE W 03 6 9 0 0 0 0 6 0 0 0
2019 PA FEMALE W 04 7 12 0 0 0 0 0 7 0 0
2019 PA FEMALE W 05 1 22 1 0 0 0 0 0 0 0
2019 PA MALE A 01 6 4 0 0 0 0 6 0 0 0
2019 PA MALE A 02 10 14 6 0 0 5 0 0 0 0
2019 PA MALE A 03 7 11 0 0 0 0 7 0 0 0
2019 PA MALE A 04 4 5 0 0 0 3 0 0 0 1
2019 PA MALE A 05 0 6 0 0 0 0 0 0 0 0
2019 PA MALE B 01 10 5 4 0 0 0 0 0 0 0
2019 PA MALE B 02 16 9 0 0 1 10 0 0 0 0
2019 PA MALE B 03 12 12 7 0 0 0 0 6 0 8
2019 PA MALE B 04 1 16 0 0 0 0 0 1 0 0
2019 PA MALE B 05 3 8 0 0 0 3 0 0 0 0
2019 PA MALE W 01 0 7 0 0 0 0 0 0 0 0
2019 PA MALE W 02 5 12 0 0 0 0 1 0 0 0
2019 PA MALE W 03 0 5 0 0 0 0 0 0 0 0
2019 PA MALE W 04 0 44 0 10 0 0 0 0 0 0
2019 PA MALE W 05 9 23 9 0 0 0 0 0 0 0
;run;


proc tabulate data=death_summary;
     class year sex race_eth age;
     var num_deaths pop;
     table sex race_eth='Race/Ethnicity' age,
             year=''*(num_deaths='Count'*sum=''*f=8. 
                   num_deaths='%'*colpctsum=''*f=8.1
                   pop='Population'*sum=''*f=8.) 
             / misstext='0' nocellmerge;
run;

There are many more levels of states/age categories/race_eth, but this sample hopefully gives a reasonable impression. The proc tabulate is exactly what I'm going for, but I'd love there to be a column [num_deaths/pop*100000].  Thanks so much.

Quentin
Super User

You could calculate the rate in a DATA step or SQL step, then in tabulate use something like:

rate='rate'*mean=''*f=8.1

So if rate has the same value for every record within a year, then you can report the value with mean or min or max.  

 

That said, if you can provide small example dataset (as a CARDS data step) and the desired output, it will be easier for people to help.

The Boston Area SAS Users Group is hosting free webinars!
Next webinar will be in January 2025. Until then, check out our archives: https://www.basug.org/videos. And be sure to subscribe to our our email list.
Patrick
Opal | Level 21

You could try a compute block in Proc Report. 

If you want actual code then please provide SAS code that creates sample data and show us the desired output using this sample data.

 

Below what Copilot suggested when asked to convert your Proc Tabulate to Proc Report (doesn't mean that's correct code but it's a start)

proc report data=death_summary nowd;
   column sex race_eth age year num_deaths pct_deaths pop rate;
   define sex / group;
   define race_eth / group 'Race/Ethnicity';
   define age / group;
   define year / across '';
   define num_deaths / sum 'Count' format=8.;
   define pct_deaths / computed '%' format=8.1;
   define pop / sum 'Population' format=8.;
   define rate / computed format=8.1;

   compute pct_deaths;
      pct_deaths = num_deaths.sum / pop.sum * 100;
   endcomp;

   compute rate;
      rate = num_deaths.sum / pop.sum * 100000;
   endcomp;

   rbreak after / summarize;
run;
bec1
Calcite | Level 5

Thanks - I did some version of this, but it's counting rows and the number of rows is irrelevant to my dataset. I posted a sample dataset that hopefully gives a better idea. I'll keep tweaking proc report, thanks again.

Ksharp
Super User
I think PROC REPORT/PROC SQL can do that.
Post your sample data and desired output .
bec1
Calcite | Level 5

I'm considering using proc sql, but maybe just because I'm more accustomed with sql syntax. I did post some sample data in another reply, thanks for your note.

Ksharp
Super User
data death_summary;
input year:$4. state_name:$21. SEX:$6. race_eth:$50. age:$2. num_deaths:32. pop:32. 
cause1:32. cause2:32. cause3:32. cause4:32. cause5:32. cause6:32. cause7:32. cause8:32.;
datalines;

2019 PA FEMALE A 01 2 10 2 0 0 0 0 0 0 0
2019 PA FEMALE A 02 4 9 4 0 0 0 0 0 0 0
2019 PA FEMALE A 03 0 11 0 0 0 0 0 0 0 0
2019 PA FEMALE A 04 2 6 2 0 0 0 0 0 0 0
2019 PA FEMALE A 05 0 7 0 0 0 0 0 0 0 0
2019 PA FEMALE B 01 5 5 4 0 0 0 0 0 0 0
2019 PA FEMALE B 02 11 8 0 0 11 0 0 0 0 0
2019 PA FEMALE B 03 6 12 10 0 0 0 0 6 0 0
2019 PA FEMALE B 04 2 33 1 0 0 0 0 1 0 0
2019 PA FEMALE B 05 0 5 0 0 0 0 0 0 0 0
2019 PA FEMALE W 01 2 45 2 0 0 0 0 0 0 0
2019 PA FEMALE W 02 4 6 0 0 0 0 3 0 0 0
2019 PA FEMALE W 03 6 9 0 0 0 0 6 0 0 0
2019 PA FEMALE W 04 7 12 0 0 0 0 0 7 0 0
2019 PA FEMALE W 05 1 22 1 0 0 0 0 0 0 0
2019 PA MALE A 01 6 4 0 0 0 0 6 0 0 0
2019 PA MALE A 02 10 14 6 0 0 5 0 0 0 0
2019 PA MALE A 03 7 11 0 0 0 0 7 0 0 0
2019 PA MALE A 04 4 5 0 0 0 3 0 0 0 1
2019 PA MALE A 05 0 6 0 0 0 0 0 0 0 0
2019 PA MALE B 01 10 5 4 0 0 0 0 0 0 0
2019 PA MALE B 02 16 9 0 0 1 10 0 0 0 0
2019 PA MALE B 03 12 12 7 0 0 0 0 6 0 8
2019 PA MALE B 04 1 16 0 0 0 0 0 1 0 0
2019 PA MALE B 05 3 8 0 0 0 3 0 0 0 0
2019 PA MALE W 01 0 7 0 0 0 0 0 0 0 0
2019 PA MALE W 02 5 12 0 0 0 0 1 0 0 0
2019 PA MALE W 03 0 5 0 0 0 0 0 0 0 0
2019 PA MALE W 04 0 44 0 10 0 0 0 0 0 0
2019 PA MALE W 05 9 23 9 0 0 0 0 0 0 0
;run;


proc sql;
create table report as
select 1 as id1,'SEX' as row1 length=80,1 as id2,sex as row2 length=80,year,
       sum(num_deaths) as Count,
	   100*calculated Count/(select sum(num_deaths) from death_summary where sex is not missing and year=a.year) as per label='%' format=8.1,
       sum(pop) as Population,
	   100*calculated Count/calculated Population as rate label='rate(%)' format=8.1
 from death_summary as a
  where sex is not missing and year is not missing
   group by sex,year
union all

select 2,'Race/Ethnicity',2,race_eth,year,
       sum(num_deaths) as Count,
	   100*calculated Count/(select sum(num_deaths) from death_summary where race_eth is not missing and year=a.year) as per label='%' format=8.1,
       sum(pop) as Population,
	   100*calculated Count/calculated Population as rate label='rate(%)' format=8.1
 from death_summary as a
  where race_eth is not missing and year is not missing
   group by race_eth,year
union all

select 3,'Age',3,age,year,
       sum(num_deaths) as Count,
	   100*calculated Count/(select sum(num_deaths) from death_summary where age is not missing and year=a.year) as per label='%' format=8.1,
       sum(pop) as Population,
	   100*calculated Count/calculated Population as rate label='rate(%)' format=8.1
 from death_summary as a
  where age is not missing and year is not missing
   group by age,year
;
quit;


proc report data=report nowd;
columns id1 row1 id2 row2 year,(count per population rate);
define id1/group noprint;
define row1/group noprint;
define id2/group noprint;
define row2/group '';
define year/across '';
define count/analysis sum;
define per/analysis sum;
define population/analysis sum;
define rate/analysis sum;
compute before row1/style={just=l};
line row1 $80.;
endcomp;
run;

Ksharp_0-1732758012901.png

 

bec1
Calcite | Level 5
Oh my goodness, thanks so much. I absolutely never would've figured it out on my own, I was all over the place. Much appreciated.

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 9 replies
  • 1343 views
  • 3 likes
  • 5 in conversation