Solved: Re: Is there an alternative to proc tabulate for inclusion of rate cal...

bec1 · Posted 11-26-2024 03:15 PM

I'm working in SAS 9.4, and the following code is exactly what I'm trying to do, with the exception of the rate calculation because proc tabulate doesn't support mathematical operations (i.e., it runs without the inclusion of rate=num_deaths/pop*100000=''*f=8.1 which is not kosher code). My input dataset contains all combinations of demographic categories and their associated value for num_deaths and population. I need columns for years, under which would each have num_deaths, population, colpctn, and rate (num_deaths/pop*100000). The rows would contain each level of the demographic variables age, sex, and race_eth.

proc tabulate data=death_summary;
class year sex race_eth age;
var num_deaths pop rate;
table sex race_eth='Race/Ethnicity' age,
year=''*(num_deaths='Count'*sum=''*f=8. 
         num_deaths='%'*colpctsum=''*f=8.1 
         pop='Population'*sum=''*f=8.
         rate=num_deaths/pop*100000=''*f=8.1) / misstext='0' nocellmerge;
run;

I have tried proc report, but I'm failing at nesting columns, and depending on how I've structured it, it seems to prefer to summarize observations rather than num_deaths and population. I cannot calculate rate in death_summary, because I need the calculation to be dynamic after the proc tabulate groupings. I also intend to, once this runs, to add a "where" statement to filter by state, so that I can have a macro that runs it for each of the 50 US states.
Does anyone have any suggestions on how I could calculate rates while maintaining the groupings both by column years and row demographics?

Ksharp · Posted 11-27-2024 08:40 PM

data death_summary;
input year:$4. state_name:$21. SEX:$6. race_eth:$50. age:$2. num_deaths:32. pop:32. 
cause1:32. cause2:32. cause3:32. cause4:32. cause5:32. cause6:32. cause7:32. cause8:32.;
datalines;

2019 PA FEMALE A 01 2 10 2 0 0 0 0 0 0 0
2019 PA FEMALE A 02 4 9 4 0 0 0 0 0 0 0
2019 PA FEMALE A 03 0 11 0 0 0 0 0 0 0 0
2019 PA FEMALE A 04 2 6 2 0 0 0 0 0 0 0
2019 PA FEMALE A 05 0 7 0 0 0 0 0 0 0 0
2019 PA FEMALE B 01 5 5 4 0 0 0 0 0 0 0
2019 PA FEMALE B 02 11 8 0 0 11 0 0 0 0 0
2019 PA FEMALE B 03 6 12 10 0 0 0 0 6 0 0
2019 PA FEMALE B 04 2 33 1 0 0 0 0 1 0 0
2019 PA FEMALE B 05 0 5 0 0 0 0 0 0 0 0
2019 PA FEMALE W 01 2 45 2 0 0 0 0 0 0 0
2019 PA FEMALE W 02 4 6 0 0 0 0 3 0 0 0
2019 PA FEMALE W 03 6 9 0 0 0 0 6 0 0 0
2019 PA FEMALE W 04 7 12 0 0 0 0 0 7 0 0
2019 PA FEMALE W 05 1 22 1 0 0 0 0 0 0 0
2019 PA MALE A 01 6 4 0 0 0 0 6 0 0 0
2019 PA MALE A 02 10 14 6 0 0 5 0 0 0 0
2019 PA MALE A 03 7 11 0 0 0 0 7 0 0 0
2019 PA MALE A 04 4 5 0 0 0 3 0 0 0 1
2019 PA MALE A 05 0 6 0 0 0 0 0 0 0 0
2019 PA MALE B 01 10 5 4 0 0 0 0 0 0 0
2019 PA MALE B 02 16 9 0 0 1 10 0 0 0 0
2019 PA MALE B 03 12 12 7 0 0 0 0 6 0 8
2019 PA MALE B 04 1 16 0 0 0 0 0 1 0 0
2019 PA MALE B 05 3 8 0 0 0 3 0 0 0 0
2019 PA MALE W 01 0 7 0 0 0 0 0 0 0 0
2019 PA MALE W 02 5 12 0 0 0 0 1 0 0 0
2019 PA MALE W 03 0 5 0 0 0 0 0 0 0 0
2019 PA MALE W 04 0 44 0 10 0 0 0 0 0 0
2019 PA MALE W 05 9 23 9 0 0 0 0 0 0 0
;run;


proc sql;
create table report as
select 1 as id1,'SEX' as row1 length=80,1 as id2,sex as row2 length=80,year,
       sum(num_deaths) as Count,
	   100*calculated Count/(select sum(num_deaths) from death_summary where sex is not missing and year=a.year) as per label='%' format=8.1,
       sum(pop) as Population,
	   100*calculated Count/calculated Population as rate label='rate(%)' format=8.1
 from death_summary as a
  where sex is not missing and year is not missing
   group by sex,year
union all

select 2,'Race/Ethnicity',2,race_eth,year,
       sum(num_deaths) as Count,
	   100*calculated Count/(select sum(num_deaths) from death_summary where race_eth is not missing and year=a.year) as per label='%' format=8.1,
       sum(pop) as Population,
	   100*calculated Count/calculated Population as rate label='rate(%)' format=8.1
 from death_summary as a
  where race_eth is not missing and year is not missing
   group by race_eth,year
union all

select 3,'Age',3,age,year,
       sum(num_deaths) as Count,
	   100*calculated Count/(select sum(num_deaths) from death_summary where age is not missing and year=a.year) as per label='%' format=8.1,
       sum(pop) as Population,
	   100*calculated Count/calculated Population as rate label='rate(%)' format=8.1
 from death_summary as a
  where age is not missing and year is not missing
   group by age,year
;
quit;


proc report data=report nowd;
columns id1 row1 id2 row2 year,(count per population rate);
define id1/group noprint;
define row1/group noprint;
define id2/group noprint;
define row2/group '';
define year/across '';
define count/analysis sum;
define per/analysis sum;
define population/analysis sum;
define rate/analysis sum;
compute before row1/style={just=l};
line row1 $80.;
endcomp;
run;

View solution in original post

PaigeMiller · Posted 11-26-2024 03:39 PM

Show us a portion of your data. (If the data is confidential, change it so it is fake data).

Please provide the data as working SAS data step code (instructions and examples). Do not provide the data as screen captures or as Excel.

--
Paige Miller

bec1 · Posted 11-27-2024 11:10 AM

data death_summary;
input year:$4. state_name:$21. SEX:$6. race_eth:$50. age:$2. num_deaths:32. pop:32. 
cause1:32. cause2:32. cause3:32. cause4:32. cause5:32. cause6:32. cause7:32. cause8:32.;
datalines;

2019 PA FEMALE A 01 2 10 2 0 0 0 0 0 0 0
2019 PA FEMALE A 02 4 9 4 0 0 0 0 0 0 0
2019 PA FEMALE A 03 0 11 0 0 0 0 0 0 0 0
2019 PA FEMALE A 04 2 6 2 0 0 0 0 0 0 0
2019 PA FEMALE A 05 0 7 0 0 0 0 0 0 0 0
2019 PA FEMALE B 01 5 5 4 0 0 0 0 0 0 0
2019 PA FEMALE B 02 11 8 0 0 11 0 0 0 0 0
2019 PA FEMALE B 03 6 12 10 0 0 0 0 6 0 0
2019 PA FEMALE B 04 2 33 1 0 0 0 0 1 0 0
2019 PA FEMALE B 05 0 5 0 0 0 0 0 0 0 0
2019 PA FEMALE W 01 2 45 2 0 0 0 0 0 0 0
2019 PA FEMALE W 02 4 6 0 0 0 0 3 0 0 0
2019 PA FEMALE W 03 6 9 0 0 0 0 6 0 0 0
2019 PA FEMALE W 04 7 12 0 0 0 0 0 7 0 0
2019 PA FEMALE W 05 1 22 1 0 0 0 0 0 0 0
2019 PA MALE A 01 6 4 0 0 0 0 6 0 0 0
2019 PA MALE A 02 10 14 6 0 0 5 0 0 0 0
2019 PA MALE A 03 7 11 0 0 0 0 7 0 0 0
2019 PA MALE A 04 4 5 0 0 0 3 0 0 0 1
2019 PA MALE A 05 0 6 0 0 0 0 0 0 0 0
2019 PA MALE B 01 10 5 4 0 0 0 0 0 0 0
2019 PA MALE B 02 16 9 0 0 1 10 0 0 0 0
2019 PA MALE B 03 12 12 7 0 0 0 0 6 0 8
2019 PA MALE B 04 1 16 0 0 0 0 0 1 0 0
2019 PA MALE B 05 3 8 0 0 0 3 0 0 0 0
2019 PA MALE W 01 0 7 0 0 0 0 0 0 0 0
2019 PA MALE W 02 5 12 0 0 0 0 1 0 0 0
2019 PA MALE W 03 0 5 0 0 0 0 0 0 0 0
2019 PA MALE W 04 0 44 0 10 0 0 0 0 0 0
2019 PA MALE W 05 9 23 9 0 0 0 0 0 0 0
;run;


proc tabulate data=death_summary;
     class year sex race_eth age;
     var num_deaths pop;
     table sex race_eth='Race/Ethnicity' age,
             year=''*(num_deaths='Count'*sum=''*f=8. 
                   num_deaths='%'*colpctsum=''*f=8.1
                   pop='Population'*sum=''*f=8.) 
             / misstext='0' nocellmerge;
run;

There are many more levels of states/age categories/race_eth, but this sample hopefully gives a reasonable impression. The proc tabulate is exactly what I'm going for, but I'd love there to be a column [num_deaths/pop*100000]. Thanks so much.

Quentin · Posted 11-26-2024 03:39 PM

You could calculate the rate in a DATA step or SQL step, then in tabulate use something like:

rate='rate'*mean=''*f=8.1

So if rate has the same value for every record within a year, then you can report the value with mean or min or max.

That said, if you can provide small example dataset (as a CARDS data step) and the desired output, it will be easier for people to help.

Patrick · Posted 11-26-2024 06:24 PM

You could try a compute block in Proc Report.

If you want actual code then please provide SAS code that creates sample data and show us the desired output using this sample data.

Below what Copilot suggested when asked to convert your Proc Tabulate to Proc Report (doesn't mean that's correct code but it's a start)

proc report data=death_summary nowd;
   column sex race_eth age year num_deaths pct_deaths pop rate;
   define sex / group;
   define race_eth / group 'Race/Ethnicity';
   define age / group;
   define year / across '';
   define num_deaths / sum 'Count' format=8.;
   define pct_deaths / computed '%' format=8.1;
   define pop / sum 'Population' format=8.;
   define rate / computed format=8.1;

   compute pct_deaths;
      pct_deaths = num_deaths.sum / pop.sum * 100;
   endcomp;

   compute rate;
      rate = num_deaths.sum / pop.sum * 100000;
   endcomp;

   rbreak after / summarize;
run;

bec1 · Posted 11-27-2024 11:13 AM

Thanks - I did some version of this, but it's counting rows and the number of rows is irrelevant to my dataset. I posted a sample dataset that hopefully gives a better idea. I'll keep tweaking proc report, thanks again.

Ksharp · Posted 11-26-2024 08:29 PM

I think PROC REPORT/PROC SQL can do that.
Post your sample data and desired output .

bec1 · Posted 11-27-2024 11:32 AM

I'm considering using proc sql, but maybe just because I'm more accustomed with sql syntax. I did post some sample data in another reply, thanks for your note.

Ksharp · Posted 11-27-2024 08:40 PM

data death_summary;
input year:$4. state_name:$21. SEX:$6. race_eth:$50. age:$2. num_deaths:32. pop:32. 
cause1:32. cause2:32. cause3:32. cause4:32. cause5:32. cause6:32. cause7:32. cause8:32.;
datalines;

2019 PA FEMALE A 01 2 10 2 0 0 0 0 0 0 0
2019 PA FEMALE A 02 4 9 4 0 0 0 0 0 0 0
2019 PA FEMALE A 03 0 11 0 0 0 0 0 0 0 0
2019 PA FEMALE A 04 2 6 2 0 0 0 0 0 0 0
2019 PA FEMALE A 05 0 7 0 0 0 0 0 0 0 0
2019 PA FEMALE B 01 5 5 4 0 0 0 0 0 0 0
2019 PA FEMALE B 02 11 8 0 0 11 0 0 0 0 0
2019 PA FEMALE B 03 6 12 10 0 0 0 0 6 0 0
2019 PA FEMALE B 04 2 33 1 0 0 0 0 1 0 0
2019 PA FEMALE B 05 0 5 0 0 0 0 0 0 0 0
2019 PA FEMALE W 01 2 45 2 0 0 0 0 0 0 0
2019 PA FEMALE W 02 4 6 0 0 0 0 3 0 0 0
2019 PA FEMALE W 03 6 9 0 0 0 0 6 0 0 0
2019 PA FEMALE W 04 7 12 0 0 0 0 0 7 0 0
2019 PA FEMALE W 05 1 22 1 0 0 0 0 0 0 0
2019 PA MALE A 01 6 4 0 0 0 0 6 0 0 0
2019 PA MALE A 02 10 14 6 0 0 5 0 0 0 0
2019 PA MALE A 03 7 11 0 0 0 0 7 0 0 0
2019 PA MALE A 04 4 5 0 0 0 3 0 0 0 1
2019 PA MALE A 05 0 6 0 0 0 0 0 0 0 0
2019 PA MALE B 01 10 5 4 0 0 0 0 0 0 0
2019 PA MALE B 02 16 9 0 0 1 10 0 0 0 0
2019 PA MALE B 03 12 12 7 0 0 0 0 6 0 8
2019 PA MALE B 04 1 16 0 0 0 0 0 1 0 0
2019 PA MALE B 05 3 8 0 0 0 3 0 0 0 0
2019 PA MALE W 01 0 7 0 0 0 0 0 0 0 0
2019 PA MALE W 02 5 12 0 0 0 0 1 0 0 0
2019 PA MALE W 03 0 5 0 0 0 0 0 0 0 0
2019 PA MALE W 04 0 44 0 10 0 0 0 0 0 0
2019 PA MALE W 05 9 23 9 0 0 0 0 0 0 0
;run;


proc sql;
create table report as
select 1 as id1,'SEX' as row1 length=80,1 as id2,sex as row2 length=80,year,
       sum(num_deaths) as Count,
	   100*calculated Count/(select sum(num_deaths) from death_summary where sex is not missing and year=a.year) as per label='%' format=8.1,
       sum(pop) as Population,
	   100*calculated Count/calculated Population as rate label='rate(%)' format=8.1
 from death_summary as a
  where sex is not missing and year is not missing
   group by sex,year
union all

select 2,'Race/Ethnicity',2,race_eth,year,
       sum(num_deaths) as Count,
	   100*calculated Count/(select sum(num_deaths) from death_summary where race_eth is not missing and year=a.year) as per label='%' format=8.1,
       sum(pop) as Population,
	   100*calculated Count/calculated Population as rate label='rate(%)' format=8.1
 from death_summary as a
  where race_eth is not missing and year is not missing
   group by race_eth,year
union all

select 3,'Age',3,age,year,
       sum(num_deaths) as Count,
	   100*calculated Count/(select sum(num_deaths) from death_summary where age is not missing and year=a.year) as per label='%' format=8.1,
       sum(pop) as Population,
	   100*calculated Count/calculated Population as rate label='rate(%)' format=8.1
 from death_summary as a
  where age is not missing and year is not missing
   group by age,year
;
quit;


proc report data=report nowd;
columns id1 row1 id2 row2 year,(count per population rate);
define id1/group noprint;
define row1/group noprint;
define id2/group noprint;
define row2/group '';
define year/across '';
define count/analysis sum;
define per/analysis sum;
define population/analysis sum;
define rate/analysis sum;
compute before row1/style={just=l};
line row1 $80.;
endcomp;
run;

bec1 · Posted 12-03-2024 11:08 AM

Oh my goodness, thanks so much. I absolutely never would've figured it out on my own, I was all over the place. Much appreciated.

Is there an alternative to proc tabulate for inclusion of rate calculations in the table?

Re: Is there an alternative to proc tabulate for inclusion of rate calculations in the table?

Re: Is there an alternative to proc tabulate for inclusion of rate calculations in the table?

Re: Is there an alternative to proc tabulate for inclusion of rate calculations in the table?

Re: Is there an alternative to proc tabulate for inclusion of rate calculations in the table?

Re: Is there an alternative to proc tabulate for inclusion of rate calculations in the table?

Re: Is there an alternative to proc tabulate for inclusion of rate calculations in the table?

Re: Is there an alternative to proc tabulate for inclusion of rate calculations in the table?

Re: Is there an alternative to proc tabulate for inclusion of rate calculations in the table?

Re: Is there an alternative to proc tabulate for inclusion of rate calculations in the table?

Re: Is there an alternative to proc tabulate for inclusion of rate calculations in the table?

Registration is open

SAS Training: Just a Click Away