Hi all,
I have a question about dividing counts of values generated from proc freq. I have a dataset that looks like this:
data my_data;
input id type$ option1 option2 option3;
cards;
1 A 1 0 2
2 A 1 1 1
3 B 0 3 2
4 B 2 0 0
5 B 0 1 2
;
run;
I need to do this same calculation for EACH option: (count of number of 1's) / (count of number of >=1's). The denominator does not include the 0 values.
So for option1, I would want some kind of output that tells me 2/3. Option2 = 2/3. Option3=1/4.
In addition to generating these percentages for the overall cohort of individuals, I also want to be able to generate these values BY the "type" variable (for type=A only and type=B only).
I've tried playing around with proc freq with an out= statement, as well as proc report, but I can't think of an easy way to generate these numbers.
Thank you in advance for your help!
data my_data;
input id type$ option1 option2 option3;
cards;
1 A 1 0 2
2 A 1 1 1
3 B 0 3 2
4 B 2 0 0
5 B 0 1 2
;
run;
proc transpose data=my_data out=temp;
by id ;
var option1-option3;
run;
proc sql;
create table want as
select _name_ as option label=' ', sum(col1=1)/sum(col1>=1) as ratio
from temp
group by option;
quit;
option | ratio |
---|---|
option1 | 0.666667 |
option2 | 0.666667 |
option3 | 0.25 |
data my_data;
input id type$ option1 option2 option3;
cards;
1 A 1 0 2
2 A 1 1 1
3 B 0 3 2
4 B 2 0 0
5 B 0 1 2
;
run;
proc transpose data=my_data out=temp;
by id ;
var option1-option3;
run;
proc sql;
create table want as
select _name_ as option label=' ', sum(col1=1)/sum(col1>=1) as ratio
from temp
group by option;
quit;
option | ratio |
---|---|
option1 | 0.666667 |
option2 | 0.666667 |
option3 | 0.25 |
Thank you so much, this worked perfectly!! Would it be possible to add two more columns for the ratio for individuals with type=A and the ratio for type=B?
Hi @novinosrin . Sure, I was thinking something like this
Option | Overall_ratio | TypeA_ratio | TypeB_ratio |
Option1 | 0.666667 | 1 | 0 |
Option2 | 0.666667 | 1 | 0.5 |
Option3 | 0.25 | 0.5 | 0 |
Alternatively, I could use the existing code you provided (to calculate the overall_ratio) and replicate it in two subsetted datasets (for type=A and type=B individuals).
Thank you again for your help!
data my_data;
input id type$ option1 option2 option3;
cards;
1 A 1 0 2
2 A 1 1 1
3 B 0 3 2
4 B 2 0 0
5 B 0 1 2
;
run;
proc transpose data=my_data out=temp;
by id type;
var option1-option3;
run;
proc sql;
create table temp1 as
select a.*, Overall_ratio
from
(select type,_name_ as option label=' ', sum(col1=1)/sum(col1>=1) as ratio
from temp
group by type,option) a
inner join
(select _name_ as option label=' ', sum(col1=1)/sum(col1>=1) as Overall_ratio
from temp
group by option)b
on a.option=b.option
order by option,type;
quit;
proc transpose data=temp1 out=want(drop=_name_) prefix=Type suffix=_ratio;
by option overall_ratio;
var ratio;
id type;
run;
@novinosrin thank you!! Couldn't quite get it to work, but the original code is extremely helpful and is much easier than the alternative method I was using before. Thank you for all of your time and assistance!
Hi again, The above SQL can be made terse by taking advantage of the automatic remerge that the SQL optimizer utlizes. Guru @hashman uses auto remerge a lot as opposed to explicit join perhaps he prefers to avoid the verbosity. My knowlege is a drop in his ocean though I'm good in plagiarizing his solutions. So his SQL autoremerge approach would likely be
proc sql;
create table temp1 as
select _name_ as option label=' ',type,Overall_ratio, sum(col1=1)/sum(col1>=1) as ratio
from
(select *, sum(col1=1)/sum(col1>=1) as Overall_ratio
from temp
group by option)
group by option,type,Overall_ratio
order by option,type;
quit;
Yet another dorfmanism that performs surgery in sas.
data my_data;
input id type$ option1 option2 option3;
cards;
1 A 1 0 2
2 A 1 1 1
3 B 0 3 2
4 B 2 0 0
5 B 0 1 2
;
run;
proc transpose data=my_data out=temp;
by id type;
var option1-option3;
run;
proc sql;
create table temp1 as
select _name_ as option label=' ',type,Overall_ratio, sum(col1=1)/sum(col1>=1) as ratio
from
(select *, sum(col1=1)/sum(col1>=1) as Overall_ratio
from temp
group by option)
group by option,type,Overall_ratio
order by option,type;
quit;
proc transpose data=temp1 out=want1(drop=_name_) prefix=Type suffix=_ratio;
by option overall_ratio;
var ratio;
id type;
run;
Thank you!! I'll give this a try!
If you recode your data such that 1 is 1, 0 are missing and anything >1 is 0 then this can be done with proc tabulate or report quite nicely.
Thanks @Reeza! Clever solution, wish I thought of it myself!
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.