DATA Step, Macro, Functions and more

Calculating ratio for differenet values in a variable by group

Accepted Solution Solved
Reply
Contributor
Posts: 22
Accepted Solution

Calculating ratio for differenet values in a variable by group

[ Edited ]

Hello.

I have a problem in calculation ratio for diffierent values in a variable by group.

For example, I have a dataset like this.

 

iddatenamevar1
1161231ax
1161231ay
1161231by
1161231cx
1161231cy
1161231dz
2161231ex
2161231ey
2161231fx
2161231fz
2161231gz

 

I want to calculate (number of a value in var1)/(number of distinct names) by id-date groups.

For example, for id 1 - date 161231 group, I want x_ratio=2/4 y_ratio=3/4 z_ratio=1/4.

Also, for id 2 - date 161231 group, I want x_ratio=2/3 y_ratio=1/3 z_ratio=2/3.

 

If you have any suggestion, please share it with me.

Thank you.


Accepted Solutions
Solution
‎03-18-2018 11:02 PM
Contributor
Posts: 62

Re: Calculating ratio for differenet values in a variable by group

[ Edited ]

One option, plus the code for transpose in case you want the ratios in separate columns

 

proc sql;
create table want as
select a.id,a.date,var1,nvalInVar1 ,nDistinctNames,nvalInVar1/nDistinctNames as Ratio
 from (select id,date,var1,count (*) as nvalInVar1 from have group by id,date,var1) a
join (select id,date,count(distinct(name)) as nDistinctNames from have group by id,date) b
on a.id=b.id and a.date=b.date;

proc transpose data=want out=want2 (drop=_name_) prefix=ratio_; 
by id date; 
id var1 ;
var ratio;
 run; 

 

View solution in original post


All Replies
Respected Advisor
Posts: 2,833

Re: Calculating ratio for differenet values in a variable by group

I can't understand how when id=1, z_ratio=1/4 from your data. You need to explain in more detail how these ratios are computed.

--
Paige Miller
Super User
Posts: 6,638

Re: Calculating ratio for differenet values in a variable by group

Posted in reply to PaigeMiller

@PaigeMiller,

 

I don't have time to solve this right now, but I can explain where the numbers come from.

 

The denominator is 4 because there are 4 distinct NAME values for the combination of ID and DATE.

 

The numerator is 1 because there is only 1 observation for the combination of ID, DATE, and VAR1.

Solution
‎03-18-2018 11:02 PM
Contributor
Posts: 62

Re: Calculating ratio for differenet values in a variable by group

[ Edited ]

One option, plus the code for transpose in case you want the ratios in separate columns

 

proc sql;
create table want as
select a.id,a.date,var1,nvalInVar1 ,nDistinctNames,nvalInVar1/nDistinctNames as Ratio
 from (select id,date,var1,count (*) as nvalInVar1 from have group by id,date,var1) a
join (select id,date,count(distinct(name)) as nDistinctNames from have group by id,date) b
on a.id=b.id and a.date=b.date;

proc transpose data=want out=want2 (drop=_name_) prefix=ratio_; 
by id date; 
id var1 ;
var ratio;
 run; 

 

PROC Star
Posts: 1,593

Re: Calculating ratio for differenet values in a variable by group

data have;
input id date	name $	var1 $;
datalines;
1	161231	a	x
1	161231	a	y
1	161231	b	y
1	161231	c	x
1	161231	c	y
1	161231	d	z
2	161231	e	x
2	161231	e	y
2	161231	f	x
2	161231	f	z
2	161231	g	z
;
proc freq data=have  noprint ;
by id date;
tables var1/out=temp(drop=percent) ;
run; 


proc sql;
create table want as
select a.*,b.cnt, a.count/b.cnt as ratio
from 
temp a left join (select id, date,count(distinct name) as cnt from have group by id, date) b
on a.id=b.id and a.date=b.date;
quit;
Contributor
Posts: 22

Re: Calculating ratio for differenet values in a variable by group

Everyone, thanks for sharing your knowledge.

I can't thank you enough. It helped a lot. Smiley Happy

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 5 replies
  • 256 views
  • 2 likes
  • 5 in conversation