Obsidian | Level 7

## Calculating ratio for differenet values in a variable by group

Hello.

I have a problem in calculation ratio for diffierent values in a variable by group.

For example, I have a dataset like this.

 id date name var1 1 161231 a x 1 161231 a y 1 161231 b y 1 161231 c x 1 161231 c y 1 161231 d z 2 161231 e x 2 161231 e y 2 161231 f x 2 161231 f z 2 161231 g z

I want to calculate (number of a value in var1)/(number of distinct names) by id-date groups.

For example, for id 1 - date 161231 group, I want x_ratio=2/4 y_ratio=3/4 z_ratio=1/4.

Also, for id 2 - date 161231 group, I want x_ratio=2/3 y_ratio=1/3 z_ratio=2/3.

If you have any suggestion, please share it with me.

Thank you.

1 ACCEPTED SOLUTION

Accepted Solutions
Quartz | Level 8

## Re: Calculating ratio for differenet values in a variable by group

One option, plus the code for transpose in case you want the ratios in separate columns

``````proc sql;
create table want as
select a.id,a.date,var1,nvalInVar1 ,nDistinctNames,nvalInVar1/nDistinctNames as Ratio
from (select id,date,var1,count (*) as nvalInVar1 from have group by id,date,var1) a
join (select id,date,count(distinct(name)) as nDistinctNames from have group by id,date) b
on a.id=b.id and a.date=b.date;

proc transpose data=want out=want2 (drop=_name_) prefix=ratio_;
by id date;
id var1 ;
var ratio;
run;``` ```

5 REPLIES 5
Diamond | Level 26

## Re: Calculating ratio for differenet values in a variable by group

I can't understand how when id=1, z_ratio=1/4 from your data. You need to explain in more detail how these ratios are computed.

--
Paige Miller
PROC Star

## Re: Calculating ratio for differenet values in a variable by group

I don't have time to solve this right now, but I can explain where the numbers come from.

The denominator is 4 because there are 4 distinct NAME values for the combination of ID and DATE.

The numerator is 1 because there is only 1 observation for the combination of ID, DATE, and VAR1.

Quartz | Level 8

## Re: Calculating ratio for differenet values in a variable by group

One option, plus the code for transpose in case you want the ratios in separate columns

``````proc sql;
create table want as
select a.id,a.date,var1,nvalInVar1 ,nDistinctNames,nvalInVar1/nDistinctNames as Ratio
from (select id,date,var1,count (*) as nvalInVar1 from have group by id,date,var1) a
join (select id,date,count(distinct(name)) as nDistinctNames from have group by id,date) b
on a.id=b.id and a.date=b.date;

proc transpose data=want out=want2 (drop=_name_) prefix=ratio_;
by id date;
id var1 ;
var ratio;
run;``` ```

Tourmaline | Level 20

## Re: Calculating ratio for differenet values in a variable by group

``````data have;
input id date	name \$	var1 \$;
datalines;
1	161231	a	x
1	161231	a	y
1	161231	b	y
1	161231	c	x
1	161231	c	y
1	161231	d	z
2	161231	e	x
2	161231	e	y
2	161231	f	x
2	161231	f	z
2	161231	g	z
;
proc freq data=have  noprint ;
by id date;
tables var1/out=temp(drop=percent) ;
run;

proc sql;
create table want as
select a.*,b.cnt, a.count/b.cnt as ratio
from
temp a left join (select id, date,count(distinct name) as cnt from have group by id, date) b
on a.id=b.id and a.date=b.date;
quit;``````
Obsidian | Level 7

## Re: Calculating ratio for differenet values in a variable by group

Everyone, thanks for sharing your knowledge.

I can't thank you enough. It helped a lot. 🙂

Discussion stats
• 5 replies
• 3520 views
• 3 likes
• 5 in conversation