Solved: Re: Calculating ratio for differenet values in a variable by group

Sejin · Posted 03-18-2018 09:52 AM

Hello.

I have a problem in calculation ratio for diffierent values in a variable by group.

For example, I have a dataset like this.

id	date	name	var1
1	161231	a	x
1	161231	a	y
1	161231	b	y
1	161231	c	x
1	161231	c	y
1	161231	d	z
2	161231	e	x
2	161231	e	y
2	161231	f	x
2	161231	f	z
2	161231	g	z

I want to calculate (number of a value in var1)/(number of distinct names) by id-date groups.

For example, for id 1 - date 161231 group, I want x_ratio=2/4 y_ratio=3/4 z_ratio=1/4.

Also, for id 2 - date 161231 group, I want x_ratio=2/3 y_ratio=1/3 z_ratio=2/3.

If you have any suggestion, please share it with me.

Thank you.

JohnHoughton · Posted 03-18-2018 12:51 PM

One option, plus the code for transpose in case you want the ratios in separate columns

proc sql;
create table want as
select a.id,a.date,var1,nvalInVar1 ,nDistinctNames,nvalInVar1/nDistinctNames as Ratio
 from (select id,date,var1,count (*) as nvalInVar1 from have group by id,date,var1) a
join (select id,date,count(distinct(name)) as nDistinctNames from have group by id,date) b
on a.id=b.id and a.date=b.date;

proc transpose data=want out=want2 (drop=_name_) prefix=ratio_; 
by id date; 
id var1 ;
var ratio;
 run;

View solution in original post

PaigeMiller · Posted 03-18-2018 12:13 PM

I can't understand how when id=1, z_ratio=1/4 from your data. You need to explain in more detail how these ratios are computed.

--
Paige Miller

Astounding · Posted 03-18-2018 12:41 PM

@PaigeMiller,

I don't have time to solve this right now, but I can explain where the numbers come from.

The denominator is 4 because there are 4 distinct NAME values for the combination of ID and DATE.

The numerator is 1 because there is only 1 observation for the combination of ID, DATE, and VAR1.

JohnHoughton · Posted 03-18-2018 12:51 PM

One option, plus the code for transpose in case you want the ratios in separate columns

proc sql;
create table want as
select a.id,a.date,var1,nvalInVar1 ,nDistinctNames,nvalInVar1/nDistinctNames as Ratio
 from (select id,date,var1,count (*) as nvalInVar1 from have group by id,date,var1) a
join (select id,date,count(distinct(name)) as nDistinctNames from have group by id,date) b
on a.id=b.id and a.date=b.date;

proc transpose data=want out=want2 (drop=_name_) prefix=ratio_; 
by id date; 
id var1 ;
var ratio;
 run;

novinosrin · Posted 03-18-2018 03:36 PM

data have;
input id date	name $	var1 $;
datalines;
1	161231	a	x
1	161231	a	y
1	161231	b	y
1	161231	c	x
1	161231	c	y
1	161231	d	z
2	161231	e	x
2	161231	e	y
2	161231	f	x
2	161231	f	z
2	161231	g	z
;
proc freq data=have  noprint ;
by id date;
tables var1/out=temp(drop=percent) ;
run; 


proc sql;
create table want as
select a.*,b.cnt, a.count/b.cnt as ratio
from 
temp a left join (select id, date,count(distinct name) as cnt from have group by id, date) b
on a.id=b.id and a.date=b.date;
quit;

Sejin · Posted 03-18-2018 11:03 PM

Everyone, thanks for sharing your knowledge.

I can't thank you enough. It helped a lot. 🙂

Calculating ratio for differenet values in a variable by group

Re: Calculating ratio for differenet values in a variable by group

Re: Calculating ratio for differenet values in a variable by group

Re: Calculating ratio for differenet values in a variable by group

Re: Calculating ratio for differenet values in a variable by group

Re: Calculating ratio for differenet values in a variable by group

Re: Calculating ratio for differenet values in a variable by group

SAS Innovate 2025: Register Now

SAS Training: Just a Click Away