Re: percentile allocation of observation

Ashraya · Posted 06-18-2015 10:56 AM

I have two datasets;

1. Policy dataset where it has Policy_No and its Branch_code details;

2. Employee dataset where it has Branch_code, Employee_ID and Percentage field;

data Policy;

input Policy_No Branch $;

datalines;

101 A001

102 A001

103 A001

104 A001

105 A001

106 A001

107 A001

108 A001

109 A001

110 A001

111 A001

112 A001

;run;

data Employee;

input Branch $ Emp_ID Percent;

datalines;

A001 1111 30

A001 2222 50

;run;

I would like to create a output where 30% of policy details (any 4 policy_no out of total 12 policies) should be allocated to emp_ID 1111 and 50%(any 6 policy_no out of total 12 polices) of policy allocated to emp_ID 2222

desired output will be like this:

Branch Emp_ID Policy_no

A001 1111 101

A001 1111 102

A001 1111 103

A001 1111 104

A001 2222 105

A001 2222 106

A001 2222 107

A001 2222 108

A001 2222 109

A001 2222 110

Regards

Ash

PGStats · Posted 06-18-2015 12:40 PM

One way to do this is :

data Policy;

input Policy_No Branch $;

datalines;

101 A001

102 A001

103 A001

104 A001

105 A001

106 A001

107 A001

108 A001

109 A001

110 A001

111 A001

112 A001

;

data Employee;

input Branch $ Emp_ID Percent;

datalines;

A001 1111 30

A001 2222 50

;

proc sort data=policy; by branch policy_no; run;

proc sort data=employee; by branch emp_id; run;

data polOrd;

set policy; by branch;

if first.branch then ord = 0;

ord + 1;

run;

data empCum;

set employee; by branch;

if first.branch then fromPct = 0;

output;

fromPct + percent;

run;

proc sql;

create table want as

select p.branch, emp_id, policy_no

from empCum as e inner join

polOrd as p on e.branch=p.branch

group by p.branch, emp_id

having

ord > round(fromPct/100*count(ord)) and

ord <= round((fromPct+percent)/100*count(ord))

order by branch, emp_id, policy_no;

quit;

PG

Ashraya · Posted 06-18-2015 02:46 PM

Hi PGStats,

Thanks for replay, it worked,

I wonder is any other method or by using macro can we achieve the result.

Ash

PGStats · Posted 06-18-2015 04:09 PM

I imagine an alternate solution could be based on proc surveyselect. I can't however think of any use for macro processing to solve this problem.

PG

Ashraya · Posted 06-18-2015 09:58 PM

The above solution by PGStats will allocate sequence of policy no. What if it require to allocate the random policy_no instead of a series allocation.

Ash

PGStats · Posted 06-19-2015 11:57 AM

Simply randomize the policy_no order before the selection:

data Policy;

input Policy_No Branch $;

datalines;

101 A001

102 A001

103 A001

104 A001

105 A001

106 A001

107 A001

108 A001

109 A001

110 A001

111 A001

112 A001

;

data Employee;

input Branch $ Emp_ID Percent;

datalines;

A001 1111 30

A001 2222 50

;

data polOrd;

set policy;

ord = rand("UNIFORM");

run;

proc sort data=polOrd; by branch ord; run;

proc sort data=employee; by branch emp_id; run;

proc rank data=polOrd out=polOrd;

by branch;

var ord;

run;

data empCum;

set employee; by branch;

if first.branch then fromPct = 0;

output;

fromPct + percent;

run;

proc sql;

create table want as

select p.branch, emp_id, policy_no

from empCum as e inner join

polOrd as p on e.branch=p.branch

group by p.branch, emp_id

having

ord > round(fromPct/100*count(ord)) and

ord <= round((fromPct+percent)/100*count(ord))

order by branch, emp_id, policy_no;

quit;

PG

percentile allocation of observation