## Flag random data based on criteria

Solved
Frequent Contributor
Posts: 123

# Flag random data based on criteria

[ Edited ]

hello,

I need to transform data for the purpose of evaluating the actions that employees have taken on customers' account.

It would be greatly appreciated if someone could point me in the right direction.

I'm looking at previous month's data, which captures on a given day, any time an agent "actions" / modifies a customer's account. The action taken on customer's account are categorized by risk level, from 1 (most critical) to 3 less, and 0, no risk.

What I need to do:

Flag 5 observations per agent of which 3 have a risk_level=1, and 2 have a risk_level=2. If cannot find total of 5 obs, then select from risk_level=3, and if still can't find, select risk_level=0. (The goal is to flag 5 where 3 have a risk level of 1, and two  a lower risk level)

Ultimately, it would be great if I could export the data to excel, and have a pivot table for agent_name, and variable Subject_to_review, where each agent name would have a count 5.

Example data has been added below.

Thanks

Accepted Solutions
Solution
‎03-22-2017 02:33 PM
Super Contributor
Posts: 268

## Re: Flag random data based on criteria

[ Edited ]

How does this work for you? It is extensible: throw as much data at it as you like, and it should still work.

``````data have;
infile cards dsd dlm=',' firstobs=2 missover;
attrib agent_name length=\$ 20;
attrib action_type length=\$ 4;
attrib risk_level subject_to_review length=\$ 1;
input agent_name
action_type
risk_level;
select(risk_level);
when('1', '2', '3') recode_risk_level = risk_level;
when('0') recode_risk_level = '4';
otherwise recode_risk_level = '5';
end;
random = ranuni(225465114);  /* My 30-year-old UK phone number as seed */
id = monotonic();    /* Create a unique incremented ID per observation */
cards;
Agent_name,action_type,risk_level,subject_to_review
Albert,1000,1,Y
Albert,1000,1,Y
Albert,1000,1,
Albert,1000,1,
Albert,2100,2,Y
Albert,2100,2,Y
Albert,2100,2,Y
Albert,2100,2,
Albert,3230,0,
Albert,2500,3,
Albert,2500,3,
Albert,2500,3,
Albert,2500,3,
Albert,2500,3,
Albert,5200,,
Barb,1000,1,Y
Barb,1000,1,Y
Barb,2100,2,Y
Barb,2100,2,Y
Barb,2100,2,Y
Barb,3230,0,
Cathy,1000,1,Y
Cathy,2100,2,Y
Cathy,2100,2,Y
Cathy,3230,0,Y
Cathy,5200,,Y
;
run;

proc sort data=have noequals;
by agent_name recode_risk_level random;
run;

data subject_to_review;
set have;
by agent_name recode_risk_level;
retain count output_count;
if first.agent_name then
output_count = 1;
if first.recode_risk_level
then count = 1;
else count + 1;
if (   output_count > 5
or recode_risk_level = '1' and count > 3
or recode_risk_level = '2' and count > 2) then
delete;
output_count + 1;
keep id;
run;

proc sql undo_policy=none magic=102;
create table want as
select have.agent_name,
have.action_type,
have.risk_level,
ifc(str.id is not null, 'Y', ' ') as subject_to_review length=1
from have
left join subject_to_review as str
on have.id = str.id
order by have.id;
quit;``````

All Replies
Super Contributor
Posts: 268

## Re: Flag random data based on criteria

It's possible, but very difficult, to provide a suggestion without having some test data to work on. Can you give us some?

Frequent Contributor
Posts: 123

## Re: Flag random data based on criteria

[ Edited ]

Thanks LaurieF for ur reply.  At it's simplest, I need to randomly select 5 observations per agent_name, that should follow this pattern:

- if can find 3 observations of risk_level=1, flag subject_to_review=Y

-if can 2 observations of risk_level=2, flag subject_to_review=Y

This would arrive at 5 observations per agent_name. it becomes more complicated if cannot find 3 observations, as then need to pick up a lower risk level to arrive at your five per agent (no hard rule, but should first fetch higher risk level.

From higher to lower is : 1,2,3,0,'')

See below example, but without it being randomly assigned Y.  If no one has a lead to suggest, I may go with attempting to achieve this with excel vba. Thank you

 DATA_WANT Agent_name action_type risk_level subject_to_review Albert 1000 1 Y Albert 1000 1 Y Albert 1000 1 Albert 1000 1 Albert 2100 2 Y Albert 2100 2 Y Albert 2100 2 Y Albert 2100 2 Albert 3230 0 Albert 2500 3 Albert 2500 3 Albert 2500 3 Albert 2500 3 Albert 2500 3 Albert 5200 Barb 1000 1 Y Barb 1000 1 Y Barb 2100 2 Y Barb 2100 2 Y Barb 2100 2 Y Barb 3230 0 Cathy 1000 1 Y Cathy 2100 2 Y Cathy 2100 2 Y Cathy 3230 0 Y Cathy 5200 Y
Solution
‎03-22-2017 02:33 PM
Super Contributor
Posts: 268

## Re: Flag random data based on criteria

[ Edited ]

How does this work for you? It is extensible: throw as much data at it as you like, and it should still work.

``````data have;
infile cards dsd dlm=',' firstobs=2 missover;
attrib agent_name length=\$ 20;
attrib action_type length=\$ 4;
attrib risk_level subject_to_review length=\$ 1;
input agent_name
action_type
risk_level;
select(risk_level);
when('1', '2', '3') recode_risk_level = risk_level;
when('0') recode_risk_level = '4';
otherwise recode_risk_level = '5';
end;
random = ranuni(225465114);  /* My 30-year-old UK phone number as seed */
id = monotonic();    /* Create a unique incremented ID per observation */
cards;
Agent_name,action_type,risk_level,subject_to_review
Albert,1000,1,Y
Albert,1000,1,Y
Albert,1000,1,
Albert,1000,1,
Albert,2100,2,Y
Albert,2100,2,Y
Albert,2100,2,Y
Albert,2100,2,
Albert,3230,0,
Albert,2500,3,
Albert,2500,3,
Albert,2500,3,
Albert,2500,3,
Albert,2500,3,
Albert,5200,,
Barb,1000,1,Y
Barb,1000,1,Y
Barb,2100,2,Y
Barb,2100,2,Y
Barb,2100,2,Y
Barb,3230,0,
Cathy,1000,1,Y
Cathy,2100,2,Y
Cathy,2100,2,Y
Cathy,3230,0,Y
Cathy,5200,,Y
;
run;

proc sort data=have noequals;
by agent_name recode_risk_level random;
run;

data subject_to_review;
set have;
by agent_name recode_risk_level;
retain count output_count;
if first.agent_name then
output_count = 1;
if first.recode_risk_level
then count = 1;
else count + 1;
if (   output_count > 5
or recode_risk_level = '1' and count > 3
or recode_risk_level = '2' and count > 2) then
delete;
output_count + 1;
keep id;
run;

proc sql undo_policy=none magic=102;
create table want as
select have.agent_name,
have.action_type,
have.risk_level,
ifc(str.id is not null, 'Y', ' ') as subject_to_review length=1
from have
left join subject_to_review as str
on have.id = str.id
order by have.id;
quit;``````

Frequent Contributor
Posts: 123

## Re: Flag random data based on criteria

thank you very much Laurie for your suggestion and for some of the explanations provided.
Super Contributor
Posts: 268

## Re: Flag random data based on criteria

If there's anything else you need to know (there's some interesting stuff in the sql join for example), sing out - I'm happy to explain.

Frequent Contributor
Posts: 123

## Re: Flag random data based on criteria

hi Laurie,

Your suggestion is very useful, thanks again. My data will have numerous instances where a given agent will not have 3 observations with risk level 1 or even 2 obs with risk level 2. The challenge i believe is arriving at a program/code to assess that if higher risk (R) does not sum to 3 for a given agent, will need higher observations from lower risk level.

In the want table, row 19, under agent Barb, would be looking for it to populate to 'Y', as R has only 2 obs, so need 1 more, and prefer higher risk level.

 agent_name action_type risk_level subject_to_review Albert 1000 1 Albert 1000 1 Y Albert 1000 1 Y Albert 1000 1 Y Albert 2100 2 Albert 2100 2 Y Albert 2100 2 Y Albert 2100 2 Albert 3230 0 Albert 2500 3 Albert 2500 3 Albert 2500 3 Albert 2500 3 Albert 2500 3 Albert 5200 Barb 1000 1 Y Barb 1000 1 Y Barb 2100 2 Barb 2100 2 Y Barb 2100 2 Y Barb 3230 0 Y

Is there a way for the program you offered, to say, based on risk (R) level, for it to consider, for each given agent:

if R1*3>=3, do nothing; if R1*3=2 then fetch an additional lower level R2 ... Ultimately to arrive at the following 5 obs, per risk level:

R1+R1+R1+R2+R2 (Ideal scenario for a given agent)
R1+R1+R2+R2+R2
R1+R2+R2+R2+R2

If full shortage of R1, compensate by getting from lower level, etc.

R2+R2+R2+R2+R2
R2+R2+R2+R2+R3
R2+R2+R2+R3+R3
...
R5+R5+R5+R5+R5