DATA Step, Macro, Functions and more

Real time Problem how to aggregate on two variables based on the third variable value and get output

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 6
Accepted Solution

Real time Problem how to aggregate on two variables based on the third variable value and get output

Hello Everyone, I facing a real Time problem as below. I have modified the data accordingly. Below is the data i have. Its transactional data.

Here Trn_code 'C' represent credit and 'D'  represent Debit.

Trn_key  trn_CODE   Sender   Receiver     Amount      sender_in_receiver_list    Receiver_in_Sender_list

1234        C                   John       Louis        10000                   Y                             N
2233        D                    David      John        20000                   N                            Y
1111        D                     Louis      David      10000                   N                            Y
2345        C                   Thomas     John       30000                   Y                           N
3456        D                    Daniel       Louis       25000                   Y                           Y
4567        C                    John       Thomas     50000                   Y                           Y

 

 

Now I need to get only observations where

    1) (sender_in_receiver_list = 'Y' or     Receiver_in_Sender_list = 'Y' )    AND
    2) The aggregate amount received by receiver is greater than or equal to Minimum Aggregate Daily Amount (Threshold A= 5000)  AND
    3) The aggregate amount sent from sender must be greater than or equal to a Minimum Percentage [Threshold B = 10%] of the amount in Condition 2.    AND
    4) The aggregate amount sent from sender must be less than or equal to a Maximum Percentage [Threshold C = 40% ] of the amount in Condition 2.

 /**/

if we take example of John
 He sent money to Louis and Thomas ( 10000+50000 = 60000) and recived from David and thomas (20000+30000 =  50000)
  condition 1 satisfy as Jhon occurs in both send and receiver list.
  condition 2 satisfy as jhon has received a total of 50000 which is >= 5000.
  condition 3 satisfy as jhon has sent 60000 which is >= (10% of 50000).
  condition 4 is satisfied as jhon has sent 60000 which is <= (120% of 50000)

I should get Johns in my output.

 

I will appreciate help on this. Thanks in advance !! :-)


Accepted Solutions
Solution
‎02-06-2018 03:58 AM
Super User
Posts: 10,849

Re: Real time Problem how to aggregate on two variables based on the third variable value and get ou

data have;
input Trn_key  trn_CODE $  Sender  $ Receiver   $  Amount      sender_in_receiver_list $   Receiver_in_Sender_list $;
cards;
1234        C                   John       Louis        10000                   Y                             N
2233        D                    David      John        20000                   N                            Y
1111        D                     Louis      David      10000                   N                            Y
2345        C                   Thomas     John       30000                   Y                           N
3456        D                    Daniel       Louis       25000                   Y                           Y
4567        C                    John       Thomas     50000                   Y                           Y
;
run;
data k;
 set have;
 k=Sender;if sender_in_receiver_list='Y' then output;
 k=Receiver;if Receiver_in_Sender_list='Y' then output;
 keep k;
run;
proc sort data=k nodupkey;by k;run;
data sender;
 set have;
 k=Sender;
 keep k amount;
run;
data receiver;
 set have;
 k=Receiver;v=Receiver_in_Sender_list;
 keep k amount;
run;
data want;
 if _n_=1 then do;
  if 0 then set sender;
  declare hash s(dataset:'sender',multidata:'y');
  s.definekey('k');
  s.definedata('amount');
  s.definedone();
  
  if 0 then set receiver;
  declare hash r(dataset:'receiver',multidata:'y');
  r.definekey('k');
  r.definedata('amount');
  r.definedone();
 end;
set k;
rc=r.find();
amount_receive=0;
do while(rc=0);
 amount_receive+amount;
 rc=r.find_next();
end;
condition2=ifn(amount_receive>=5000,1,0);

rc=s.find();
amount_send=0;
do while(rc=0);
 amount_send+amount;
 rc=s.find_next();
end;
condition3=ifn(amount_send>=amount_receive*0.1,1,0);
condition4=ifn(amount_send<=amount_receive*1.2,1,0);

if condition2 and condition3 and condition4 then output;
keep k;
run;
 

View solution in original post


All Replies
Regular Contributor
Posts: 226

Re: Real time Problem how to aggregate on two variables based on the third variable value and get ou

Interesting problem. Hash objects could be part of a solution or a really horrible proc sql. Can't work on it right now,sorry.
Solution
‎02-06-2018 03:58 AM
Super User
Posts: 10,849

Re: Real time Problem how to aggregate on two variables based on the third variable value and get ou

data have;
input Trn_key  trn_CODE $  Sender  $ Receiver   $  Amount      sender_in_receiver_list $   Receiver_in_Sender_list $;
cards;
1234        C                   John       Louis        10000                   Y                             N
2233        D                    David      John        20000                   N                            Y
1111        D                     Louis      David      10000                   N                            Y
2345        C                   Thomas     John       30000                   Y                           N
3456        D                    Daniel       Louis       25000                   Y                           Y
4567        C                    John       Thomas     50000                   Y                           Y
;
run;
data k;
 set have;
 k=Sender;if sender_in_receiver_list='Y' then output;
 k=Receiver;if Receiver_in_Sender_list='Y' then output;
 keep k;
run;
proc sort data=k nodupkey;by k;run;
data sender;
 set have;
 k=Sender;
 keep k amount;
run;
data receiver;
 set have;
 k=Receiver;v=Receiver_in_Sender_list;
 keep k amount;
run;
data want;
 if _n_=1 then do;
  if 0 then set sender;
  declare hash s(dataset:'sender',multidata:'y');
  s.definekey('k');
  s.definedata('amount');
  s.definedone();
  
  if 0 then set receiver;
  declare hash r(dataset:'receiver',multidata:'y');
  r.definekey('k');
  r.definedata('amount');
  r.definedone();
 end;
set k;
rc=r.find();
amount_receive=0;
do while(rc=0);
 amount_receive+amount;
 rc=r.find_next();
end;
condition2=ifn(amount_receive>=5000,1,0);

rc=s.find();
amount_send=0;
do while(rc=0);
 amount_send+amount;
 rc=s.find_next();
end;
condition3=ifn(amount_send>=amount_receive*0.1,1,0);
condition4=ifn(amount_send<=amount_receive*1.2,1,0);

if condition2 and condition3 and condition4 then output;
keep k;
run;
 
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 2 replies
  • 135 views
  • 0 likes
  • 3 in conversation