DATA Step, Macro, Functions and more

How to create a custom aggregation

Reply
Occasional Contributor
Posts: 14

How to create a custom aggregation

[ Edited ]

I have a requirement like this and i need your help in solving this.

I have a dataset called test which is like below.

 

Here we have a column called P_Level and we have three values for this (P1, P2, and P3). SO my requirement is like this

 

  1. For P1 level- I should get the sum of all Observations which are having status as ‘Y’. So If I assume the new variable as SUM it should be the sum of (3+4,3+3,3+6) =22
  2. For P2 level- My SUM should be equal to the sum of P3 level- where my PL1 and PL2 variable values match and Status =Y.
  3. Please find the screenshot of input and output for your reference.

data test;
input P_Level$ PL1$ PL2$ PL3$ QTY1 QTY2 STATUS$;
cards;
P1 A . . 3 4 N
P2 A PL11 . 10 20 N
P3 A PL11 PL21 1 2 N
P3 A PL11 PL22 3 4 Y
P3 A PL11 PL23 1 3 N
P2 A PL12 . 30 40 N
P3 A PL12 PL21 1 3 N
P3 A PL12 PL22 3 3 Y
P3 A PL12 PL23 1 5 N
P3 A PL13 . 33 43 N
P3 A PL13 PL21 1 6 N
P3 A PL13 PL22 3 6 Y
P3 A PL13 PL23 1 6 N
;


Output.PNG
Respected Advisor
Posts: 3,893

Re: SAS Code fix

[ Edited ]

Below should work:

proc sql feedback;
  create table p1_sum as
    select pl1, sum(sum(qty1,qty2)) as p_sum
    from test
    where status='Y'
    group by pl1
    ;
  create table p2_sum as
    select pl2, sum(sum(qty1,qty2)) as p_sum
    from test
    where status='Y'
    group by pl2
    ;
  create table want as
    select t.*, coalesce(p1.p_sum, p2.p_sum) as p_sum
    from
      test t
      left join p1_sum p1
        on missing(t.pl2) and t.pl1=p1.pl1
      left join p2_sum p2
        on missing(t.pl3) and t.pl2=p2.pl2
    ;
quit;

 

or as a variation of above:

proc means data=test(where=(status='Y')) noprint;
  class pl1 pl2;
  ways 1 2;
  var qty1 qty2;
  output out=inter (where=(_type_ in (2,3))) sum=hugo1 hugo2;
run;quit;


proc sql feedback;
  create table want as
    select t.*, sum(p.hugo1,p.hugo2) as p_sum
    from
      test t
      left join inter p
        on t.pl1=p.pl1 and t.pl2=p.pl2 and missing(t.pl3)
    ;
quit;
Trusted Advisor
Posts: 1,381

Re: SAS Code fix

When I need to solve a compound or complicated situation, I prefer to split it into few parts.

 

Let start with:

 

data tmp1 tmp2 tmp3;

input P_Level  $   PL1  $ PL2 $ PL3 $   QTY1   QTY2   STATUS  $;

select (P_level);

    when ("P1") output tmp1;

    when ("P2") output tmp2;

    when ("P3") output tmp3;

end;
cards;

  .....  /* your input is here */

;

run;

 

Now you can deal with each P_level seperately:

 

data sum_P1;

  set tmp1 end=eof;

        retain sum 0;

         if status = 'Y' then sum = sum(of sum qty1 qty2);       

        if eof then output;

        keep P_level sum;

RUN;

 

For levels P2 and P3 I didn't understand what you need to do,

neither for P1 with status = 'N'.

 

 

Occasional Contributor
Posts: 14

Re: SAS Code fix

Thanks for the response shmuel.

Sorry for not describing my requirement clear.

1. For level P1, it should get the sum from all the below levels where ever status is Y (Irrespective of the levels), so in the given example there is a Y for P3 level, so it should pick the sum from all 'Y'.

2. For Level P2, it should pick the value from Level P3 and Status is 'Y'

 

Please see the ouput attached for your reference


Output.PNG
Super User
Posts: 17,840

Re: SAS Code fix

How do you know which records to include? 

 

Please detail ail your logic in as much detail as possible. 

Occasional Contributor
Posts: 14

Re: SAS Code fix

Hello Reeza,

 

1. For level P1, it should get the sum from all the below levels where ever status is Y (Irrespective of the levels), so in the given example there is a Y for P3 level, so it should pick the sum from all 'Y'.

2. For Level P2, it should pick the value from Level P3 and Status is 'Y'

 

Thanks

Trusted Advisor
Posts: 1,381

Re: SAS Code fix

What do mean by "below level"?  Are P2 and P3 below level P1 ?

 

For level P1 would you sum all QTY1 and QTY2 from all levels where status = "Y" ?

 

For level P2 is summing QTY1 and QTY2 from level P3 only where status = 'Y"  or

sum from both P2 and P3 levels ?

 

 

Occasional Contributor
Posts: 14

Re: SAS Code fix

Hi,

 

What do mean by "below level"?  Are P2 and P3 below level P1  ANs: Yes

 

For level P1 would you sum all QTY1 and QTY2 from all levels where status = "Y" ? ANs : Yes

 

For level P2 is summing QTY1 and QTY2 from level P3 only where status = 'Y"  or

sum from both P2 and P3 levels ? ANs: level P3 only where status = 'Y" .

Trusted Advisor
Posts: 1,381

Re: SAS Code fix

Concerning your answer and output you expected and uploaded,

You may try the code attached.

 

I assumed that every group of PL2 value is preceded by P_level = 'P2'

which seems to be not correct. You miss 'P2' preceding PL13.

I'm leaving this one to you to resolve using RETAIN previous PL2 and testing for change;

Attachment
Super User
Posts: 9,681

Re: How to create a custom aggregation

Assuming I understand what you mean:





data test;
input P_Level$ PL1$ PL2$ PL3$ QTY1 QTY2 STATUS$;
cards;
P1 A . . 3 4 N
P2 A PL11 . 10 20 N
P3 A PL11 PL21 1 2 N
P3 A PL11 PL22 3 4 Y
P3 A PL11 PL23 1 3 N
P2 A PL12 . 30 40 N
P3 A PL12 PL21 1 3 N
P3 A PL12 PL22 3 3 Y
P3 A PL12 PL23 1 5 N
P2 A PL13 . 33 43 N
P3 A PL13 PL21 1 6 N
P3 A PL13 PL22 3 6 Y
P3 A PL13 PL23 1 6 N
;
run;
data test;
 set test;
 if p_level in ('P1' 'P2') then group+1;
run;
proc sql;
create table want as
 select *,case 
 when(p_level='P1') then 
 (select sum(sum(qty1,qty2)) from test where status='Y')
 when(p_level='P2') then 
 (select sum(sum(qty1,qty2)) from test where group=a.group and status='Y')
 else . end as sum
  from test as a;
quit;

Ask a Question
Discussion stats
  • 9 replies
  • 411 views
  • 0 likes
  • 5 in conversation