## How to create a custom aggregation

Occasional Contributor
Posts: 14

# How to create a custom aggregation

[ Edited ]

I have a requirement like this and i need your help in solving this.

I have a dataset called test which is like below.

Here we have a column called P_Level and we have three values for this (P1, P2, and P3). SO my requirement is like this

1. For P1 level- I should get the sum of all Observations which are having status as ‘Y’. So If I assume the new variable as SUM it should be the sum of (3+4,3+3,3+6) =22
2. For P2 level- My SUM should be equal to the sum of P3 level- where my PL1 and PL2 variable values match and Status =Y.
3. Please find the screenshot of input and output for your reference.

data test;
input P_Level\$ PL1\$ PL2\$ PL3\$ QTY1 QTY2 STATUS\$;
cards;
P1 A . . 3 4 N
P2 A PL11 . 10 20 N
P3 A PL11 PL21 1 2 N
P3 A PL11 PL22 3 4 Y
P3 A PL11 PL23 1 3 N
P2 A PL12 . 30 40 N
P3 A PL12 PL21 1 3 N
P3 A PL12 PL22 3 3 Y
P3 A PL12 PL23 1 5 N
P3 A PL13 . 33 43 N
P3 A PL13 PL21 1 6 N
P3 A PL13 PL22 3 6 Y
P3 A PL13 PL23 1 6 N
;

Posts: 4,741

## Re: SAS Code fix

[ Edited ]

Below should work:

``````proc sql feedback;
create table p1_sum as
select pl1, sum(sum(qty1,qty2)) as p_sum
from test
where status='Y'
group by pl1
;
create table p2_sum as
select pl2, sum(sum(qty1,qty2)) as p_sum
from test
where status='Y'
group by pl2
;
create table want as
select t.*, coalesce(p1.p_sum, p2.p_sum) as p_sum
from
test t
left join p1_sum p1
on missing(t.pl2) and t.pl1=p1.pl1
left join p2_sum p2
on missing(t.pl3) and t.pl2=p2.pl2
;
quit;``````

or as a variation of above:

``````proc means data=test(where=(status='Y')) noprint;
class pl1 pl2;
ways 1 2;
var qty1 qty2;
output out=inter (where=(_type_ in (2,3))) sum=hugo1 hugo2;
run;quit;

proc sql feedback;
create table want as
select t.*, sum(p.hugo1,p.hugo2) as p_sum
from
test t
left join inter p
on t.pl1=p.pl1 and t.pl2=p.pl2 and missing(t.pl3)
;
quit;
``````
Posts: 1,837

## Re: SAS Code fix

When I need to solve a compound or complicated situation, I prefer to split it into few parts.

data tmp1 tmp2 tmp3;

input P_Level  \$   PL1  \$ PL2 \$ PL3 \$   QTY1   QTY2   STATUS  \$;

select (P_level);

when ("P1") output tmp1;

when ("P2") output tmp2;

when ("P3") output tmp3;

end;
cards;

.....  /* your input is here */

;

run;

Now you can deal with each P_level seperately:

data sum_P1;

set tmp1 end=eof;

retain sum 0;

if status = 'Y' then sum = sum(of sum qty1 qty2);

if eof then output;

keep P_level sum;

RUN;

For levels P2 and P3 I didn't understand what you need to do,

neither for P1 with status = 'N'.

Occasional Contributor
Posts: 14

## Re: SAS Code fix

Thanks for the response shmuel.

Sorry for not describing my requirement clear.

1. For level P1, it should get the sum from all the below levels where ever status is Y (Irrespective of the levels), so in the given example there is a Y for P3 level, so it should pick the sum from all 'Y'.

2. For Level P2, it should pick the value from Level P3 and Status is 'Y'

Super User
Posts: 23,776

## Re: SAS Code fix

How do you know which records to include?

Occasional Contributor
Posts: 14

## Re: SAS Code fix

Hello Reeza,

1. For level P1, it should get the sum from all the below levels where ever status is Y (Irrespective of the levels), so in the given example there is a Y for P3 level, so it should pick the sum from all 'Y'.

2. For Level P2, it should pick the value from Level P3 and Status is 'Y'

Thanks

Posts: 1,837

## Re: SAS Code fix

What do mean by "below level"?  Are P2 and P3 below level P1 ?

For level P1 would you sum all QTY1 and QTY2 from all levels where status = "Y" ?

For level P2 is summing QTY1 and QTY2 from level P3 only where status = 'Y"  or

sum from both P2 and P3 levels ?

Occasional Contributor
Posts: 14

## Re: SAS Code fix

Hi,

What do mean by "below level"?  Are P2 and P3 below level P1  ANs: Yes

For level P1 would you sum all QTY1 and QTY2 from all levels where status = "Y" ? ANs : Yes

For level P2 is summing QTY1 and QTY2 from level P3 only where status = 'Y"  or

sum from both P2 and P3 levels ? ANs: level P3 only where status = 'Y" .

Posts: 1,837

## Re: SAS Code fix

You may try the code attached.

I assumed that every group of PL2 value is preceded by P_level = 'P2'

which seems to be not correct. You miss 'P2' preceding PL13.

I'm leaving this one to you to resolve using RETAIN previous PL2 and testing for change;

Super User
Posts: 10,787

## Re: How to create a custom aggregation

```Assuming I understand what you mean:

data test;
input P_Level\$ PL1\$ PL2\$ PL3\$ QTY1 QTY2 STATUS\$;
cards;
P1 A . . 3 4 N
P2 A PL11 . 10 20 N
P3 A PL11 PL21 1 2 N
P3 A PL11 PL22 3 4 Y
P3 A PL11 PL23 1 3 N
P2 A PL12 . 30 40 N
P3 A PL12 PL21 1 3 N
P3 A PL12 PL22 3 3 Y
P3 A PL12 PL23 1 5 N
P2 A PL13 . 33 43 N
P3 A PL13 PL21 1 6 N
P3 A PL13 PL22 3 6 Y
P3 A PL13 PL23 1 6 N
;
run;
data test;
set test;
if p_level in ('P1' 'P2') then group+1;
run;
proc sql;
create table want as
select *,case
when(p_level='P1') then
(select sum(sum(qty1,qty2)) from test where status='Y')
when(p_level='P2') then
(select sum(sum(qty1,qty2)) from test where group=a.group and status='Y')
else . end as sum
from test as a;
quit;

```
Discussion stats
• 9 replies
• 456 views
• 0 likes
• 5 in conversation