## How to cancel out observations that sum up to 0?

Occasional Contributor
Posts: 7

# How to cancel out observations that sum up to 0?

HI,

For days now I've been trying to clean up my data set.

Example:

OBS      Number      amount         count

1           1                45                  1

2           1                -45                 2

3           1                45                  3

4           2                 30                1

5           2                  25               2

6           2               -30                 3

7           2                25                 4

...

I was thinking of doing a do loop to start from first.number to last.number to exclude the observations that sum up to zero (obs 2, 3). Sometimes the negative amount is a sum of two previous/following numbers. I also tried deduping by squaring the amount and concatenating it with the number, doing count. However it doesn't work in all cases (obs 5,7 would be wrongly excluded too). So my required data set once cleaned would be:

OBS      Number      amount

1           1                45

5           2                  25

7           2                25

Alex

Super User
Posts: 10,784

## Re: How to cancel out observations that sum up to 0?

There are lots of logic you need to consider about .

### Code: Program

`data have;input Number   amount ;cards;     1 45    1 -45    1 45    2 30    2 25    2 -30    2 25  ;run;data temp1 temp2; set have;if amount lt 0 then do; amount=abs(amount); output temp2;end;else output temp1;run;proc sort data=temp1;by Number amount;run;proc sort data=temp2;by Number amount;run;data want;ina=0;inb=0;merge temp1(in=ina) temp2(in=inb);by Number amount;if not ina then amount=-amount;if not ina or not inb;run;`

Xia Keshan

Occasional Contributor
Posts: 7

## Re: How to cancel out observations that sum up to 0?

Hi Xia Keshan,

I do like your approach, however it doesn't work in all cases (I probably should have given different/extra examples). On few occasions I have a partial refund (in this case I'd like the refund to be deleted, the amount of the refund taken away from the closest positive amount within the number), refund which is a sum of previous payments and sum of two refunds is an amount. Oh and where I say refund I mean negative amount. Easier to explain on the below example. Is there a way of amending your code so it considers all below cases? (thanks again!!!)

I don't need to keep obs, it is here only for explanation purposes.

OBS      Number      amount         count

1           1                45                  1

2           1                -45                 2

3           1                45                  3

4           2                 30                1

5           2                  25               2

6           2               -30                 3

7           2                25                 4

8          3               50                    1

9          3               60                    2

10         3               -17                  3

11          4               100               1

12          4               100               2

13          4               100               3

14          4               -200               4

15          5               80               1

16          5               40               2

17          5               -20               3

18              5          -20               4

And I'd like the results to be:

OBS      Number      amount

1           1                45

5           2                  25

7           2                25

8          3               50                    1

9          3               43                    2      /************************************ partial refund 60-17=43************************/

11          4               100               1       /****************************** obs 12,13 cancel out obs 14***************************/

15          5               80               1        /******************************obs 16 cancel out obs 17 and 18***************************/

;

Thanks for you help!

Super User
Posts: 10,784

## Re: How to cancel out observations that sum up to 0?

### Code: Program

`data have;input Number   amount count ;cards;     1 45 1    1 -45 2  1 45 3  2 30 1  2 25 2    2 -30 3  2 25 4  3 50 1  3 60 2  3 -17 3  4 100 1  4 100 2  4 100 3  4 -200 4  5 80 1  5 40 2  5 -20 3  5 -20 4;run;data temp1 temp2; set have;if amount lt 0 then do; amount=abs(amount); output temp2;end;else output temp1;run;proc sort data=temp1;by Number amount;run;proc sort data=temp2;by Number amount;run;data x;ina=0;inb=0;merge temp1(in=ina) temp2(in=inb);by Number amount;if not ina then amount=-amount;if not ina or not inb;run;proc sort data=x;by Number descending count;run;data want; set x; by number; retain found; if first.number then do;sum=0;found=0;end; sum+amount; if not found and sum gt 0 then do;amount=sum;found=1; end; if sum gt 0; drop sum found;run;proc sort data=want;by number count;run;`
Occasional Contributor
Posts: 7

## Re: How to cancel out observations that sum up to 0?

Hi Xia Keshan,

Thanks for your help! Nearly there Although this code works for majority of cases, it still doesn't work for all of them. Especially when there are other transactions within 'number' following the refund. I was testing my data on your code (below). Also, sometimes when it does work and the 'sum' should be zero, there appears to be observation with 1.421085E-14. Is that due to formatting?

Also for number 7:

7 840.42 1

7 14.54 2

7 753.65 3

7 31.6 4

7 -768.19 5

Is there a way of it checking (within a number) for sum = 0, so I'd want

7 840.42 1

7 31.6 4

left (as 14.54+753.65-768.19=0). But if it's not going to work for every single case, then don't worry.

Thanks again!

data have;

input Number   amount count ;

cards;

1 45 1

1 -45 2

1 45 3

2 30 1

2 25 2

2 -30 3

2 25 4

3 50 1

3 60 2

3 -17 3

4 100 1

4 100 2

4 100 3

4 -200 4

5 80 1

5 35.9 2

5 188.66 3

5 -224.56 4

5 20 5

6 57.57 1

6 28.38 2

6 57.57 3

6 -85.95 4

6 28.38 5

7 840.42 1

7 14.54 2

7 753.65 3

7 31.6 4

7 -768.19 5

8 116.39 1

8 82.09 2

8 116.39 3

8 -198.48 4

8 82.09 5

8 116.39 6

8 -198.48 7

;

run;

Super User
Posts: 10,784

## Re: How to cancel out observations that sum up to 0?

Why it is

14.54+753.65-768.19=0

NOT

31.6+753.65-768.19=0

?

Occasional Contributor
Posts: 7

## Re: How to cancel out observations that sum up to 0?

Because it doesn't sum up to 0.

31.6+753.65-768.19=17.06 (not 0)

There must have been another payment before the refund

Super User
Posts: 10,784

## Re: How to cancel out observations that sum up to 0?

So what is your output ?

Output can't be ?

7 840.42 1

7 14.54 2

7 17.06 3

Do you want pick up the most likely value for that one ?

Super User
Posts: 10,784

## Re: How to cancel out observations that sum up to 0?

NO, You can't do that, unless you can enumerate all the combination of positive value .

Occasional Contributor
Posts: 7

## Re: How to cancel out observations that sum up to 0?

what I mean is, where there is a negative value (refund) I want that obs to cancel out with others. In case of obs 7

7 840.42 1

7 14.54 2

7 753.65 3

7 31.6 4

7 -768.19 5

I'd want to have this left:

7 840.42 1

7 31.6 4

But that's not as important. The code you gave me previously works perfect, apart from when there are other obs withing that number. Any ideas?

Super User
Posts: 10,784

## Re: How to cancel out observations that sum up to 0?

Idea is enumerate all the combination of position value to see if the sum of combination could refund the negative.

But That is too complicated .

First , check the one value to see if it could refund negative value.

Second, check the all the combination of two to see if its sum could refund

Third, check the all the combination of three to see if its sum could refund

...............

Do you really want this ?

Super User
Posts: 10,784

## Re: How to cancel out observations that sum up to 0?

And if all these combination could not refund to zero , what are you going to do ? Negative value minus WHAT ?

Occasional Contributor
Posts: 7

## Re: How to cancel out observations that sum up to 0?

If it wouldn't refund to zero then there'd be partial refund. That's why I though to do a loop within a number.

Super User
Posts: 10,784

## Re: How to cancel out observations that sum up to 0?

OBS      Number      amount         count

1           1                95                  1

2           1                -45                 2

3           1                45                  3

You want this , right ? You don't need to consider the order ?

OBS      Number      amount         count

1           1                95                  1

Super User
Posts: 10,784

## Re: How to cancel out observations that sum up to 0?

You want this ? You really want this ??

### Code: Program.sas

`data have;input Number   amount count ;cards;     1 45 1   1 -45 2  1 45 3  2 30 1  2 25 2   2 -30 3  2 25 4  3 50 1  3 60 2  3 -17 3  4 100 1  4 100 2  4 100 3  4 -200 4  5 80 1  5 35.9 2  5 188.66 3  5 -224.56 4  5 20 5  6 57.57 1  6 28.38 2  6 57.57 3  6 -85.95 4  6 28.38 5  7 840.42 1  7 14.54 2  7 753.65 3  7 31.6 4  7 -768.19 5  8 116.39 1  8 82.09 2  8 116.39 3  8 -198.48 4  8 82.09 5  8 116.39 6  8 -198.48 7;run;proc sort data=have;by Number descending amount;run;proc sql;select max(n) into : n from (select count(*) as n from have where amount ge 0 group by Number );quit;data temp;if _n_ eq 1 then do; declare hash ha(ordered:'a'); declare hiter hi('ha'); ha.definekey('k'); ha.definedata('Number','amount','k'); ha.definedone();end;set have;by Number;array v{&n};array value{&n} _temporary_;array key{&n} _temporary_;if first.Number then do;  n=0;  ha.clear();   call missing(of value{*},of key{*}); end;if amount ge 0 then do;  n+1;k=n;  ha.add();  value{n}=amount;  key{n}=n;end; else do;   _k=-1;   do i=1 to 2**dim(v);   rc=graycode(_k, of v{*});   sum=0;   do j=1 to dim(v);     sum+v{j}*value{j};   end;   dif=round(sum+amount,0.01);   if Number=8 then put sum= amount= dif= v{*} value{1} value{2} value{3} value{4};   if dif=0 then do;      do k=1 to dim(v);   if v{k}=1 then ha.remove(key:key{k});   end;   found=1;   leave;   end;      end;   if not found then output;   n=0;   do while(hi.next()=0);   n+1;   value{n}=amount;   key{n}=k;   end; end; if last.Number then do; do while(hi.next()=0);  output; end;end;keep Number amount;run;data want; set temp; by number; retain found; if first.number then do;sum=0;found=0;end; sum+amount; if not found and sum gt 0 then do;amount=sum;found=1; end; if sum gt 0; drop sum found;run;`

Xia Keshan

Discussion stats
• 22 replies
• 726 views
• 2 likes
• 5 in conversation