turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- Base SAS Programming
- /
- delete row and sum the value of one of its column

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

03-02-2017 09:58 PM

Hi All ,

I have following dummy dataset.

Number | Amount | Sequence |

1111 | 100 | A |

1111 | 200 | A |

1111 | 300 | C |

1111 | 100 | B |

1112 | 200 | A |

1112 | 100 | C |

1112 | 200 | A |

1113 | 300 | C |

1113 | 300 | A |

1114 | 100 | A |

1114 | 100 | B |

1114 | 200 | B |

1114 | 200 | C |

1115 | 300 | C |

1116 | 100 | A |

1116 | 200 | B |

1116 | 300 | C |

I want to delete all the rows with sequence=C but I want to keep its amount, so I am adding that amount to next observation of same group. In the cases where the row with sequence=C is last in the group then I want to add its amount to the preceding row. Something like below:

Number | Amount | Sequence |

1111 | 100 | A |

1111 | 200 | A |

1111 | 400 | B |

1112 | 200 | A |

1112 | 300 | A |

1113 | 600 | A |

1114 | 100 | A |

1114 | 100 | B |

1114 | 400 | B |

1116 | 100 | A |

1116 | 500 | B |

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to deega

03-02-2017 10:35 PM

The following method could be used to accomplish what you're looking for:

data want (drop=_:); do until (last.Number); set have; by Number; if first.Number then _Ctotal=0; if Sequence eq 'C' then _Ctotal+Amount; end; do until (last.Number); set have (where=(Sequence ne 'C')); by Number; if last.Number then Amount+_Ctotal; output; end; run;

HTH,

Art, CEO, AnalystFinder.com

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to deega

03-02-2017 11:49 PM

```
data have;
infile cards expandtabs truncover;
input Number Amount Sequence $;
n+1;
cards;
1111 100 A
1111 200 A
1111 300 C
1111 100 B
1112 200 A
1112 100 C
1112 200 A
1113 300 C
1113 300 A
1114 100 A
1114 100 B
1114 200 B
1114 200 C
1115 300 C
1116 100 A
1116 200 B
1116 300 C
;
run;
proc sort data=have;
by descending n;
run;
data temp;
set have;
by number notsorted;
if first.number or sequence ne 'C' then group+1;
run;
proc sort data=temp;
by n;
run;
data temp1;
sum=0;
do until(last.group);
set temp;
by group notsorted;
sum+amount;
end;
drop amount n group;
run;
data temp2;
set temp1;
by number notsorted;
if first.number or sequence ne 'C' then group+1;
run;
data want;
set temp2;
by group notsorted;
length seq $ 40;
retain seq;
if first.group then do;seq=sequence;amount=0;end;
amount+sum;
if last.group and seq ne 'C' then output;
drop sequence sum group;
run;
```

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to deega

03-02-2017 11:50 PM

The following program can handle cases where *sequence*="C" can appear anywhere and be multiple :

```
data want;
length number amount 8 sequence $1;
carry = 0;
do until(last.number);
set have(rename=(amount=a sequence=s)); by number;
if s = "C" then carry = carry + a;
else do;
if not missing(sequence) then output;
amount = a + carry;
sequence = s;
carry = 0;
end;
end;
if not missing(sequence) then do;
amount = amount + carry;
output;
end;
keep number amount sequence;
run;
```

PG

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to deega

03-03-2017 01:12 AM

I think this program addresses number groups which have multiple C's scattered throughout.

Like Art's program it does a double "do until (last.number)". In my case, the first do group examines how the number group ends. If it ends with a non-C, then **ending_ctotal**=0. Otherwise ending_ctotal gets the total amount from the run of C records that finish the number group. And **final_nonC** is the relative position of the last non-C record. This will be used in the second do group, to identify the record which will absorb the ending_ctotal amount.

```
data want (keep=number amount sequence);
do N_Seq=1 by 1 until (last.number);
set have;
by number;
if sequence^='C' then do;
ending_ctotal=0;
final_nonC=N_Seq;
end;
else ending_ctotal=sum(ending_ctotal,amount);
end;
do N=1 to N_Seq;
set have;
if sequence='C' then prior_ctotal=sum(prior_ctotal,amount);
else do;
amount=sum(amount,prior_ctotal);
prior_ctotal=0;
if N=final_nonC then amount=sum(amount,ending_ctotal);
output;
end;
end;
run;
```