Hi All ,
I have following dummy dataset.
Number | Amount | Sequence |
1111 | 100 | A |
1111 | 200 | A |
1111 | 300 | C |
1111 | 100 | B |
1112 | 200 | A |
1112 | 100 | C |
1112 | 200 | A |
1113 | 300 | C |
1113 | 300 | A |
1114 | 100 | A |
1114 | 100 | B |
1114 | 200 | B |
1114 | 200 | C |
1115 | 300 | C |
1116 | 100 | A |
1116 | 200 | B |
1116 | 300 | C |
I want to delete all the rows with sequence=C but I want to keep its amount, so I am adding that amount to next observation of same group. In the cases where the row with sequence=C is last in the group then I want to add its amount to the preceding row. Something like below:
Number | Amount | Sequence |
1111 | 100 | A |
1111 | 200 | A |
1111 | 400 | B |
1112 | 200 | A |
1112 | 300 | A |
1113 | 600 | A |
1114 | 100 | A |
1114 | 100 | B |
1114 | 400 | B |
1116 | 100 | A |
1116 | 500 | B |
The following method could be used to accomplish what you're looking for:
data want (drop=_:); do until (last.Number); set have; by Number; if first.Number then _Ctotal=0; if Sequence eq 'C' then _Ctotal+Amount; end; do until (last.Number); set have (where=(Sequence ne 'C')); by Number; if last.Number then Amount+_Ctotal; output; end; run;
HTH,
Art, CEO, AnalystFinder.com
data have;
infile cards expandtabs truncover;
input Number Amount Sequence $;
n+1;
cards;
1111 100 A
1111 200 A
1111 300 C
1111 100 B
1112 200 A
1112 100 C
1112 200 A
1113 300 C
1113 300 A
1114 100 A
1114 100 B
1114 200 B
1114 200 C
1115 300 C
1116 100 A
1116 200 B
1116 300 C
;
run;
proc sort data=have;
by descending n;
run;
data temp;
set have;
by number notsorted;
if first.number or sequence ne 'C' then group+1;
run;
proc sort data=temp;
by n;
run;
data temp1;
sum=0;
do until(last.group);
set temp;
by group notsorted;
sum+amount;
end;
drop amount n group;
run;
data temp2;
set temp1;
by number notsorted;
if first.number or sequence ne 'C' then group+1;
run;
data want;
set temp2;
by group notsorted;
length seq $ 40;
retain seq;
if first.group then do;seq=sequence;amount=0;end;
amount+sum;
if last.group and seq ne 'C' then output;
drop sequence sum group;
run;
The following program can handle cases where sequence="C" can appear anywhere and be multiple :
data want;
length number amount 8 sequence $1;
carry = 0;
do until(last.number);
set have(rename=(amount=a sequence=s)); by number;
if s = "C" then carry = carry + a;
else do;
if not missing(sequence) then output;
amount = a + carry;
sequence = s;
carry = 0;
end;
end;
if not missing(sequence) then do;
amount = amount + carry;
output;
end;
keep number amount sequence;
run;
I think this program addresses number groups which have multiple C's scattered throughout.
Like Art's program it does a double "do until (last.number)". In my case, the first do group examines how the number group ends. If it ends with a non-C, then ending_ctotal=0. Otherwise ending_ctotal gets the total amount from the run of C records that finish the number group. And final_nonC is the relative position of the last non-C record. This will be used in the second do group, to identify the record which will absorb the ending_ctotal amount.
data want (keep=number amount sequence);
do N_Seq=1 by 1 until (last.number);
set have;
by number;
if sequence^='C' then do;
ending_ctotal=0;
final_nonC=N_Seq;
end;
else ending_ctotal=sum(ending_ctotal,amount);
end;
do N=1 to N_Seq;
set have;
if sequence='C' then prior_ctotal=sum(prior_ctotal,amount);
else do;
amount=sum(amount,prior_ctotal);
prior_ctotal=0;
if N=final_nonC then amount=sum(amount,ending_ctotal);
output;
end;
end;
run;
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.