BookmarkSubscribeRSS Feed
deega
Quartz | Level 8

Hi All ,

I have following dummy dataset.

NumberAmountSequence
1111100A
1111200A
1111300C
1111100B
1112200A
1112100C
1112200A
1113300C
1113300A
1114100A
1114100B
1114200B
1114200C
1115300C
1116100A
1116200B
1116300C

 

I want to delete all the rows with  sequence=C but I want to keep its amount, so I am adding that amount to next observation of same group. In the cases where the row with sequence=C is last in the group then I want to add its amount to the preceding row.  Something like below:

NumberAmountSequence
1111100A
1111200A
1111400B
1112200A
1112300A
1113600A
1114100A
1114100B
1114400B
1116100A
1116500B
4 REPLIES 4
art297
Opal | Level 21

The following method could be used to accomplish what you're looking for:

 

data want (drop=_:);
  do until (last.Number);
    set have;
    by Number;
    if first.Number then _Ctotal=0;
    if Sequence eq 'C' then _Ctotal+Amount;
  end;
  do until (last.Number);
    set have (where=(Sequence ne 'C'));
    by Number;
    if last.Number then Amount+_Ctotal;
    output;
  end;
run;

HTH,

Art, CEO, AnalystFinder.com

Ksharp
Super User
data have;
infile cards expandtabs truncover;
input Number	Amount	Sequence $;
n+1;
cards;
1111	100	A
1111	200	A
1111	300	C
1111	100	B
1112	200	A
1112	100	C
1112	200	A
1113	300	C
1113	300	A
1114	100	A
1114	100	B
1114	200	B
1114	200	C
1115	300	C
1116	100	A
1116	200	B
1116	300	C
;
run;
proc sort data=have;
 by descending n;
run;
data temp;
 set have;
 by number notsorted;
 if first.number or sequence ne 'C' then group+1;
run;
proc sort data=temp;
 by n;
run;
data temp1;
sum=0;
 do until(last.group);
  set temp;
  by group notsorted;
  sum+amount;
 end;
drop amount n group;
run;
data temp2;
 set temp1;
 by number notsorted;
 if first.number or sequence ne 'C' then group+1;
run;
data want;
 set temp2;
 by group notsorted;
 length seq $ 40;
 retain seq;
 if first.group then do;seq=sequence;amount=0;end;
 amount+sum;
 if last.group and seq ne 'C'  then output;
 drop sequence sum group;
run;

PGStats
Opal | Level 21

The following program can handle cases where sequence="C" can appear anywhere and be multiple :

 


data want;
length number amount 8 sequence $1;
carry = 0;
do until(last.number);
    set have(rename=(amount=a sequence=s)); by number;
    if s = "C" then carry = carry + a;
    else do;
        if not missing(sequence) then output;
        amount = a + carry;
        sequence = s;
        carry = 0;
        end;
    end;
if not missing(sequence) then do;
    amount = amount + carry;
    output;
    end;
keep number amount sequence; 
run;
PG
mkeintz
PROC Star

I think this program addresses number groups which have multiple C's scattered throughout.

 

Like Art's program it does a double "do until (last.number)".  In my case, the first do group examines how the number group ends.  If it ends with a non-C, then ending_ctotal=0.  Otherwise ending_ctotal gets the total amount from the run of C records that finish the number group.   And final_nonC is the relative position of the last non-C record.  This will be used in the second do group, to identify the record which will absorb the ending_ctotal amount.

 

data want (keep=number amount sequence);
  do N_Seq=1 by 1 until (last.number);
    set have;
    by number;
    if sequence^='C' then do;
      ending_ctotal=0;
      final_nonC=N_Seq;
    end;
    else ending_ctotal=sum(ending_ctotal,amount);
  end;

  do N=1 to N_Seq;
    set have;
    if sequence='C' then prior_ctotal=sum(prior_ctotal,amount);
    else do;
      amount=sum(amount,prior_ctotal);
      prior_ctotal=0;
      if N=final_nonC then amount=sum(amount,ending_ctotal);
      output;
    end;
  end;
run;  

 

 

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 994 views
  • 1 like
  • 5 in conversation