Hello,
I have a data set that has each instances of payments made by account_id.
Here is the sample data.
data test;
infile datalines;
input id $ payment date yymmdd10.;
format date yymmdd10.;
datalines;
1 20 2010-01-31
1 20 2010-02-28
1 40 2010-04-28
1 50 2010-07-02
2 40 2012-01-31
2 50 2012-02-28
2 10 2012-03-28
3 100 2013-02-20
3 200 2014-02-01
4 400 2011-01-31
5 50 2000-01-31
5 30 2001-02-28
6 10 2001-02-02
6 10 2001-03-03
6 10 2001-04-02
6 10 2001-05-03
;
run;
Now I need to keep the rows for each id, until the payment(cumulative payment) is greater than 70. So the final output will look like as follows.
1 20 2010-01-31
1 20 2010-02-28
1 40 2010-04-28
2 40 2012-01-31
2 50 2012-02-28
3 100 2013-02-20
4 400 2011-01-31
5 50 2000-01-31
5 30 2001-02-28
6 10 2001-02-02
6 10 2001-03-03
6 10 2001-04-02
6 10 2001-05-03
Thank you for the help in advance.
PS
I think a very slight modification of Peter's code will suffice. I.e.,
data want (drop=cum);
do until ( last.id ) ;
set test;
by id ;
if cum <= 70 then output ;
Cum + payment ;
end ;
Cum=0;
run ;
data try ;
do until ( last.id ) ;
set your.data ;
by id ;
Cum + payment ;
if cum <= 70 then output ;
End ;
Cum=0;
run ;
Thanks Peter for the suggestion.
There is a small issue here. I need to keep the rows until the cumulative amount for each id is OVER 70.
Further more, since id 3 and 4 are already over 70 in the first payment, this code does not output them. But I need to capture those as well.
Please let me know if you have suggestions on modifying this code. Thanks again.
Here is the output of your code.
id payment date cum
1 20 2010-01-31 20
1 20 2010-02-28 40
2 40 2012-01-31 40
5 50 2000-01-31 50
6 10 2001-02-02 10
6 10 2001-03-03 20
6 10 2001-04-02 30
6 10 2001-05-03 40
I think a very slight modification of Peter's code will suffice. I.e.,
data want (drop=cum);
do until ( last.id ) ;
set test;
by id ;
if cum <= 70 then output ;
Cum + payment ;
end ;
Cum=0;
run ;
thank you Art
(now why didn't I do that ! 😉
Hello Art;
This seems like dow loop. Could you explain your code and in which conditions dow loops are useful?
Thanks.
The code was only an extremely slight variant of Peter's code (i.e., I simply juxtaposed two lines) but, yes, the code used a DOW loop.
In this particular case, a DOW loop wasn't necessary, but didn't hurt anything either. The same solution, without a DOW loop, might have looked like:
data want (drop=cum);
set test;
by id ;
if first.id then cum=0;
if cum <= 70 then output ;
Cum + payment ;
run ;
Often, using a DOW loop is simply a matter of one's preference. IMHO, though, it is a good technique to become familiar with as it can let you do numerous calculations without having to write multiple data steps since multiple DOW loops can be nested together within one datastep.
Hello Art,
I always have trouble understanding when "if then output" statement precedes the other sas statements.
data want ;
set test;
by id ;
if first.id then cum=0;
Cum + payment ;
if last.id then output;
run ;
This is easy to understand. Where I need to put ' putlog _all_' or put _all_ ' statements to see what is going on each implicit loop (one iteration) of data step? Thanks for your input.
: You REALLY should start a new thread to ask such questions. However, in this case, I'll try to answer anyhow.
In another thread, today, I proposed a DOW loop solution. The placement and wording of putlog statements that help me to understand what the code is doing is totally dependent upon the code. Consider (and run), for example:
data have;
infile cards dlm='09'x;
informat firstdate lastdate anydtdte9.;
format firstdate lastdate date9.;
informat visit $15.;
input SUBJID VISITNUM VISIT FIRSTDATE LASTDATE;
cards;
1 20 BASELINE (RAW) 13-May-10 14-Jun-10
1 30 MONTH 1 15-Jun-10 12-Aug-10
1 40 MONTH 3 13-Aug-10 11-Nov-10
1 50 MONTH 6 12-Nov-10 12-May-11
1 70 MONTH 12 13-May-11 10-Nov-11
1 90 MONTH 18 11-Nov-11 8-May-12
1 110 MONTH 24 9-May-12 13-Nov-12
1 130 MONTH 30 14-Nov-12 21-May-13
1 150 MONTH 36 22-May-13 1-Aug-13
2 20 BASELINE (RAW) 13-May-10 14-Jun-10
2 30 MONTH 1 15-Jun-10 12-Aug-10
2 40 MONTH 3 13-Aug-10 11-Nov-10
2 50 MONTH 6 12-Nov-10 12-May-11
2 70 MONTH 12 13-May-11 10-Nov-11
2 90 MONTH 18 11-Nov-11 8-May-12
2 110 MONTH 24 9-May-12 13-Nov-12
2 130 MONTH 30 14-Nov-12 21-May-13
2 150 MONTH 36 22-May-13 1-Aug-13
;
data need;
do until (last.subjid);
set have;
by subjid;
min_firstdate=min(min_firstdate,firstdate);
max_lastdate=max(max_lastdate,lastdate);
***;
putlog "first dow loop" _all_;
***;
end;
format min_firstdate max_lastdate date9.;
do until (last.subjid);
set have;
by subjid;
***;
putlog "second dow loop" _all_;
***;
output;
end;
run;
Thank you Art and Peter for your help.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.