Hi.. I have a dataset that organized by customer level and has a variable called 'balance'. I want to check, for each customer, if their balance has ever gone above 500; if it has, how many times.
I thought about using array, but not sure how I can use it in this case. Does anyone have any suggestion?
customer | date | balance |
A | 201804 | 200 |
A | 201805 | 600 |
A | 201806 | 450 |
A | 201807 | 1000 |
B | 201712 | 0 |
B | 201801 | 50 |
B | 201802 | 100 |
customer | date | balance | ever500 | over500 |
A | 201804 | 200 | 0 | 0 |
A | 201805 | 600 | 1 | 1 |
A | 201806 | 450 | 1 | 1 |
A | 201807 | 1000 | 1 | 2 |
B | 201712 | 0 | 0 | 0 |
B | 201801 | 50 | 0 | 0 |
B | 201802 | 100 | 0 | 0 |
@newboy1218 wrote:
And how can I retain the value of ever500 too?
Ups, missed that one.
data have;
input customer $ date $ balance;
datalines;
A 201804 200
A 201805 600
A 201806 450
A 201807 1000
B 201712 0
B 201801 50
B 201802 100
;
data want;
set have;
by customer;
retain ever500 over500;
if first.customer
then do;
ever500 = 0;
over500 = 0;
end;
if balance > 500 then ever500 = 1;
over500 + (balance > 500);
run;
proc print data=want noobs;
run;
Result now looks exactly like what you want:
customer date balance ever500 over500 A 201804 200 0 0 A 201805 600 1 1 A 201806 450 1 1 A 201807 1000 1 2 B 201712 0 0 0 B 201801 50 0 0 B 201802 100 0 0
Note how I presented the example data in a data step with datalines, so that the dataset can be created with a simple copy/paste and run. Please do so yourself in the future.
data want;
set have;
by customer;
retain over500;
if first.customer then over500 = 0;
ever500 = (balance > 500);
over500 + ever500;
run;
@newboy1218 wrote:
Thank you. So do I need to sort the dataset by customer and date first?
Yes.
And how can I retain the value of ever500 too?
@newboy1218 wrote:
And how can I retain the value of ever500 too?
Ups, missed that one.
data have;
input customer $ date $ balance;
datalines;
A 201804 200
A 201805 600
A 201806 450
A 201807 1000
B 201712 0
B 201801 50
B 201802 100
;
data want;
set have;
by customer;
retain ever500 over500;
if first.customer
then do;
ever500 = 0;
over500 = 0;
end;
if balance > 500 then ever500 = 1;
over500 + (balance > 500);
run;
proc print data=want noobs;
run;
Result now looks exactly like what you want:
customer date balance ever500 over500 A 201804 200 0 0 A 201805 600 1 1 A 201806 450 1 1 A 201807 1000 1 2 B 201712 0 0 0 B 201801 50 0 0 B 201802 100 0 0
Note how I presented the example data in a data step with datalines, so that the dataset can be created with a simple copy/paste and run. Please do so yourself in the future.
The way you want your output data set shaped, you need a bit of logic to ensure that:
For example (assuming that HAVE is sorted by customer):
data have ;
input customer :$1. date balance ;
cards ;
A 201804 200
A 201805 600
A 201806 450
A 201807 1000
B 201712 0
B 201801 50
B 201802 100
run ;
data want ;
do until (last.customer) ;
set have ;
by customer ;
if not ever500 then ever500 = 0 ;
if balance > 500 then ever500 = 1 ;
over500 = sum (over500, balance > 500) ;
output ;
end ;
run ;
It helps that the DATA step auto-resets EVER500 and OVER500 to missing values at the top of the implied loop, i.e. before the DO loop starts iterating.
Kind regards
Paul D.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.