Hi.. I have a dataset that organized by customer level and has a variable called 'balance'. I want to check, for each customer, if their balance has ever gone above 500; if it has, how many times.
I thought about using array, but not sure how I can use it in this case. Does anyone have any suggestion?
| customer | date | balance |
| A | 201804 | 200 |
| A | 201805 | 600 |
| A | 201806 | 450 |
| A | 201807 | 1000 |
| B | 201712 | 0 |
| B | 201801 | 50 |
| B | 201802 | 100 |
| customer | date | balance | ever500 | over500 |
| A | 201804 | 200 | 0 | 0 |
| A | 201805 | 600 | 1 | 1 |
| A | 201806 | 450 | 1 | 1 |
| A | 201807 | 1000 | 1 | 2 |
| B | 201712 | 0 | 0 | 0 |
| B | 201801 | 50 | 0 | 0 |
| B | 201802 | 100 | 0 | 0 |
@newboy1218 wrote:
And how can I retain the value of ever500 too?
Ups, missed that one.
data have;
input customer $ date $ balance;
datalines;
A 201804 200
A 201805 600
A 201806 450
A 201807 1000
B 201712 0
B 201801 50
B 201802 100
;
data want;
set have;
by customer;
retain ever500 over500;
if first.customer
then do;
ever500 = 0;
over500 = 0;
end;
if balance > 500 then ever500 = 1;
over500 + (balance > 500);
run;
proc print data=want noobs;
run;
Result now looks exactly like what you want:
customer date balance ever500 over500 A 201804 200 0 0 A 201805 600 1 1 A 201806 450 1 1 A 201807 1000 1 2 B 201712 0 0 0 B 201801 50 0 0 B 201802 100 0 0
Note how I presented the example data in a data step with datalines, so that the dataset can be created with a simple copy/paste and run. Please do so yourself in the future.
data want;
set have;
by customer;
retain over500;
if first.customer then over500 = 0;
ever500 = (balance > 500);
over500 + ever500;
run;
@newboy1218 wrote:
Thank you. So do I need to sort the dataset by customer and date first?
Yes.
And how can I retain the value of ever500 too?
@newboy1218 wrote:
And how can I retain the value of ever500 too?
Ups, missed that one.
data have;
input customer $ date $ balance;
datalines;
A 201804 200
A 201805 600
A 201806 450
A 201807 1000
B 201712 0
B 201801 50
B 201802 100
;
data want;
set have;
by customer;
retain ever500 over500;
if first.customer
then do;
ever500 = 0;
over500 = 0;
end;
if balance > 500 then ever500 = 1;
over500 + (balance > 500);
run;
proc print data=want noobs;
run;
Result now looks exactly like what you want:
customer date balance ever500 over500 A 201804 200 0 0 A 201805 600 1 1 A 201806 450 1 1 A 201807 1000 1 2 B 201712 0 0 0 B 201801 50 0 0 B 201802 100 0 0
Note how I presented the example data in a data step with datalines, so that the dataset can be created with a simple copy/paste and run. Please do so yourself in the future.
The way you want your output data set shaped, you need a bit of logic to ensure that:
For example (assuming that HAVE is sorted by customer):
data have ;
input customer :$1. date balance ;
cards ;
A 201804 200
A 201805 600
A 201806 450
A 201807 1000
B 201712 0
B 201801 50
B 201802 100
run ;
data want ;
do until (last.customer) ;
set have ;
by customer ;
if not ever500 then ever500 = 0 ;
if balance > 500 then ever500 = 1 ;
over500 = sum (over500, balance > 500) ;
output ;
end ;
run ;
It helps that the DATA step auto-resets EVER500 and OVER500 to missing values at the top of the implied loop, i.e. before the DO loop starts iterating.
Kind regards
Paul D.
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Still thinking about your presentation idea? The submission deadline has been extended to Friday, Nov. 14, at 11:59 p.m. ET.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.