Hi,
Need help to determine the first date when the interest rate changed for an account, The output must have the first date when the interest rate changed, account number and the corresponding interest rate. Below is a dummy data.
Date | Account Number | InterestRate |
1-Sep-18 | 30001234 | 4.29% |
2-Sep-18 | 30001234 | 4.29% |
3-Sep-18 | 30001234 | 4.29% |
4-Sep-18 | 30001234 | 4.29% |
5-Sep-18 | 30001234 | 4.29% |
6-Sep-18 | 30001234 | 4.29% |
7-Sep-18 | 30001234 | 4.29% |
8-Sep-18 | 30001234 | 4.29% |
9-Sep-18 | 30001234 | 4.29% |
10-Sep-18 | 30001234 | 5.00% |
11-Sep-18 | 30001234 | 5.00% |
12-Sep-18 | 30001234 | 5.00% |
13-Sep-18 | 30001234 | 5.00% |
14-Sep-18 | 30001234 | 5.00% |
1-Sep-18 | 30005678 | 3.60% |
2-Sep-18 | 30005678 | 3.60% |
3-Sep-18 | 30005678 | 3.60% |
4-Sep-18 | 30005678 | 3.60% |
5-Sep-18 | 30005678 | 3.60% |
6-Sep-18 | 30005678 | 3.60% |
7-Sep-18 | 30005678 | 3.60% |
8-Sep-18 | 30005678 | 3.60% |
9-Sep-18 | 30005678 | 3.60% |
10-Sep-18 | 30005678 | 4.30% |
11-Sep-18 | 30005678 | 4.30% |
12-Sep-18 | 30005678 | 4.30% |
13-Sep-18 | 30005678 | 4.80% |
14-Sep-18 | 30005678 | 4.80% |
Value your help.
Kind regards,
Mags
Something like this?
data have;
input Date :date11. AccountNumber InterestRate :percent.;
format Date date11. InterestRate percent8.2;
datalines;
1-Sep-18 30001234 4.29%
2-Sep-18 30001234 4.29%
3-Sep-18 30001234 4.29%
4-Sep-18 30001234 4.29%
5-Sep-18 30001234 4.29%
6-Sep-18 30001234 4.29%
7-Sep-18 30001234 4.29%
8-Sep-18 30001234 4.29%
9-Sep-18 30001234 4.29%
10-Sep-18 30001234 5.00%
11-Sep-18 30001234 5.00%
12-Sep-18 30001234 5.00%
13-Sep-18 30001234 5.00%
14-Sep-18 30001234 5.00%
1-Sep-18 30005678 3.60%
2-Sep-18 30005678 3.60%
3-Sep-18 30005678 3.60%
4-Sep-18 30005678 3.60%
5-Sep-18 30005678 3.60%
6-Sep-18 30005678 3.60%
7-Sep-18 30005678 3.60%
8-Sep-18 30005678 3.60%
9-Sep-18 30005678 3.60%
10-Sep-18 30005678 4.30%
11-Sep-18 30005678 4.30%
12-Sep-18 30005678 4.30%
13-Sep-18 30005678 4.80%
14-Sep-18 30005678 4.80%
;
data want;
set have;
by AccountNumber;
lagRate=lag1(InterestRate);
if first.AccountNumber then lagRate=.;
if lagRate ne InterestRate & lagRate ne .;
run;
@Timbim My reply is very very much similar to @Astounding + I have used the data have as typed by @PeterClemmensen
Even though independently arrived at the solution, the idea is very much similar to the post already written by Astounding. I have assumed that the first occurrence of interest rate also qualifies for a change. in interest-rate
The code is as below.
DATA Want(DROP=FlagRaised);
SET have;
RETAIN FlagRaised 0;
BY AccountNumber InterestRate;
IF FIRST.AccountNumber THEN FlagRaised = 1;
IF LAST.AccountNumber THEN FlagRaised = 0;
IF FIRST.InterestRate AND FlagRaised = 1 THEN OUTPUT;
RUN;
PROC PRINT DATA=Want;
RUN;
Not sure what you think that code is doing but it looks like it is just doing this logic:
data want;
set have;
by AccountNumber InterestRate;
if FIRST.InterestRate AND not last.accountnumber THEN OUTPUT;
RUN;
Not sure I have ever seen a problem where that is the right answer.
Yes Tom .. both the codes are identical in terms of output ...However, my code is not as concise as you. Thank you.
Why do you want to exclude the last group of an account if it happens to only consist of one record?
@Tom It is nice of you to bring this up.
Please allow me to digress a bit (it kept me thinking for a bit last night). If there is only one record, does SAS treat it as a FIRST instance or treat it as a LAST instance? I mean if I check for FIRST.variable and LAST.variable for a single record, which logical check will become true? Perhaps a simple check in SAS will answer that but then I moved onto the next topic.
So I don't know how SAS will treat a single group instance (as FIRST or as LAST)!!
Why do you want to exclude the last group of an account if it happens to only consist of one record?
I certainly do not wish to exclude that. I missed to take into account that condition.
Yes Thank you Tom! It makes sense.
I was typing/testing the two conditions as well (something I should have done last night).
Thank you again.
So now it sounds like you just want to do this:
data want;
set have;
by AccountNumber InterestRate;
if FIRST.InterestRate THEN OUTPUT;
RUN;
So the first observation for each value of INTERESTRATE within the current value of ACCOUNTNUMBER is kept.
Your code (below) is different
data want;
set have;
by AccountNumber InterestRate;
if FIRST.InterestRate THEN OUTPUT;
RUN;
Here the check for the AccountNumber is gone and InterestRate will be picked up, no matter what the account number is. As a special case where one account number has only one entry then in such a case, there should not be any rate change (because it was a single entry). For example let us assume I add one line in the dataset have as below.
data have;
input Date :date11. AccountNumber InterestRate :percent.;
format Date date11. InterestRate percent8.2;
datalines;
1-Sep-18 30001233 4.29%
1-Sep-18 30001234 4.29%
2-Sep-18 30001234 4.29%
3-Sep-18 30001234 4.29%
4-Sep-18 30001234 4.29%
5-Sep-18 30001234 4.29%
6-Sep-18 30001234 4.29%
7-Sep-18 30001234 4.29%
8-Sep-18 30001234 4.29%
9-Sep-18 30001234 4.29%
10-Sep-18 30001234 5.00%
11-Sep-18 30001234 5.00%
12-Sep-18 30001234 5.00%
13-Sep-18 30001234 5.00%
14-Sep-18 30001234 5.00%
1-Sep-18 30005678 3.60%
2-Sep-18 30005678 3.60%
3-Sep-18 30005678 3.60%
4-Sep-18 30005678 3.60%
5-Sep-18 30005678 3.60%
6-Sep-18 30005678 3.60%
7-Sep-18 30005678 3.60%
8-Sep-18 30005678 3.60%
9-Sep-18 30005678 3.60%
10-Sep-18 30005678 4.30%
11-Sep-18 30005678 4.30%
12-Sep-18 30005678 4.30%
13-Sep-18 30005678 4.80%
14-Sep-18 30005678 4.80%
;
RUN;
Now the code that I wrote will not pick the first line but the code that you mentioned will pick the first example.
I am pasting the output of your code followed by my code.
The FlagChecked (=1) is just a variable to remember that first AccountNumber has been encountered but not the last AccountNumber has been encountered yet. If last AccountNumber has been encountered it will be set to 0.
At this point I have no idea what the purpose of the exercise is. If you don't want to keep the first RATE for each account then don't.
data want;
set have;
by AccountNumber InterestRate;
if FIRST.InterestRate and not first.accountnumber THEN OUTPUT;
RUN;
If you don't want to keep the last RATE for each account then do it this way.
data want;
set have;
by AccountNumber InterestRate;
if last.InterestRate and not last.accountnumber THEN OUTPUT;
RUN;
Which you want probably depends on how the values of the other variables differ between the first and last for a given rate.
Note that this will eliminate accounts that only have one rate.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.