Dear Sirs/Madams,
I have a dataset with default rates. I have IDs for different companies, time, exposures, etc. Over time the same company can default twice, say in 200703 and 200910. If the company defaults, I assign a default_flag=1. If not = 0. But I need to create such a dataset that if the same company defaults twice, I only assign default_flag=1 to the latest date, say 200910. How to do it?
This is the code that I wrote:
data data.downgrade1;
set data.default;
if BTM_Borrower_RTG >='083' & lag_RG <'083' & Exposure > 0 & lag_RG NE . then
default_flag=1;
run;
data data.downgrade2;
set data.downgrade1;
by Instrument_ID;
if last.Instrument_ID;
where default_flag=1;
run;
But in my code I can only see the companies with default_flag=1 in downgrade2 but I also want to see all other companies. How can I do it?
Thank you for your help.
Please try this untested code.
First we need to sort the defaults datasets by company and date. Then try the below code
data data.downgrade1;
set data.default;
by company date;
if first.company=0 and last.company=1 then default_flag=1;
run;
Thank you, Jag, for your reply. But if I also need to create something like this. I need to get the output highlihted in yellow. Let me explain how it works.
Assume we have companies. Some default (say, their ratings is 8 and then they never get better). So we set the default date when their ratings drop to 8 and then we don't care about any other observations (replace 0 with .). But if some other companies after default get better (say their ratings was 8 and then jumped up to 2 and then after a while back to 9 again) we want to report the latest default date 9 and set the dummy equal to 1 which corresponds to that date.
How will then my code look like if I currently have code that produces code in column D but I want to make column in E.
Dear Jag,
It seems that I have some problem with this code:
ERROR: BY variables are not properly sorted on data set DATA.DOWNGRADE1.
Dear Jag,
I tried to run the code but it does not produce what I tried to do. But it ran eventually when I sorted the variables. Let me try to explain again. In my dataset I assign default_flag=1 to the companies the have ratings higher than 8-3. If the company has a lower rating, then default_flag=0. But if the company defaulted (in 200703) but then got better but then defaulted again (200909) then I'd report only one default flag for 200909. So I want to report the latest default date when the default flag =1. How to do it? Thank you very much for your help.
Dear Sirs/Madams,
Assume we have companies. Some default (say, their ratings is 8 and then they never get better). So we set the default date when their ratings drop to 8 and then we don't care about any other observations (replace 0 with .). But if some other companies after default get better (say their ratings was 8 and then jumped up to 2 and then after a while back to 9 again) we want to report the latest default date 9 and set the dummy equal to 1 which corresponds to that date.
How will then my code look like if I currently have code that produces code in column D but I want to make column in E.
To get some good answers, post some test data, in the form of a datastep - the below topic will help - and what the output should look like. I am afraid I am not downloading any Excel files.
data have;
infile cards expandtabs truncover;
input TIME_ID ID RTG Default;
cards;
200501 1 1 .
200502 1 2 .
200503 1 8 1
200504 1 9 .
200501 2 1 .
200502 2 8 .
200503 2 2 .
200504 2 9 1
;
run;
data want;
set have;
by id;
retain found 0;
if first.id then found=0;
if not found then want=0;
if Default=1 then do;
found=1;
want=1;
end;
drop found;
run;
Dear Xia,
Your code is excellent. Thank you very much for your help. I tried to run it and it works but I have a very little nuance which I probably did not emphasize. assume I have one company that defaulted twice, in 200703 and 200910, so the default_flag=1 in both cases. Can we somehow have all 0 for 'want' variable priot to the latest default date (200910) and nothing after (.). So that for each company the variable 'want' will have zero up until 200909, one for 200910 and . after 200910. How will this code be modified?
Here's one way, find the desired date and merge back
data have;
infile cards expandtabs truncover;
input TIME_ID ID RTG Default;
cards;
200501 1 1 .
200502 1 2 .
200503 1 8 1
200504 1 9 .
200501 2 1 .
200502 2 8 1
200503 2 2 .
200504 2 9 1
;
run;
proc summary data=have nway;
where default=1;
class id;
var time_id;
output out=lastdef (drop=_:) max=;
run;
data want;
merge
have (drop=default)
lastdef (in=indef)
;
by id time_id;
if indef then default=1;
run;
The HAVE set needs to be sorted by id time_id if it actually isn't already.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.