I have the following data set, which is sorted by ID, Time and Default. The variable Default takes two values: 1 (default) and 0 (non-default).
data test;
infile datalines;
input ID Time$ Default;
datalines;
1 Jan13 0
1 Feb13 0
1 Mar13 1
2 Jan13 1
2 Feb13 0
2 Mar13 0
3 Jan13 1
3 Feb13 1
3 Mar13 0
3 Apr13 1
4 Jan13 0
4 Feb13 0
4 Mar13 0
;
run;
The task is to find, for each ID, the number of default occurrences and the time difference between defaults. A default occurrence is when a customer goes into default, then pays back in time, and then defaults again. For example, ID 1 has one default event with a duration of 3 months. Similarly, IDs 2 and 3 have default events with durations of 3 and 4 months. ID 4 has no default event.
I want to achieve this in SAS and have tried using RETAIN, FIRST. and LAST. but can't get anywhere. If anyone can point me in the right direction, it would be really useful.
Many thanks,
Please post what you've tried so far.
This is not the final answer, but it might be a start for you. I'm sure that there are more elegant methods.
From your description, it sounds like you want to record the "number of months in default, inclusive of the month when the bill is finally paid." This code doesn't do that; it just keeps a running tally of the duration of the most recent default event. As @ballardw points out, you need native date values to calculate reliable durations, because we should not assume that there is an entry for every month for every ID. With a proper date value, we can use the INTCK function to compute the number of months between two records.
data test;
infile datalines;
length ID 8 time 8 Default 8;
/* read MONYY values (e.g. Jan13) as numeric date values */
informat time monyy.;
format time monyy.;
input ID Time Default;
datalines;
1 Jan13 0
1 Feb13 0
1 Mar13 1
2 Jan13 1
2 Feb13 0
2 Mar13 0
3 Jan13 1
3 Feb13 1
3 Mar13 0
3 Apr13 1
4 Jan13 0
4 Feb13 0
4 Mar13 0
;
run;
data defaults;
  set test;
  length DefaultEventDuration 8 LastEvent 8;
  format LastEvent monyy.;
  by ID time;
  retain DefaultEventCount 0 InDefault 0 LastEvent .;

  /* init the state for this ID */
  if first.ID then do;
    DefaultEventDuration = 0;
    DefaultEventCount = Default;
    InDefault = Default;
    LastEvent = ifn(Default, time, .);
  end;

  /* If ID is now in default but wasn't before, increment event count */
  /* and retain event start time */
  if Default and ^InDefault then do;
    DefaultEventCount+1;
    LastEvent = time;
  end;

  /* track whether In Default right now */
  InDefault = Default;

  /* If in Default, then keep running duration of months, including current month */
  if InDefault then
    DefaultEventDuration = intck('month', LastEvent, time) + 1;
  else DefaultEventDuration = 0;

  drop InDefault;
run;
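If only the per-ID totals are needed, the running count can be collapsed to one row per ID. This is a minimal sketch that assumes the DEFAULTS data set from the step above, still sorted by ID and time; the output name default_summary is just an illustrative choice.

```sas
/* Sketch: one row per ID holding the total number of default events.   */
/* Assumes DEFAULTS was created by the step above and is sorted by ID.  */
/* The data set name default_summary is hypothetical.                   */
data default_summary;
  set defaults;
  by ID;
  /* DefaultEventCount is a running total, so the last row of each ID */
  /* carries the final count for that ID                              */
  if last.ID;
  keep ID DefaultEventCount;
run;
```

With the sample data, tracing the logic by hand gives counts of 1, 1, 2 and 0 for IDs 1 to 4.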
In future, it's helpful if you share what you tried so that we can have a better understanding of the business problem you're trying to solve.
Hi Chris,
Thank you so much for the detailed explanation. I have taken your code on board and am experimenting with it. Thank you so much for your help!!
Thanks to all for your responses. I have stumbled across another problem:
data test_new;
infile datalines;
length ID 8 time 8 Default 8;
/* read MONYY values (e.g. Jan13) as numeric date values */
informat time monyy.;
format time monyy.;
input ID Time Default;
datalines;
1 Jan13 0
1 Feb13 0
1 Mar13 1
1 Apr13 1
1 May13 0
1 Jun13 0
1 Jul13 1
1 Aug13 1
1 Sep13 0
1 Oct13 1
1 Nov13 0
1 Dec13 0
1 Jan14 0
1 Feb14 1
1 Mar14 0
;
run;
One of the tasks is to find the time between defaults. For example, between Apr13 and Jul13 there are 2 non-default months, so the non-default duration is 2 months.
ID | time | Default | Time between defaults |
1 | Jan2013 | 0 | |
1 | Feb2013 | 0 | |
1 | Mar2013 | 1 | |
1 | Apr2013 | 1 | |
1 | May2013 | 0 | 2 months |
1 | Jun2013 | 0 | |
1 | Jul2013 | 1 | |
1 | Aug2013 | 1 | |
1 | Sep2013 | 0 | 1 month |
1 | Oct2013 | 1 | |
1 | Nov2013 | 0 | |
1 | Dec2013 | 0 | 3 months |
1 | Jan2014 | 0 | |
1 | Feb2014 | 1 | |
1 | Mar2014 | 0 | |
I have tried the following code but am not getting anywhere with it. Thanks in advance!!
data time_new;
set test_new;
format start_date monyy.;
format end_date monyy.;
previous_default=lag(default);
/*previous_month=lag(Time);*/
if default=1 and previous_default =0 then start_date=Time;
else if default=0 and previous_default =1 then end_date=time;
if default=1 and previous_default=1 then start_date=Time;
else if default=0 and previous_default =0 then end_date=Time;
if default=1 and previous_default=1 then start_date=Time;
else if default=0 and previous_default =1 then end_date=Time;
run;
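One possible approach, offered as a rough sketch rather than a finished answer: remember the month of the most recent default with RETAIN, and when the next default arrives, use INTCK to count the non-default months in between. The names gaps, LastDefault and MonthsBetween are made up for illustration, and the step assumes test_new is sorted by ID and time with time stored as a SAS date.

```sas
/* Sketch: one output row per gap between two default periods.          */
/* Assumes TEST_NEW is sorted by ID and time, and time is a SAS date.   */
/* gaps, LastDefault and MonthsBetween are hypothetical names.          */
data gaps;
  set test_new;
  by ID time;
  retain LastDefault;
  format LastDefault monyy.;

  /* forget the previous ID's history */
  if first.ID then LastDefault = .;

  if Default = 1 then do;
    if not missing(LastDefault) then do;
      /* months strictly between the previous default and this one */
      MonthsBetween = intck('month', LastDefault, time) - 1;
      /* consecutive default months give 0, which is not a gap */
      if MonthsBetween > 0 then output;
    end;
    /* this month becomes the most recent default */
    LastDefault = time;
  end;

  keep ID LastDefault time MonthsBetween;
run;
```

Each output row holds the last default month before the gap (LastDefault), the month of the next default (time), and the number of non-default months between them. For the sample data this works out to gaps of 2, 1 and 3 months. Note that the layout differs from the table above: the result is one row per gap rather than a value attached to a non-default row.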
It is pretty hard to do anything around time difference with character date values.
Start by reading your date values as SAS dates. Use the monyy. informat. This will facilitate the use of the date functions.
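As a small illustration of why this helps, the sketch below converts two of the character values from the question into SAS dates and counts the month boundaries between them. The variable names start, stop and months are illustrative only.

```sas
/* Sketch: convert MONYY text to SAS dates, then count months with INTCK. */
/* start, stop and months are hypothetical variable names.                */
data _null_;
  start  = input('Jan13', monyy5.);   /* stored as 01JAN2013 */
  stop   = input('Apr13', monyy5.);   /* stored as 01APR2013 */
  /* number of month boundaries crossed between the two dates */
  months = intck('month', start, stop);
  put start= monyy5. stop= monyy5. months=;
run;
```

INTCK counts boundaries crossed, so Jan13 to Apr13 gives 3. Add 1 if the count should include both the first and the last month.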