BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Franci941
Calcite | Level 5
Hi! My dataset is like this one

ID DATE STATUS
1 31/01/2016 0
1 31/05/2016 1
1 28/02/2017 0
2 28/02/2018 1
2 31/05/2018 1
3 31/01/2016 1
3 31/03/2016 1
3 31/05/2016 0
4 31/01/2016 1
4 31/07/2016 0
4 28/02/2017 1
5 31/01/2016 0
5 31/05/2016 1
5 28/02/2017 1

You consider status=1 as default and status=0 'in bonis'.
For each ID, I have to calculate the days between:
- if it starts with default (status=1) and it continues always with default (like ID=2) -> calculate the days between the first observation date (28/02) and today
- if it starts with default (status=1) and it continues with in bonis (status=0) (like ID=3) -> calculate the days between the last default date (31/03) and in bonis date
- if it starts with bonis (status=0) and it continues with default (status=1) (like ID=5) -> calculate the days between the first default date (31/05) and today
- if it starts with default (status=1), it continues with in bonis (status=0) and it returns in default (like ID=4) -> calculate the days between the first default date (31/01) and in bonis date (31/07), the days between the Second default date (28/02) and today
- if it starts with in bonis (status=0) it continues with default (status=1) and it returns in bonis (like ID=1) -> calculate the days between the default date (31/05) and in bonis date

Thank you so much to everyone who answers me soon!
1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

Use by-group processing, and a retained variable for the beginning of a "default" period. Use the missing value as an indicator that no "default" period is active:

%let end_date='31dec2018'd;
/* equivalent to
%let end_date=%sysfunc(inputn(31/12/2018,ddmmyy10.));
*/

data have;
input id date:ddmmyy10. status;
format date ddmmyy10.;
datalines;
1 31/01/2016 0
1 31/05/2016 1
1 28/02/2017 0
2 28/02/2018 1
2 31/05/2018 1
3 31/01/2016 1
3 31/03/2016 1
3 31/05/2016 0
4 31/01/2016 1
4 31/07/2016 0
4 28/02/2017 1
5 31/01/2016 0
5 31/05/2016 1
5 28/02/2017 1
;
run;

data want;
set have;
by id;
retain begin_date;
format begin_date ddmmyy10.;
if first.id then begin_date = .; /* inititalize at beginning of group */
if status = 1 and begin_date = . then begin_date = date; /* detect start of default period */
if status = 0 and begin_date ne . /* end of a default period */
then do;
  duration = date - begin_date;
  output;
  begin_date = .;
end;
if last.id and status = 1 /* end of group with active or starting default period */
then do;
  duration = &end_date. - begin_date;
  date = &end_date.;
  output;
end;
keep id begin_date date duration;
run;

proc print data=want noobs;
run;

Take note of

  • the comments that descibe the conditions
  • the way example data is presented in a data step for easy creation of the dataset (copy/paste, submit)
  • use of a macro variable for end_date

Result:

id          date    begin_date    duration

 1    28/02/2017    31/05/2016       273  
 2    31/12/2018    28/02/2018       306  
 3    31/05/2016    31/01/2016       121  
 4    31/07/2016    31/01/2016       182  
 4    31/12/2018    28/02/2017       671  
 5    31/12/2018    31/05/2016       944  

 

View solution in original post

9 REPLIES 9
ballardw
Super User

@Franci941 wrote:
 1

You consider status=1 as default and status=0 'in bonis'.
For each ID, I have to calculate the days between:
- if it starts with default (status=1) and it continues always with default (like ID=2) -> calculate the days between the first observation date (28/02) and today
- if it starts with default (status=1) and it continues with in bonis (status=0) (like ID=3) -> calculate the days between the last default date (31/03) and in bonis date
- if it starts with bonis (status=0) and it continues with default (status=1) (like ID=5) -> calculate the days between the first default date (31/05) and today
- if it starts with default (status=1), it continues with in bonis (status=0) and it returns in default (like ID=4) -> calculate the days between the first default date (31/01) and in bonis date (31/07), the days between the Second default date (28/02) and today
- if it starts with in bonis (status=0) it continues with default (status=1) and it returns in bonis (like ID=1) -> calculate the days between the default date (31/05) and in bonis date

Thank you so much to everyone who answers me soon!

Today as in the day the code is run? or the last date in your data for the ID such as for id=2 31/05/2018 1?

 

Todays date sounds a bit odd since you have quite different ranges of values of likely dates for each Id. You likely should indicate for each of your examples which is 1) the actual date selected for comparison, 2) the date used to calculate the interval for that given date.

Franci941
Calcite | Level 5
It's only one example, the dates are different.. But for today, I think it's ok 31/12/2018
Kurt_Bremser
Super User

Use by-group processing, and a retained variable for the beginning of a "default" period. Use the missing value as an indicator that no "default" period is active:

%let end_date='31dec2018'd;
/* equivalent to
%let end_date=%sysfunc(inputn(31/12/2018,ddmmyy10.));
*/

data have;
input id date:ddmmyy10. status;
format date ddmmyy10.;
datalines;
1 31/01/2016 0
1 31/05/2016 1
1 28/02/2017 0
2 28/02/2018 1
2 31/05/2018 1
3 31/01/2016 1
3 31/03/2016 1
3 31/05/2016 0
4 31/01/2016 1
4 31/07/2016 0
4 28/02/2017 1
5 31/01/2016 0
5 31/05/2016 1
5 28/02/2017 1
;
run;

data want;
set have;
by id;
retain begin_date;
format begin_date ddmmyy10.;
if first.id then begin_date = .; /* inititalize at beginning of group */
if status = 1 and begin_date = . then begin_date = date; /* detect start of default period */
if status = 0 and begin_date ne . /* end of a default period */
then do;
  duration = date - begin_date;
  output;
  begin_date = .;
end;
if last.id and status = 1 /* end of group with active or starting default period */
then do;
  duration = &end_date. - begin_date;
  date = &end_date.;
  output;
end;
keep id begin_date date duration;
run;

proc print data=want noobs;
run;

Take note of

  • the comments that descibe the conditions
  • the way example data is presented in a data step for easy creation of the dataset (copy/paste, submit)
  • use of a macro variable for end_date

Result:

id          date    begin_date    duration

 1    28/02/2017    31/05/2016       273  
 2    31/12/2018    28/02/2018       306  
 3    31/05/2016    31/01/2016       121  
 4    31/07/2016    31/01/2016       182  
 4    31/12/2018    28/02/2017       671  
 5    31/12/2018    31/05/2016       944  

 

Franci941
Calcite | Level 5
Thank you so much! It runs correctly. Last question, if I have this situation?
ID STATUS
1 1
1. 1
1. 0
1. 0
1. 1
1. 1

The dates are not important; if I want to calculate (as possibility of the previous code)
1) days between first default and first bonis (like before)
2) days between first default and Second default
Kurt_Bremser
Super User

@Franci941 wrote:
Thank you so much! It runs correctly. Last question, if I have this situation?
ID STATUS
1 1
1. 1
1. 0
1. 0
1. 1
1. 1

The dates are not important; if I want to calculate (as possibility of the previous code)
1) days between first default and first bonis (like before)
2) days between first default and Second default

Add another retained variable, and check for the respective events:

data want;
set have;
by id;
retain begin_date previous_default;
format begin_date previous_default ddmmyy10.;
if first.id
then do; /* inititalize at beginning of group */
  begin_date = .;
  previous_default = .;
end;
if status = 1 and begin_date = .
then do; /* detect start of default period */
  if previous_default ne .
  then do;
    diff_default = date - previous_default;
    output;
  end;
  begin_date = date;
  previous_default = date;
end;
if status = 0 and begin_date ne . /* end of a default period */
then do;
  duration = date - begin_date;
  output;
  begin_date = .;
end;
if last.id and status = 1 /* end of group with active or starting default period */
then do;
  diff_default = .; /* in case both events happen in the same observation */
  duration = &end_date. - begin_date;
  date = &end_date.;
  output;
end;
keep id begin_date date duration previous_default diff_default;
run;
Franci941
Calcite | Level 5
Sorry to bother you, but they change a little what professor ask to me. The range of days that I want:
- between 1'default (first accurance) and 1'bonis (first accurance)
- between 1'bonis (first accurance) and 1'default (first accurance)
There is also the possibility of two passage bonis-default (like 0 0 1 0 0 0 1 1) - > in this case I want these intervals:
- between 1'bonis (first accurance) and 1'default (first accurance)
- between 1'default (first accurance) and 1'bonis (first accurance)
- between 2'bonis (first accurance) and 2'default (first accurance)

And in the same way, There is also the possibility of two passage default-bonis (like 1 1 0 0 1 0 0 0) - > in this case I want these intervals:
- between 1'def(first accurance) and 1'bonis(first accurance)
- between 1'bonis (first accurance) and 1'def (first accurance)
- between 2'def (first accurance) and 2'bonis (first accurance)

Thank you so much for your help! I try all the day but my code doesn't give me the correct output..

Kurt_Bremser
Super User

Expanding the step with additional conditions should not pose a big problem now. The principle stays the same: retain a variable, use the proper condition(s) to initialize/set it, and the proper condition to evaluate it.

Franci941
Calcite | Level 5
My code is like:

data prova2;
set ew;
by id;
retain begin_date begin_date2 begin_date3;
format begin_date ddmmyy10. begin_date2 ddmmyy10. begin_date3 ddmmyy10.;
if id then do;
begin_date=.;
begin_date2=.;
begin_date3=.;
end;
if stato=1 and begin_date=. then do;
if begin_date2 ne . then do;
days_bon_def2 = date - begin_date2;
output;
end;
if begin_date3 ne . then do;
days_bon_def= date - begin_date3;
output;
begin_date3=.;
end;
begin_date= date;
begin_date2= date;
end;
if stato=0 then do;
if begin_date ne . then do;
days_def_bon= date - begin_date;
output;
begin_date=.;
end;
if begin_date3 =. then begin_date3=date;
end;
keep id begin_date begin_date2 begin_date3 days_def_bon days_bon_def days_bon_def2;
run;
Franci941
Calcite | Level 5
(The two passage bonis-default is missing)

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 1819 views
  • 0 likes
  • 3 in conversation