BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
JenWalker1
Calcite | Level 5

I have a long dataset with 1 to multiple (not fixed) rows per person_id. There are a number of variables but the ones of interest are person_id, start_date and end_date. The dataset is sorted according to person_id, start_date, end_date.

I have identified an indicator case/s (flag=1) for each person_id. This may not be the first case (based on start_date) and an person_id may have none or multiple indicator cases.

I would like to calculate the number of days between the end_date of the indicator case and each subsequent start_date until the next indicator within person_id or until next person_id.

The below table shows the number of days column I would like to compute.

person_id

start_date

end_date

flag

Number of days

1

02AUG2020

05AUG2020

 

 

1

08AUG2020

20AUG2020

1

 

1

10AUG2020

30AUG2020

 

-10

1

21AUG2020

21AUG2020

 

1

2

02AUG2020

05AUG2020

 

 

2

08AUG2020

10AUG2020

1

 

2

20AUG2020

21AUG2020

 

10

2

21AUG2020

21AUG2020

1

 

2

23AUG2020

31AUG2020

 

2

3

08AUG2020

10AUG2020

 

 

3

10AUG2020

21AUG2020

 

 

4

02AUG2020

05AUG2020

1

 

4

05AUG2020

21AUG2020

 

0

4

21AUG2020

24AUG2020

 

16

5

08AUG2020

10AUG2020

 

 

5

20AUG2020

21AUG2020

1

 

 

Thanks heaps in advance

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

I'm assuming:

  • you have the first 4 columns in the data set as your HAVE data set
  • the dates are SAS dates, ie numeric with a DATE format (not datetime, not character)
  • Order is as shown and no sort is required

 

 

data want;
set have;
by person_id start_date end_date;

*stores the end_date in the indicator start variable and carries it across the rows;
retain indicator_start;
if flag=1 then indicator_start = end_date;

*resets for the first record of each person;
if first.person_id then call missing(indicator_start);

*conditionally calculate duration;
if not missing(indicator_start) and flag ne 1 then num_days = end_date - indicator_start;

run;

Untested code, but should get you started. If it doesn't work, please post the code you used and the log and indicate exactly what did not work. 

 


@JenWalker1 wrote:

I have a long dataset with 1 to multiple (not fixed) rows per person_id. There are a number of variables but the ones of interest are person_id, start_date and end_date. The dataset is sorted according to person_id, start_date, end_date.

I have identified an indicator case/s (flag=1) for each person_id. This may not be the first case (based on start_date) and an person_id may have none or multiple indicator cases.

I would like to calculate the number of days between the end_date of the indicator case and each subsequent start_date until the next indicator within person_id or until next person_id.

The below table shows the number of days column I would like to compute.

person_id

start_date

end_date

flag

Number of days

1

02AUG2020

05AUG2020

 

 

1

08AUG2020

20AUG2020

1

 

1

10AUG2020

30AUG2020

 

-10

1

21AUG2020

21AUG2020

 

1

2

02AUG2020

05AUG2020

 

 

2

08AUG2020

10AUG2020

1

 

2

20AUG2020

21AUG2020

 

10

2

21AUG2020

21AUG2020

1

 

2

23AUG2020

31AUG2020

 

2

3

08AUG2020

10AUG2020

 

 

3

10AUG2020

21AUG2020

 

 

4

02AUG2020

05AUG2020

1

 

4

05AUG2020

21AUG2020

 

0

4

21AUG2020

24AUG2020

 

16

5

08AUG2020

10AUG2020

 

 

5

20AUG2020

21AUG2020

1

 

 

Thanks heaps in advance


 

View solution in original post

2 REPLIES 2
Reeza
Super User

I'm assuming:

  • you have the first 4 columns in the data set as your HAVE data set
  • the dates are SAS dates, ie numeric with a DATE format (not datetime, not character)
  • Order is as shown and no sort is required

 

 

data want;
set have;
by person_id start_date end_date;

*stores the end_date in the indicator start variable and carries it across the rows;
retain indicator_start;
if flag=1 then indicator_start = end_date;

*resets for the first record of each person;
if first.person_id then call missing(indicator_start);

*conditionally calculate duration;
if not missing(indicator_start) and flag ne 1 then num_days = end_date - indicator_start;

run;

Untested code, but should get you started. If it doesn't work, please post the code you used and the log and indicate exactly what did not work. 

 


@JenWalker1 wrote:

I have a long dataset with 1 to multiple (not fixed) rows per person_id. There are a number of variables but the ones of interest are person_id, start_date and end_date. The dataset is sorted according to person_id, start_date, end_date.

I have identified an indicator case/s (flag=1) for each person_id. This may not be the first case (based on start_date) and an person_id may have none or multiple indicator cases.

I would like to calculate the number of days between the end_date of the indicator case and each subsequent start_date until the next indicator within person_id or until next person_id.

The below table shows the number of days column I would like to compute.

person_id

start_date

end_date

flag

Number of days

1

02AUG2020

05AUG2020

 

 

1

08AUG2020

20AUG2020

1

 

1

10AUG2020

30AUG2020

 

-10

1

21AUG2020

21AUG2020

 

1

2

02AUG2020

05AUG2020

 

 

2

08AUG2020

10AUG2020

1

 

2

20AUG2020

21AUG2020

 

10

2

21AUG2020

21AUG2020

1

 

2

23AUG2020

31AUG2020

 

2

3

08AUG2020

10AUG2020

 

 

3

10AUG2020

21AUG2020

 

 

4

02AUG2020

05AUG2020

1

 

4

05AUG2020

21AUG2020

 

0

4

21AUG2020

24AUG2020

 

16

5

08AUG2020

10AUG2020

 

 

5

20AUG2020

21AUG2020

1

 

 

Thanks heaps in advance


 

JenWalker1
Calcite | Level 5
Thank you so much! I just tweaked the number of days calculation and it ran perfect!

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 433 views
  • 0 likes
  • 2 in conversation