I have a long dataset with 1 to multiple (not fixed) rows per person_id. There are a number of variables but the ones of interest are person_id, start_date and end_date. The dataset is sorted according to person_id, start_date, end_date.
I have identified an indicator case/s (flag=1) for each person_id. This may not be the first case (based on start_date) and an person_id may have none or multiple indicator cases.
I would like to calculate the number of days between the end_date of the indicator case and each subsequent start_date until the next indicator within person_id or until next person_id.
The below table shows the number of days column I would like to compute.
person_id | start_date | end_date | flag | Number of days |
1 | 02AUG2020 | 05AUG2020 |
|
|
1 | 08AUG2020 | 20AUG2020 | 1 |
|
1 | 10AUG2020 | 30AUG2020 |
| -10 |
1 | 21AUG2020 | 21AUG2020 |
| 1 |
2 | 02AUG2020 | 05AUG2020 |
|
|
2 | 08AUG2020 | 10AUG2020 | 1 |
|
2 | 20AUG2020 | 21AUG2020 |
| 10 |
2 | 21AUG2020 | 21AUG2020 | 1 |
|
2 | 23AUG2020 | 31AUG2020 |
| 2 |
3 | 08AUG2020 | 10AUG2020 |
|
|
3 | 10AUG2020 | 21AUG2020 |
|
|
4 | 02AUG2020 | 05AUG2020 | 1 |
|
4 | 05AUG2020 | 21AUG2020 |
| 0 |
4 | 21AUG2020 | 24AUG2020 |
| 16 |
5 | 08AUG2020 | 10AUG2020 |
|
|
5 | 20AUG2020 | 21AUG2020 | 1 |
|
Thanks heaps in advance
I'm assuming:
data want;
set have;
by person_id start_date end_date;
*stores the end_date in the indicator start variable and carries it across the rows;
retain indicator_start;
if flag=1 then indicator_start = end_date;
*resets for the first record of each person;
if first.person_id then call missing(indicator_start);
*conditionally calculate duration;
if not missing(indicator_start) and flag ne 1 then num_days = end_date - indicator_start;
run;
Untested code, but should get you started. If it doesn't work, please post the code you used and the log and indicate exactly what did not work.
@JenWalker1 wrote:
I have a long dataset with 1 to multiple (not fixed) rows per person_id. There are a number of variables but the ones of interest are person_id, start_date and end_date. The dataset is sorted according to person_id, start_date, end_date.
I have identified an indicator case/s (flag=1) for each person_id. This may not be the first case (based on start_date) and an person_id may have none or multiple indicator cases.
I would like to calculate the number of days between the end_date of the indicator case and each subsequent start_date until the next indicator within person_id or until next person_id.
The below table shows the number of days column I would like to compute.
person_id
start_date
end_date
flag
Number of days
1
02AUG2020
05AUG2020
1
08AUG2020
20AUG2020
1
1
10AUG2020
30AUG2020
-10
1
21AUG2020
21AUG2020
1
2
02AUG2020
05AUG2020
2
08AUG2020
10AUG2020
1
2
20AUG2020
21AUG2020
10
2
21AUG2020
21AUG2020
1
2
23AUG2020
31AUG2020
2
3
08AUG2020
10AUG2020
3
10AUG2020
21AUG2020
4
02AUG2020
05AUG2020
1
4
05AUG2020
21AUG2020
0
4
21AUG2020
24AUG2020
16
5
08AUG2020
10AUG2020
5
20AUG2020
21AUG2020
1
Thanks heaps in advance
I'm assuming:
data want;
set have;
by person_id start_date end_date;
*stores the end_date in the indicator start variable and carries it across the rows;
retain indicator_start;
if flag=1 then indicator_start = end_date;
*resets for the first record of each person;
if first.person_id then call missing(indicator_start);
*conditionally calculate duration;
if not missing(indicator_start) and flag ne 1 then num_days = end_date - indicator_start;
run;
Untested code, but should get you started. If it doesn't work, please post the code you used and the log and indicate exactly what did not work.
@JenWalker1 wrote:
I have a long dataset with 1 to multiple (not fixed) rows per person_id. There are a number of variables but the ones of interest are person_id, start_date and end_date. The dataset is sorted according to person_id, start_date, end_date.
I have identified an indicator case/s (flag=1) for each person_id. This may not be the first case (based on start_date) and an person_id may have none or multiple indicator cases.
I would like to calculate the number of days between the end_date of the indicator case and each subsequent start_date until the next indicator within person_id or until next person_id.
The below table shows the number of days column I would like to compute.
person_id
start_date
end_date
flag
Number of days
1
02AUG2020
05AUG2020
1
08AUG2020
20AUG2020
1
1
10AUG2020
30AUG2020
-10
1
21AUG2020
21AUG2020
1
2
02AUG2020
05AUG2020
2
08AUG2020
10AUG2020
1
2
20AUG2020
21AUG2020
10
2
21AUG2020
21AUG2020
1
2
23AUG2020
31AUG2020
2
3
08AUG2020
10AUG2020
3
10AUG2020
21AUG2020
4
02AUG2020
05AUG2020
1
4
05AUG2020
21AUG2020
0
4
21AUG2020
24AUG2020
16
5
08AUG2020
10AUG2020
5
20AUG2020
21AUG2020
1
Thanks heaps in advance
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.