I have a table like this
patient id | Surgery date |
1 | 16Aug2013 |
2 | 07Aug2013 |
2 | 09Aug2013 |
I would like to create two variables as the following table. 'After surgery days' field starts with 0, if there are more than 2 surgeries and the dates between them are more than 1 day, add date, 'After surgery days' between them; if less than 5 days, added at the end for date, and 'After surgery days' . Can anyone help me? Thanks.
patient id | Surgery date | date | After surgery days |
1 | 16Aug2013 | 16Aug2013 | 0 |
1 | 17Aug2013 | 1 | |
1 | 18Aug2013 | 2 | |
1 | 19Aug2013 | 3 | |
1 | 20Aug2013 | 4 | |
2 | 07Aug2013 | 07Aug2013 | 0 |
2 | 08Aug2013 | 1 | |
2 | 09Aug2013 | 09Aug2013 | 2 |
2 | 10Aug2013 | 3 | |
2 | 11Aug2013 | 4 |
What if a patient had 2 surgeries, but they were 10 days apart? Do you want a set of five days (beginning with 0) for each surgery for that patient?
If I correctly understand what you are trying to do, the following will accomplish the task:
data have; input patient_id Surgery_date date9.; format Surgery_date date9.; cards; 1 16Aug2013 2 07Aug2013 2 09Aug2013 3 10Aug2013 3 20Aug2013 ; data want (drop=i day:); set have; array surgeries(10) _temporary_; by patient_id; if first.patient_id then do; call missing(of surgeries(*)); i=0; end; i+1; surgeries(i)=Surgery_date; if last.patient_id then do; if i eq 1 then days=4; else days=surgeries(i)-surgeries(1); do After_surgery_days=0 to days; day=surgeries(1)+After_surgery_days; if day in surgeries then Surgery_date=day; else call missing(Surgery_date); output; end; end; run;
Art, CEO, AnalystFinder.com
Art 297: Thank you very much for this! This almost accomplish what I need. But I would like to have fixed length of day for each patient . For example, length of 15 days or more for these all of patients which means the 1st one add 14 days after the 1st surgery; the 2nd one needs insert 1 day between two surgery days and 12 days after the 2nd surgery; the 3rd one need to insert 9 days between two surgery and add 4 days at the end.
@grace999: based on the example data I provided, please post what you want the output file to look like.
Art, CEO, AnalystFinder.com
@art297 The output looks like the following
patient_id | Surgery_date | day | After_surgery_days |
1 | 16Aug2013 | 16Aug2013 | 0 |
1 | 17Aug2013 | 1 | |
1 | 18Aug2013 | 2 | |
1 | 19Aug2013 | 3 | |
1 | 20Aug2013 | 4 | |
1 | 21Aug2013 | 5 | |
1 | 22Aug2013 | 6 | |
1 | 23Aug2013 | 7 | |
1 | 24Aug2013 | 8 | |
1 | 25Aug2013 | 9 | |
1 | 26Aug2013 | 10 | |
1 | 27Aug2013 | 11 | |
1 | 28Aug2013 | 12 | |
1 | 29Aug2013 | 13 | |
1 | 30Aug2013 | 14 | |
2 | 07Aug2013 | 07Aug2013 | 0 |
2 | 08Aug2013 | 1 | |
2 | 09Aug2013 | 09Aug2013 | 2 |
2 | 10Aug2013 | 3 | |
2 | 11Aug2013 | 4 | |
2 | 12Aug2013 | 5 | |
2 | 13Aug2013 | 6 | |
2 | 14Aug2013 | 7 | |
2 | 15Aug2013 | 8 | |
2 | 16Aug2013 | 9 | |
2 | 17Aug2013 | 10 | |
2 | 18Aug2013 | 11 | |
2 | 19Aug2013 | 12 | |
2 | 20Aug2013 | 13 | |
2 | 21Aug2013 | 14 | |
3 | 10Aug2013 | 10Aug2013 | 0 |
3 | 11Aug2013 | 1 | |
3 | 12Aug2013 | 2 | |
3 | 13Aug2013 | 3 | |
3 | 14Aug2013 | 4 | |
3 | 15Aug2013 | 5 | |
3 | 16Aug2013 | 6 | |
3 | 17Aug2013 | 7 | |
3 | 18Aug2013 | 8 | |
3 | 19Aug2013 | 9 | |
3 | 20Aug2013 | 20Aug2013 | 10 |
3 | 21Aug2013 | 11 | |
3 | 22Aug2013 | 12 | |
3 | 23Aug2013 | 13 | |
3 | 24Aug2013 | 14 |
It took me a while to come up with this (and it's untested ... can't test it until Wednesday), but here goes. This assumes that all dates are valid SAS dates:
data want;
set have;
by patient_ID;
if first.patient_ID then after_surgery_days=0;
else after_surgery_days + 1;
output;
surgery_date = .;
if last.patient_ID then do after_surgery_days = after_surgery_days + 1 to 14;
after_surgery_days + 1;
output;
end;
else do;
next_record = _n_ + 1;
set have point=next_record (keep=surgery_date rename=(surgery_date = next_date));
do after_surgery_days = after_surgery_days + 1 to next_date - 1;
output;
end;
end;
drop next_date next_record;
run;
The logic relies on SAS automatically skipping loops that run backwards, so it definitely requires testing to confirm that the values are coming out correctly for AFTER_SURGERY_DAYS.
I added a 3rd surgery for patient_id 3. Let us know if the following does what you want:
data have; input patient_id Surgery_date date9.; format Surgery_date date9.; cards; 1 16Aug2013 2 07Aug2013 2 09Aug2013 3 10Aug2013 3 20Aug2013 3 30Aug2013 ; data want (drop=i day:); set have; array surgeries(10) _temporary_; by patient_id; if first.patient_id then do; call missing(of surgeries(*)); i=0; end; i+1; surgeries(i)=Surgery_date; if last.patient_id then do; days=14; if i gt 1 and surgeries(i)-surgeries(1) gt 14 then days=surgeries(i)-surgeries(1); do After_surgery_days=0 to days; day=surgeries(1)+After_surgery_days; if day in surgeries then Surgery_date=day; else call missing(Surgery_date); output; end; end; run;
Art, CEO, AnalystFinder.com
@art297 Thanks a lot! Could you please help me to understand array surgeries(10. what does "
if day in surgeries " mean?
The surgeries array has to have the maximum number of surgeries that a patient might have. It has nothing to do with the number of records that you have.
As for the line:
if day in surgeries then Surgery_date=day;
the code, at that point, is looping through all of the dates being output and that line is simply checking to see if a surgery happened (for that patient) on the date being output. If it was, it adds the date to the surgery_date variable.
Art, CEO, AnalystFinder.com
@art297 Thank you so much explaining this. I only used array for variables, not for records as you did for surgeries within each patient. It's good to know!
Hi, using Art's data, here's another idea that get's the same results as Art's last posting ...
data x;
input id date date9. @@;
format date date9.;
datalines;
1 16Aug2013
2 07Aug2013 2 09Aug2013
3 10Aug2013 3 20Aug2013 3 30Aug2013
;
data y (drop=first s: diff);
array s(0:30) s0-s30;
do until(last.id);
set x;
by id;
if first.id then first = date;
s(date - first) = date;
end;
diff = coalesce(of s30-s0)-s0;
do after=0 to ifn(diff gt 14, diff, 14);
date = s(after);
output;
end;
run;
@MikeZdeb Thank you very much for giiving me another solution. Would you please add one column 'day' variable in it?
patient_id | Surgery_date | day | After_surgery_days |
1 | 16Aug2013 | 16Aug2013 | 0 |
1 | 17Aug2013 | 1 | |
1 | 18Aug2013 | 2 | |
1 | 19Aug2013 | 3 | |
1 | 20Aug2013 | 4 | |
1 | 21Aug2013 | 5 | |
1 | 22Aug2013 | 6 | |
1 | 23Aug2013 | 7 | |
1 | 24Aug2013 | 8 | |
1 | 25Aug2013 | 9 | |
1 | 26Aug2013 | 10 | |
1 | 27Aug2013 | 11 | |
1 | 28Aug2013 | 12 | |
1 | 29Aug2013 | 13 | |
1 | 30Aug2013 | 14 | |
2 | 07Aug2013 | 07Aug2013 | 0 |
2 | 08Aug2013 | 1 | |
2 | 09Aug2013 | 09Aug2013 | 2 |
2 | 10Aug2013 | 3 | |
2 | 11Aug2013 | 4 | |
2 | 12Aug2013 | 5 | |
2 | 13Aug2013 | 6 | |
2 | 14Aug2013 | 7 | |
2 | 15Aug2013 | 8 | |
2 | 16Aug2013 | 9 | |
2 | 17Aug2013 | 10 | |
2 | 18Aug2013 | 11 | |
2 | 19Aug2013 | 12 | |
2 | 20Aug2013 | 13 | |
2 | 21Aug2013 | 14 | |
3 | 10Aug2013 | 10Aug2013 | 0 |
3 | 11Aug2013 | 1 | |
3 | 12Aug2013 | 2 | |
3 | 13Aug2013 | 3 | |
3 | 14Aug2013 | 4 | |
3 | 15Aug2013 | 5 | |
3 | 16Aug2013 | 6 | |
3 | 17Aug2013 | 7 | |
3 | 18Aug2013 | 8 | |
3 | 19Aug2013 | 9 | |
3 | 20Aug2013 | 20Aug2013 | 10 |
3 | 21Aug2013 | 11 | |
3 | 22Aug2013 | 12 | |
3 | 23Aug2013 | 13 | |
3 | 24Aug2013 | 14 |
Code with a day variable added:
data have; input patient_id Surgery_date date9.; format Surgery_date date9.; cards; 1 16Aug2013 2 07Aug2013 2 09Aug2013 3 10Aug2013 3 20Aug2013 3 30Aug2013 ; data want (drop=i days); set have; array surgeries(10) _temporary_; by patient_id; format day date9.; if first.patient_id then do; call missing(of surgeries(*)); i=0; end; i+1; surgeries(i)=Surgery_date; if last.patient_id then do; days=14; if i gt 1 and surgeries(i)-surgeries(1) gt 14 then days=surgeries(i)-surgeries(1); do After_surgery_days=0 to days; day=surgeries(1)+After_surgery_days; if day in surgeries then Surgery_date=day; else call missing(Surgery_date); output; end; end; run;
Art, CEO, AnalystFinder.com
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.