Hello,
I have a dataset that has patient ID, Date of service and a list of procedures that patient received (proc1 to proc5, in reality the procedure list goes to 22 but I thought for this 5 would suffice, hopefully I am right).
What I need to do is determine how many of the patients who received a specific procedure, H0014 in this case, also received the other procedures within 14 days.
So in the example table below I would want patient 2 , patient 3 (with the date of 2/11/17) and patient 4 (both the 3/4/17 and 4/13/17 visits) to be flagged.
Data Have:
Pat_ID | Date_of_service | Proc1 | Proc2 | Proc3 | Proc4 | Proc5 |
1 | 9/25/2017 | 99213 | ||||
1 | 10/4/2017 | 99213 | ||||
1 | 10/11/2017 | H0014 | ||||
2 | 10/20/2017 | H0014 | ||||
2 | 11/3/2017 | 99214 | ||||
3 | 12/28/2017 | 99214 | ||||
3 | 2/4/2017 | 99214 | ||||
3 | 2/11/2017 | H0014 | 99213 | |||
3 | 3/11/2017 | 99215 | ||||
3 | 3/15/2017 | 99213 | ||||
4 | 3/4/2017 | H0014 | ||||
4 | 3/10/2017 | 99214 | ||||
4 | 4/10/2017 | 99213 | ||||
4 | 4/13/2017 | H0014 | ||||
4 | 4/14/2017 | 99216 |
Data Want:
here I have added a flag column that using 1 or 2 that indicates which of the H0014 row is coupled with the row that has the other procs within 14 days (this would be with a 2). Of course if there are two (or more procs) in one row with H0014 then the flag with 1 will suffice.
Pat_ID | Date_of_service | Proc1 | Proc2 | Proc3 | Proc4 | Proc5 | Flag |
1 | 9/25/2017 | 99213 | 0 | ||||
1 | 10/4/2017 | 99213 | 0 | ||||
1 | 10/11/2017 | H0014 | 0 | ||||
2 | 10/20/2017 | H0014 | 1 | ||||
2 | 11/3/2017 | 99214 | 2 | ||||
3 | 12/28/2017 | 99214 | 0 | ||||
3 | 2/4/2017 | 99214 | 0 | ||||
3 | 2/11/2017 | H0014 | 99213 | 1 | |||
3 | 3/11/2017 | 99215 | 0 | ||||
3 | 3/15/2017 | 99213 | 0 | ||||
4 | 3/4/2017 | H0014 | 1 | ||||
4 | 3/10/2017 | 99214 | 2 | ||||
4 | 4/10/2017 | 99213 | 0 | ||||
4 | 4/13/2017 | H0014 | 1 | ||||
4 | 4/14/2017 | 99216 | 2 |
is this too complicated to do?
Thank you in advance!!!
You have to take more than one pass thru the data. I think the following does what you want:
data have; informat Pat_ID $2. Date_of_service mmddyy10. Proc1 $5. Proc2 $5. Proc3 $5. Proc4 $5. Proc5 $5.; input Pat_ID Date_of_service Proc1 Proc2 Proc3 Proc4 Proc5 ; format Date_of_service date9.; datalines; A1 092517 99213 . . . . A1 100417 99213 . . . . A1 101117 H0014 . . . . A2 102017 H0014 . . . . A2 110317 . 99214 . . . A3 122817 99214 . . . . A3 020417 99214 . . . . A3 021117 H0014 99213 . . . A3 031117 . . 99215 . . A3 031517 99213 . . . . A4 030417 H0014 . . . . A4 031017 . . . 99214 . A4 041017 . 99213 . . . A4 041317 . . H0014 . . A4 041417 99216 . . . . ; run; data need (drop=first_date procs: i j); set have; by Pat_ID; array inprocs(5) proc1-proc5; array procs(5); retain first_date; flag=0; if first.Pat_ID then call missing(first_date); if 'H0014' in inprocs then do; flag=1; first_date=Date_of_service; end; else if not missing(first_date) and Date_of_service-first_date le 14 then do; j=0; do i=1 to 5; if inprocs(i) ne 'H0014' then do; if not missing(inprocs(i)) then do; procs(i)=inprocs(i); j+1; end; end; end; if j gt 0 then do; flag=2; call missing(first_date); end; end; else flag=0; obs=_n_; run; proc sort data=need; by descending obs; run; data need (drop=foundone); set need; by descending Pat_ID; array inprocs(5) proc1-proc5; retain foundone; if first.Pat_ID then foundone=0; if not foundone and 'H0014' in inprocs then do; flag=0; foundone=1; end; else if flag eq 2 then foundone=1; run; proc sort data=need out=want (drop=obs); by obs; run;
Art, CEO, AnalystFinder.com
Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the {i} icon or attached as text to show exactly what you have and that we can test code against.
I say that because this process gets much easier if the data is moved to a single record of patient, date, procedure but I'm not going to take the time to attempt to create a data set from your example.
I think that something like the following would do what you want:
data want (drop=first_date procs: i j); set have; by Pat_ID; array inprocs(5) proc1-proc5; array procs(5); retain first_date; flag=0; if first.Pat_ID then call missing(first_date); if 'H0014' in inprocs then do; flag=1; first_date=Date_of_service; end; else if not missing(first_date) and Date_of_service-first_date le 14 then do; j=0; do i=1 to 5; if inprocs(i) ne 'H0014' then do; if not missing(inprocs(i)) then do; procs(i)=inprocs(i); j+1; end; end; end; if j gt 0 then do; flag=2; call missing(Date_of_service); end; end; else flag=0; run;
Art, CEO, AnalystFinder.com
@art297 your code almost worked. Sorry I made the mistake of giving you pat_id as numeric when in reality it was character so the first (pat_id) with a character variable didn't work. But if I replace that first(pat_id) to first.pat_id it sets some of the dates as missing, not sure why.
Suggestions on how to fix?
Thank you so much!
Here is the code:
data have; informat Pat_ID $2. Date_of_service mmddyy10. Proc1 $5. Proc2 $5. Proc3 $5. Proc4 $5. Proc5 $5.; input Pat_ID Date_of_service Proc1 Proc2 Proc3 Proc4 Proc5 ; format Date_of_service date9.; datalines; A1 092517 99213 . . . . A1 100417 99213 . . . . A1 101117 H0014 . . . . A2 102017 H0014 . . . . A2 110317 . 99214 . . . A3 122817 99214 . . . . A3 020417 99214 . . . . A3 021117 H0014 99213 . . . A3 031117 . . 99215 . . A3 031517 99213 . . . . A4 030417 H0014 . . . . A4 031017 . . . 99214 . A4 041017 . 99213 . . . A4 041317 . . H0014 . . A4 041417 99216 . . . . ; run;
It didn't work because I had another error in the code. Try:
data have; informat Pat_ID $2. Date_of_service mmddyy10. Proc1 $5. Proc2 $5. Proc3 $5. Proc4 $5. Proc5 $5.; input Pat_ID Date_of_service Proc1 Proc2 Proc3 Proc4 Proc5 ; format Date_of_service date9.; datalines; A1 092517 99213 . . . . A1 100417 99213 . . . . A1 101117 H0014 . . . . A2 102017 H0014 . . . . A2 110317 . 99214 . . . A3 122817 99214 . . . . A3 020417 99214 . . . . A3 021117 H0014 99213 . . . A3 031117 . . 99215 . . A3 031517 99213 . . . . A4 030417 H0014 . . . . A4 031017 . . . 99214 . A4 041017 . 99213 . . . A4 041317 . . H0014 . . A4 041417 99216 . . . . ; run; data want (drop=first_date procs: i j); set have; by Pat_ID; array inprocs(5) proc1-proc5; array procs(5); retain first_date; flag=0; if first.Pat_ID then call missing(first_date); if 'H0014' in inprocs then do; flag=1; first_date=Date_of_service; end; else if not missing(first_date) and Date_of_service-first_date le 14 then do; j=0; do i=1 to 5; if inprocs(i) ne 'H0014' then do; if not missing(inprocs(i)) then do; procs(i)=inprocs(i); j+1; end; end; end; if j gt 0 then do; flag=2; call missing(first_date); end; end; else flag=0; run;
Art, CEO, AnalystFinder.com
@art297 you rock!!!!
I think it did it. However, why does A1 get flagged? Is this because the calculation of within 14 days is regardless of when H0014 procedure is reported? So if we wanted to make the rule where H0014 has to happen on the date or within 14 days before the other procedures are reported do we change the by to add mbr_id and procs or do we add it in the "first" command?
Thank you!!!!!
You are fabulous!!!
You have to take more than one pass thru the data. I think the following does what you want:
data have; informat Pat_ID $2. Date_of_service mmddyy10. Proc1 $5. Proc2 $5. Proc3 $5. Proc4 $5. Proc5 $5.; input Pat_ID Date_of_service Proc1 Proc2 Proc3 Proc4 Proc5 ; format Date_of_service date9.; datalines; A1 092517 99213 . . . . A1 100417 99213 . . . . A1 101117 H0014 . . . . A2 102017 H0014 . . . . A2 110317 . 99214 . . . A3 122817 99214 . . . . A3 020417 99214 . . . . A3 021117 H0014 99213 . . . A3 031117 . . 99215 . . A3 031517 99213 . . . . A4 030417 H0014 . . . . A4 031017 . . . 99214 . A4 041017 . 99213 . . . A4 041317 . . H0014 . . A4 041417 99216 . . . . ; run; data need (drop=first_date procs: i j); set have; by Pat_ID; array inprocs(5) proc1-proc5; array procs(5); retain first_date; flag=0; if first.Pat_ID then call missing(first_date); if 'H0014' in inprocs then do; flag=1; first_date=Date_of_service; end; else if not missing(first_date) and Date_of_service-first_date le 14 then do; j=0; do i=1 to 5; if inprocs(i) ne 'H0014' then do; if not missing(inprocs(i)) then do; procs(i)=inprocs(i); j+1; end; end; end; if j gt 0 then do; flag=2; call missing(first_date); end; end; else flag=0; obs=_n_; run; proc sort data=need; by descending obs; run; data need (drop=foundone); set need; by descending Pat_ID; array inprocs(5) proc1-proc5; retain foundone; if first.Pat_ID then foundone=0; if not foundone and 'H0014' in inprocs then do; flag=0; foundone=1; end; else if flag eq 2 then foundone=1; run; proc sort data=need out=want (drop=obs); by obs; run;
Art, CEO, AnalystFinder.com
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.