Solved: Backward and Forward Filling Information

MikkiW · Posted 11-17-2020 04:25 PM

Hello! I am working with health data in the following format:

ID = Patient ID

Date = Date of Procedure

DocIDProcedure1 = the ID number of the doctor that performed the 1st procedure

DocIDProcedure2 = the ID number of the doctor that performed the 2nd procedure

DocIDProcedure3 = the ID number of the doctor that performed the 3rd procedure

This is an abbreviated list. I have a list of over 50 procedures for each row.

The data looks like the following:

ID Date DocIDProcedure1 DocIDProcedure2 DocIDProcedure3

1 Nov112020 1234

1 Nov112020 3456

1 Nov112020 7890

2 Dec032020 2465

2 Dec032020 1170

What I would like to do:

For each ID/Date combination, I would like to back and forward fill the doctor IDs so that they show up on each line (for the same ID and date).

So, I would like the above dataset to look like the following:

ID Date DocIDProcedure1 DocIDProcedure2 DocIDProcedure3

1 Nov112020 1234 3456 7890

2 Dec032020 1170 2465

This is really stumping me. Any help would be greatly appreciated!

Mikki

novinosrin · Posted 11-17-2020 04:40 PM


data have;
 input (ID   Date ) ($) DocIDProcedure1  DocIDProcedure2  DocIDProcedure3;
 cards;
1            Nov112020  1234   .   .
1            Nov112020   .   3456  .
1            Nov112020   .  .    7890
2            Dec032020   .  .    2465
2            Dec032020   .  1170  .
;

data want;
 do _n_=1 by 1 until(last.date);
    update have(obs=0) have;
    by id date;
 end;
 do _n_=1 to _n_;
    output;
 end;
run;

proc print noobs;run;

ID	Date	DocIDProcedure1	DocIDProcedure2	DocIDProcedure3
1	Nov11202	1234	3456	7890
1	Nov11202	1234	3456	7890
1	Nov11202	1234	3456	7890
2	Dec03202	.	1170	2465
2	Dec03202	.	1170	2465

View solution in original post

novinosrin · Posted 11-17-2020 04:40 PM


data have;
 input (ID   Date ) ($) DocIDProcedure1  DocIDProcedure2  DocIDProcedure3;
 cards;
1            Nov112020  1234   .   .
1            Nov112020   .   3456  .
1            Nov112020   .  .    7890
2            Dec032020   .  .    2465
2            Dec032020   .  1170  .
;

data want;
 do _n_=1 by 1 until(last.date);
    update have(obs=0) have;
    by id date;
 end;
 do _n_=1 to _n_;
    output;
 end;
run;

proc print noobs;run;

ID	Date	DocIDProcedure1	DocIDProcedure2	DocIDProcedure3
1	Nov11202	1234	3456	7890
1	Nov11202	1234	3456	7890
1	Nov11202	1234	3456	7890
2	Dec03202	.	1170	2465
2	Dec03202	.	1170	2465

MikkiW · Posted 11-18-2020 09:29 AM

THANK YOU so much Novinosrin! The code worked GREAT! I would not have thought of this myself! I truly appreciate your help!

Tom · Posted 11-17-2020 05:22 PM

Can we assume that for any given block of ID*DATE combinations there is only one non missing value in each of those ...Proc... variables?

Why not just collapse into one observation per block? Are there other variables you didn't mention whose values vary over the observations in a block?

r_behata · Posted 11-17-2020 05:50 PM

data have;
	input (ID   Date ) ($) DocIDProcedure1  DocIDProcedure2  DocIDProcedure3;
	cards;
1            Nov112020  1234   .   .
1            Nov112020   .   3456  .
1            Nov112020   .  .    7890
2            Dec032020   .  .    2465
2            Dec032020   .  1170  .
;
run;

data want(rename=(DocIDProcedure1_=DocIDProcedure1 DocIDProcedure2_=DocIDProcedure2 DocIDProcedure3_=DocIDProcedure3));
	do until(last.id);
		set have;
		by id;
		DocIDProcedure1_=coalesce(DocIDProcedure1_,DocIDProcedure1);
		DocIDProcedure2_=coalesce(DocIDProcedure2_,DocIDProcedure2);
		DocIDProcedure3_=coalesce(DocIDProcedure3_,DocIDProcedure3);
	end;

	do until(last.id);
		set have;
		by id;
		output;
	end;

	drop DocIDProcedure1 DocIDProcedure2 DocIDProcedure3;
run;

MikkiW · Posted 11-18-2020 09:43 AM

Thank you so much for thinking about this r_behata! I will also try out this code in the next couple of days. I have over 50 of those DocID... variables. So, it will take a little longer to write the full code. I TRULY appreciate your help though!

Ksharp · Posted 11-18-2020 07:19 AM

data have;
	input (ID   Date ) ($) DocIDProcedure1  DocIDProcedure2  DocIDProcedure3;
	cards;
1            Nov112020  1234   .   .
1            Nov112020   .   3456  .
1            Nov112020   .  .    7890
2            Dec032020   .  .    2465
2            Dec032020   .  1170  .
;
run;
data have;
 set have;
 dummy='x';
run;
proc sql;
create table want(drop=dummy) as
select id,date,dummy,
 max(DocIDProcedure1) as DocIDProcedure1,
 max(DocIDProcedure2) as DocIDProcedure2,
 max(DocIDProcedure3) as DocIDProcedure3
  from have
   group by id,date;
quit;

Backward and Forward Filling Information

Re: Backward and Forward Filling Information

Re: Backward and Forward Filling Information

Re: Backward and Forward Filling Information

Re: Backward and Forward Filling Information

Re: Backward and Forward Filling Information

Re: Backward and Forward Filling Information

Re: Backward and Forward Filling Information

Registration is open

SAS Training: Just a Click Away