DATA Step, Macro, Functions and more

Replace missing values with multiple values

Accepted Solution Solved
Reply
New Contributor
Posts: 3
Accepted Solution

Replace missing values with multiple values

I have been struggling with this.  I have sequential data that is missing some rows...and I need to fill them in.

 

I have:

ColA  ColB  ColC  Diff

A           B       C      1

A           B       C      2

A           B       C      4

D           E       F      1

D           E       F      3

D           E       F       4

 

I want:

ColA  ColB  ColC  Diff

A           B       C      1

A           B       C      2

A           B       C      3

A           B       C      4

D           E       F      1

D           E       F      2

D           E       F      3

D           E       F      4


Accepted Solutions
Solution
‎02-01-2018 12:16 PM
Super User
Posts: 10,623

Re: Replace missing values with multiple values

See this:

data have;
input ColA $ ColB $ ColC $ Diff;
cards;
A B C 1
A B C 2
A B C 4
D E F 1
D E F 3
D E F 4
;
run;

data want (drop=ldif);
set have;
by cola colb colc;
ldif = lag(diff);
if not first.colc then do diff = ldif + 1 to diff;
  output;
end;
else output;
run;
---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code

View solution in original post


All Replies
Respected Advisor
Posts: 3,294

Re: Replace missing values with multiple values

It's tough to generalize from your example.

 

Do you always want 4 rows? Or do you want the number of rows to be flexible based on the input data?

--
Paige Miller
New Contributor
Posts: 3

Re: Replace missing values with multiple values

Posted in reply to PaigeMiller

Flexible ..the length could go on for a while.  I wanted to show a simple example.  I have many missing rows and need to fill them in with the information from the previous row.

Solution
‎02-01-2018 12:16 PM
Super User
Posts: 10,623

Re: Replace missing values with multiple values

See this:

data have;
input ColA $ ColB $ ColC $ Diff;
cards;
A B C 1
A B C 2
A B C 4
D E F 1
D E F 3
D E F 4
;
run;

data want (drop=ldif);
set have;
by cola colb colc;
ldif = lag(diff);
if not first.colc then do diff = ldif + 1 to diff;
  output;
end;
else output;
run;
---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
Super User
Posts: 6,939

Re: Replace missing values with multiple values

I think this is what you are asking for:

 

data want;

set have;

prior_a = lag(a);

prior_b = lag(b);

prior_c = lag(c);

skip = dif(diff);

if skip <= 1 then output;

else do;

   temp_a = a;

   temp_b = b;

   temp_c = c;

   a = prior_a;

   b = prior_b;

   c = prior_c;

   do diff = (diff - skip + 1) to (diff - 1);

      output;

   end;

   a = temp_a;

   b = temp_b;

   c = temp_c;

   output;

end;

drop prior_: temp_: skip;

run;

 

It's untested, so if it needs a little tweaking just ask.

Super User
Posts: 2,078

Re: Replace missing values with multiple values

@Jarious What about the possibility of :

 

I have:

ColA  ColB  ColC  Diff

A           B       C      1

A           B       C      2

A           B       C      4

D           E       F      1

D           E       F      3

D           E       F       4

T           U       V       1 /*is this possible ?*/

T           U       V       5/*is this possible ?*/

 

Super User
Posts: 2,078

Re: Replace missing values with multiple values

Posted in reply to novinosrin
data have;
input ColA $ ColB $ ColC $ Diff;
cards;
A B C 1
A B C 2
A B C 4
D E F 1
D E F 3
D E F 4
T U V 1 
T U V 5
;
run;

proc sort data=have out=_have ;
by ColA ColB ColC descending diff;
run;

data want;
set _have;
by ColA ColB ColC;
if first.ColA and first.Colb  and first.Colc then  do diff=1 to diff;
output;
end;
else return;
run;
Super User
Posts: 10,860

Re: Replace missing values with multiple values

data have;
input ColA $ ColB $ ColC $ Diff;
cards;
A B C 1
A B C 2
A B C 4
D E F 1
D E F 3
D E F 4
;
run;
data want;
 merge have have(rename=(cola=_cola colb=_colb colc=_colc diff=_diff) firstobs=2);
 output;
 if cola=_cola and colb=_colb and colc=_colc  then do;
  do i=diff+1 to _diff-1;
   diff=i;output;
  end;
 end;
 drop _: i;
run;
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 7 replies
  • 173 views
  • 1 like
  • 6 in conversation