DATA Step, Macro, Functions and more

Dynamic Imputation

Reply
Occasional Contributor
Posts: 17

Dynamic Imputation

Hi,

I have the data as follows. Suppose it has 2 variables X and Y and the values are as below

X Y
A 2
4
6
8
10
B 1
2
C 1
2
3
4

and something like that....The missing value for variable X from row 2 to row 5 is A and the missing value for variable X in row 7 is B and the missing value for variable X from row 9 to 12 is C. What should I do to fill in the appropriate value in place of the missing values. The above dataset is just an example. In my actual data I have 89 different values for variable X and the number of missing values under it keep on changing.

Can someone please let me know what is an effective way to solve this problem ( the code has to be dynamic). Thanks in advance!

Thanks
Sohail Mohammad
Super Contributor
Super Contributor
Posts: 3,174

Re: Dynamic Imputation

Using a DATA step and a DO UNTIL(EOF); with an INFILE and END=EOF coded, you can input the data record and detect when you have one or two field values. When you have one value, do not overlay "X", only overlap/replace "Y", and only perform your OUTPUT when you have the correct data/variable condition. Some functions you will consider using are SCAN and also INPUT when assigning your true "X" and "Y" variables. If you iterate your DATA step with the DO/END and having the INPUT within the code portion, you then do not need to use the RETAIN statement since SAS will not return to the top of the DATA step.

Suggest you track your program with some well-placed PUTLOG '>diag-nnn>' / _ALL_; commands within your DO / END programming loop.


Scott Barry
SBBWorks, Inc.


Recommended Google advanced search arguments, this topic/post:

data step programming external data site:sas.com

data step parse external data input function site:sas.com

data step parse external data _infile_ site:sas.com
N/A
Posts: 0

Re: Dynamic Imputation

It's a simple data step.

data XXXX (drop = XorYisvoid voidValue);
set YourFile;

If x eq '' then XorTisvoid = 'X';
else XorTisvoid = 'y';

if XorTisvoid = 'X' then do;
if voidvalue ne X then voidvalue =x;
Y = voidvalue;
end;
else do;
if voidvalue ne Y then voidvalue =y;
Y = voidvalue;
end;

run;

I programed it by memory but i think it's a fine base to start
Occasional Contributor
Posts: 17

Re: Dynamic Imputation

I don't know why the data was posted that way...I dint mean to show up that way.

Actually the missing values are for X and the numeric values that show up under variable X is for variable Y.

So, the data is liek this:
Variable X: 'A' '' '' '' '' 'B' '' 'C' ''' ''

and the corresponding values for variable Y are: 2 4 6 8 10 1 2 1 2 3 4

All the missing values between 'A' and 'B' should be imputed with 'A' and all the missing values between 'B' and 'C' should be imputed with 'B' and so on....

@sasvage: What are the variables you are dropping in your data step option? Can you please elaborate?

Thanks
Sohail Mohammad
Respected Advisor
Posts: 3,894

Re: Dynamic Imputation

Soha

It might be worth to consult the excellent SAS Docu: http://support.sas.com/onlinedoc/913/getDoc/en/lrdict.hlp/a000146292.htm

From your example data given I assume that column input would solve your problem:
(I had to add this 'x' in order to post a working example).

data have;
infile datalines truncover;
input @3 x $ @5 y;
datalines;
x A 2
x 4
x 6
x 8
x 10
x B 1
x 2
x C 1
x 2
x 3
x 4
;
proc print data=have;
run;
N/A
Posts: 0

Re: Dynamic Imputation

I explain,

XorYisvoid= is for knowing that X or Y was void, but i didn't undestand the behavior of your data.
voidValue = for retaining the last value.

So you will need a do loop in scl, and some macrovariables. 89 variables is too much for programing with simple if.
The next code preteds to collect your variables names, and after using a do loop to fill 1 to 1 the variables, B to A, C to B ...
But i have a cuestion, the 89th var have to refill the other 88 if all are null? then you have to start from 89 to 1.

First of all you have to make a data step with variables, something like:
VarData
_N_ VarName
1 Name, (we call it A)
2 State, (we call it B)
3 ...

do i=1 to dim (Vardata-1)
call simput('Myindex',i);
submit continue;
data _NULL_;
if _N_ eq &MyIndex Then call simput ('ValueThatIhaveToFill',VarName);
if _N_ eq &MyIndex+1 Then call simput ('ValueThatIhaveToPass',VarName);
run;
data TheDataiHave;
If &ValueThatIhaveToFill eq '' then &ValueThatIhaveToFill (name, state) = &ValueThatIhaveToPass;
run;
endsubmit;
end; (do loop)

This is not tested but i think it's basically that you want.
SAS Employee
Posts: 1

Re: Dynamic Imputation

Hi Soha,

Is your data an external file? maybe you can use @ functions:

data x(drop=z);
infile 'c:\xy.txt' dlm =' ' firstobs=2;
length x $8. y 8.;
input @1 x $ @;
retain z;
if indexc(x,'12345667890')=0 then do;
input @3 y;
z=x;
end;
else do;
x=z;
input @1 y;
end;
run;
Respected Advisor
Posts: 3,777

Re: Dynamic Imputation

Looks like data I sometimes get from EXCEL. I assume your starting point is a SAS data set.

[pre]
data have;
input x $ y @@;
cards;
A 2 . 4 . 6 . 8 . 10
B 1 . 2
C 1 . 2 . 3 . 4
;;;;
run;
data need;
if 0 then set have(keep=x);
set have(rename=(x=_x));
x = coalesceC(_x,x);
run;
proc print;
run;
[/pre]


You can drop _X

[pre]
Obs x _x y

1 A A 2
2 A 4
3 A 6
4 A 8
5 A 10
6 B B 1
7 B 2
8 C C 1
9 C 2
10 C 3
11 C 4
[/pre]
SAS Employee
Posts: 104

Re: Dynamic Imputation

You could do this with a RETAIN statement. Consider the following:

/* Create some data to test with */
data initial;
input x $ y;
datalines;
A 2
. 4
. 6
. 8
. 10
B 1
. 2
C 1
. 2
. 3
. 4
;
run;

/* Impute missing X values from previous valid values */
data imputed;
set initial;
retain LastX " ";
if X="" then X=LastX;
else LastX=X;
drop LastX;
run;
proc print data=imputed;
run;

/*** Results *******
Obs x y
1 A 2
2 A 4
3 A 6
4 A 8
5 A 10
6 B 1
7 B 2
8 C 1
9 C 2
10 C 3
11 C 4
***********************/
Ask a Question
Discussion stats
  • 8 replies
  • 234 views
  • 0 likes
  • 7 in conversation