turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- Base SAS Programming
- /
- Dynamic Imputation

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

03-11-2010 05:31 PM

Hi,

I have the data as follows. Suppose it has 2 variables X and Y and the values are as below

X Y

A 2

4

6

8

10

B 1

2

C 1

2

3

4

and something like that....The missing value for variable X from row 2 to row 5 is A and the missing value for variable X in row 7 is B and the missing value for variable X from row 9 to 12 is C. What should I do to fill in the appropriate value in place of the missing values. The above dataset is just an example. In my actual data I have 89 different values for variable X and the number of missing values under it keep on changing.

Can someone please let me know what is an effective way to solve this problem ( the code has to be dynamic). Thanks in advance!

Thanks

Sohail Mohammad

I have the data as follows. Suppose it has 2 variables X and Y and the values are as below

X Y

A 2

4

6

8

10

B 1

2

C 1

2

3

4

and something like that....The missing value for variable X from row 2 to row 5 is A and the missing value for variable X in row 7 is B and the missing value for variable X from row 9 to 12 is C. What should I do to fill in the appropriate value in place of the missing values. The above dataset is just an example. In my actual data I have 89 different values for variable X and the number of missing values under it keep on changing.

Can someone please let me know what is an effective way to solve this problem ( the code has to be dynamic). Thanks in advance!

Thanks

Sohail Mohammad

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Soha

03-11-2010 06:24 PM

Using a DATA step and a DO UNTIL(EOF); with an INFILE and END=EOF coded, you can input the data record and detect when you have one or two field values. When you have one value, do not overlay "X", only overlap/replace "Y", and only perform your OUTPUT when you have the correct data/variable condition. Some functions you will consider using are SCAN and also INPUT when assigning your true "X" and "Y" variables. If you iterate your DATA step with the DO/END and having the INPUT within the code portion, you then do not need to use the RETAIN statement since SAS will not return to the top of the DATA step.

Suggest you track your program with some well-placed PUTLOG '>diag-nnn>' / _ALL_; commands within your DO / END programming loop.

Scott Barry

SBBWorks, Inc.

Recommended Google advanced search arguments, this topic/post:

data step programming external data site:sas.com

data step parse external data input function site:sas.com

data step parse external data _infile_ site:sas.com

Suggest you track your program with some well-placed PUTLOG '>diag-nnn>' / _ALL_; commands within your DO / END programming loop.

Scott Barry

SBBWorks, Inc.

Recommended Google advanced search arguments, this topic/post:

data step programming external data site:sas.com

data step parse external data input function site:sas.com

data step parse external data _infile_ site:sas.com

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Soha

03-12-2010 07:43 AM

It's a simple data step.

data XXXX (drop = XorYisvoid voidValue);

set YourFile;

If x eq '' then XorTisvoid = 'X';

else XorTisvoid = 'y';

if XorTisvoid = 'X' then do;

if voidvalue ne X then voidvalue =x;

Y = voidvalue;

end;

else do;

if voidvalue ne Y then voidvalue =y;

Y = voidvalue;

end;

run;

I programed it by memory but i think it's a fine base to start

data XXXX (drop = XorYisvoid voidValue);

set YourFile;

If x eq '' then XorTisvoid = 'X';

else XorTisvoid = 'y';

if XorTisvoid = 'X' then do;

if voidvalue ne X then voidvalue =x;

Y = voidvalue;

end;

else do;

if voidvalue ne Y then voidvalue =y;

Y = voidvalue;

end;

run;

I programed it by memory but i think it's a fine base to start

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Soha

03-12-2010 05:19 PM

I don't know why the data was posted that way...I dint mean to show up that way.

Actually the missing values are for X and the numeric values that show up under variable X is for variable Y.

So, the data is liek this:

Variable X: 'A' '' '' '' '' 'B' '' 'C' ''' ''

and the corresponding values for variable Y are: 2 4 6 8 10 1 2 1 2 3 4

All the missing values between 'A' and 'B' should be imputed with 'A' and all the missing values between 'B' and 'C' should be imputed with 'B' and so on....

@sasvage: What are the variables you are dropping in your data step option? Can you please elaborate?

Thanks

Sohail Mohammad

Actually the missing values are for X and the numeric values that show up under variable X is for variable Y.

So, the data is liek this:

Variable X: 'A' '

and the corresponding values for variable Y are: 2 4 6 8 10 1 2 1 2 3 4

All the missing values between 'A' and 'B' should be imputed with 'A' and all the missing values between 'B' and 'C' should be imputed with 'B' and so on....

@sasvage: What are the variables you are dropping in your data step option? Can you please elaborate?

Thanks

Sohail Mohammad

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Soha

03-13-2010 01:06 AM

Soha

It might be worth to consult the excellent SAS Docu: http://support.sas.com/onlinedoc/913/getDoc/en/lrdict.hlp/a000146292.htm

From your example data given I assume that column input would solve your problem:

(I had to add this 'x' in order to post a working example).

data have;

infile datalines truncover;

input @3 x $ @5 y;

datalines;

x A 2

x 4

x 6

x 8

x 10

x B 1

x 2

x C 1

x 2

x 3

x 4

;

proc print data=have;

run;

It might be worth to consult the excellent SAS Docu: http://support.sas.com/onlinedoc/913/getDoc/en/lrdict.hlp/a000146292.htm

From your example data given I assume that column input would solve your problem:

(I had to add this 'x' in order to post a working example).

data have;

infile datalines truncover;

input @3 x $ @5 y;

datalines;

x A 2

x 4

x 6

x 8

x 10

x B 1

x 2

x C 1

x 2

x 3

x 4

;

proc print data=have;

run;

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Soha

03-15-2010 04:56 AM

I explain,

XorYisvoid= is for knowing that X or Y was void, but i didn't undestand the behavior of your data.

voidValue = for retaining the last value.

So you will need a do loop in scl, and some macrovariables. 89 variables is too much for programing with simple if.

The next code preteds to collect your variables names, and after using a do loop to fill 1 to 1 the variables, B to A, C to B ...

But i have a cuestion, the 89th var have to refill the other 88 if all are null? then you have to start from 89 to 1.

First of all you have to make a data step with variables, something like:

VarData

_N_ VarName

1 Name, (we call it A)

2 State, (we call it B)

3 ...

do i=1 to dim (Vardata-1)

call simput('Myindex',i);

submit continue;

data _NULL_;

if _N_ eq &MyIndex Then call simput ('ValueThatIhaveToFill',VarName);

if _N_ eq &MyIndex+1 Then call simput ('ValueThatIhaveToPass',VarName);

run;

data TheDataiHave;

If &ValueThatIhaveToFill eq '' then &ValueThatIhaveToFill (name, state) = &ValueThatIhaveToPass;

run;

endsubmit;

end; (do loop)

This is not tested but i think it's basically that you want.

XorYisvoid= is for knowing that X or Y was void, but i didn't undestand the behavior of your data.

voidValue = for retaining the last value.

So you will need a do loop in scl, and some macrovariables. 89 variables is too much for programing with simple if.

The next code preteds to collect your variables names, and after using a do loop to fill 1 to 1 the variables, B to A, C to B ...

But i have a cuestion, the 89th var have to refill the other 88 if all are null? then you have to start from 89 to 1.

First of all you have to make a data step with variables, something like:

VarData

_N_ VarName

1 Name, (we call it A)

2 State, (we call it B)

3 ...

do i=1 to dim (Vardata-1)

call simput('Myindex',i);

submit continue;

data _NULL_;

if _N_ eq &MyIndex Then call simput ('ValueThatIhaveToFill',VarName);

if _N_ eq &MyIndex+1 Then call simput ('ValueThatIhaveToPass',VarName);

run;

data TheDataiHave;

If &ValueThatIhaveToFill eq '' then &ValueThatIhaveToFill (name, state) = &ValueThatIhaveToPass;

run;

endsubmit;

end; (do loop)

This is not tested but i think it's basically that you want.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Soha

03-15-2010 12:31 AM

Hi Soha,

Is your data an external file? maybe you can use @ functions:

data x(drop=z);

infile 'c:\xy.txt' dlm =' ' firstobs=2;

length x $8. y 8.;

input @1 x $ @;

retain z;

if indexc(x,'12345667890')=0 then do;

input @3 y;

z=x;

end;

else do;

x=z;

input @1 y;

end;

run;

Is your data an external file? maybe you can use @ functions:

data x(drop=z);

infile 'c:\xy.txt' dlm =' ' firstobs=2;

length x $8. y 8.;

input @1 x $ @;

retain z;

if indexc(x,'12345667890')=0 then do;

input @3 y;

z=x;

end;

else do;

x=z;

input @1 y;

end;

run;

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Soha

03-15-2010 02:07 PM

Looks like data I sometimes get from EXCEL. I assume your starting point is a SAS data set.

[pre]

data have;

input x $ y @@;

cards;

A 2 . 4 . 6 . 8 . 10

B 1 . 2

C 1 . 2 . 3 . 4

;;;;

run;

data need;

if 0 then set have(keep=x);

set have(rename=(x=_x));

x = coalesceC(_x,x);

run;

proc print;

run;

[/pre]

You can drop _X

[pre]

Obs x _x y

1 A A 2

2 A 4

3 A 6

4 A 8

5 A 10

6 B B 1

7 B 2

8 C C 1

9 C 2

10 C 3

11 C 4

[/pre]

[pre]

data have;

input x $ y @@;

cards;

A 2 . 4 . 6 . 8 . 10

B 1 . 2

C 1 . 2 . 3 . 4

;;;;

run;

data need;

if 0 then set have(keep=x);

set have(rename=(x=_x));

x = coalesceC(_x,x);

run;

proc print;

run;

[/pre]

You can drop _X

[pre]

Obs x _x y

1 A A 2

2 A 4

3 A 6

4 A 8

5 A 10

6 B B 1

7 B 2

8 C C 1

9 C 2

10 C 3

11 C 4

[/pre]

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Soha

03-17-2010 02:19 PM

You could do this with a RETAIN statement. Consider the following:

/* Create some data to test with */

data initial;

input x $ y;

datalines;

A 2

. 4

. 6

. 8

. 10

B 1

. 2

C 1

. 2

. 3

. 4

;

run;

/* Impute missing X values from previous valid values */

data imputed;

set initial;

retain LastX " ";

if X="" then X=LastX;

else LastX=X;

drop LastX;

run;

proc print data=imputed;

run;

/*** Results *******

Obs x y

1 A 2

2 A 4

3 A 6

4 A 8

5 A 10

6 B 1

7 B 2

8 C 1

9 C 2

10 C 3

11 C 4

***********************/

/* Create some data to test with */

data initial;

input x $ y;

datalines;

A 2

. 4

. 6

. 8

. 10

B 1

. 2

C 1

. 2

. 3

. 4

;

run;

/* Impute missing X values from previous valid values */

data imputed;

set initial;

retain LastX " ";

if X="" then X=LastX;

else LastX=X;

drop LastX;

run;

proc print data=imputed;

run;

/*** Results *******

Obs x y

1 A 2

2 A 4

3 A 6

4 A 8

5 A 10

6 B 1

7 B 2

8 C 1

9 C 2

10 C 3

11 C 4

***********************/