BookmarkSubscribeRSS Feed
Soha
Calcite | Level 5
Hi,

I have the data as follows. Suppose it has 2 variables X and Y and the values are as below

X Y
A 2
4
6
8
10
B 1
2
C 1
2
3
4

and something like that....The missing value for variable X from row 2 to row 5 is A and the missing value for variable X in row 7 is B and the missing value for variable X from row 9 to 12 is C. What should I do to fill in the appropriate value in place of the missing values. The above dataset is just an example. In my actual data I have 89 different values for variable X and the number of missing values under it keep on changing.

Can someone please let me know what is an effective way to solve this problem ( the code has to be dynamic). Thanks in advance!

Thanks
Sohail Mohammad
8 REPLIES 8
sbb
Lapis Lazuli | Level 10 sbb
Lapis Lazuli | Level 10
Using a DATA step and a DO UNTIL(EOF); with an INFILE and END=EOF coded, you can input the data record and detect when you have one or two field values. When you have one value, do not overlay "X", only overlap/replace "Y", and only perform your OUTPUT when you have the correct data/variable condition. Some functions you will consider using are SCAN and also INPUT when assigning your true "X" and "Y" variables. If you iterate your DATA step with the DO/END and having the INPUT within the code portion, you then do not need to use the RETAIN statement since SAS will not return to the top of the DATA step.

Suggest you track your program with some well-placed PUTLOG '>diag-nnn>' / _ALL_; commands within your DO / END programming loop.


Scott Barry
SBBWorks, Inc.


Recommended Google advanced search arguments, this topic/post:

data step programming external data site:sas.com

data step parse external data input function site:sas.com

data step parse external data _infile_ site:sas.com
deleted_user
Not applicable
It's a simple data step.

data XXXX (drop = XorYisvoid voidValue);
set YourFile;

If x eq '' then XorTisvoid = 'X';
else XorTisvoid = 'y';

if XorTisvoid = 'X' then do;
if voidvalue ne X then voidvalue =x;
Y = voidvalue;
end;
else do;
if voidvalue ne Y then voidvalue =y;
Y = voidvalue;
end;

run;

I programed it by memory but i think it's a fine base to start
Soha
Calcite | Level 5
I don't know why the data was posted that way...I dint mean to show up that way.

Actually the missing values are for X and the numeric values that show up under variable X is for variable Y.

So, the data is liek this:
Variable X: 'A' '' '' '' '' 'B' '' 'C' ''' ''

and the corresponding values for variable Y are: 2 4 6 8 10 1 2 1 2 3 4

All the missing values between 'A' and 'B' should be imputed with 'A' and all the missing values between 'B' and 'C' should be imputed with 'B' and so on....

@sasvage: What are the variables you are dropping in your data step option? Can you please elaborate?

Thanks
Sohail Mohammad
Patrick
Opal | Level 21
Soha

It might be worth to consult the excellent SAS Docu: http://support.sas.com/onlinedoc/913/getDoc/en/lrdict.hlp/a000146292.htm

From your example data given I assume that column input would solve your problem:
(I had to add this 'x' in order to post a working example).

data have;
infile datalines truncover;
input @3 x $ @5 y;
datalines;
x A 2
x 4
x 6
x 8
x 10
x B 1
x 2
x C 1
x 2
x 3
x 4
;
proc print data=have;
run;
deleted_user
Not applicable
I explain,

XorYisvoid= is for knowing that X or Y was void, but i didn't undestand the behavior of your data.
voidValue = for retaining the last value.

So you will need a do loop in scl, and some macrovariables. 89 variables is too much for programing with simple if.
The next code preteds to collect your variables names, and after using a do loop to fill 1 to 1 the variables, B to A, C to B ...
But i have a cuestion, the 89th var have to refill the other 88 if all are null? then you have to start from 89 to 1.

First of all you have to make a data step with variables, something like:
VarData
_N_ VarName
1 Name, (we call it A)
2 State, (we call it B)
3 ...

do i=1 to dim (Vardata-1)
call simput('Myindex',i);
submit continue;
data _NULL_;
if _N_ eq &MyIndex Then call simput ('ValueThatIhaveToFill',VarName);
if _N_ eq &MyIndex+1 Then call simput ('ValueThatIhaveToPass',VarName);
run;
data TheDataiHave;
If &ValueThatIhaveToFill eq '' then &ValueThatIhaveToFill (name, state) = &ValueThatIhaveToPass;
run;
endsubmit;
end; (do loop)

This is not tested but i think it's basically that you want.
fafajoe
SAS Employee
Hi Soha,

Is your data an external file? maybe you can use @ functions:

data x(drop=z);
infile 'c:\xy.txt' dlm =' ' firstobs=2;
length x $8. y 8.;
input @1 x $ @;
retain z;
if indexc(x,'12345667890')=0 then do;
input @3 y;
z=x;
end;
else do;
x=z;
input @1 y;
end;
run;
data_null__
Jade | Level 19
Looks like data I sometimes get from EXCEL. I assume your starting point is a SAS data set.

[pre]
data have;
input x $ y @@;
cards;
A 2 . 4 . 6 . 8 . 10
B 1 . 2
C 1 . 2 . 3 . 4
;;;;
run;
data need;
if 0 then set have(keep=x);
set have(rename=(x=_x));
x = coalesceC(_x,x);
run;
proc print;
run;
[/pre]


You can drop _X

[pre]
Obs x _x y

1 A A 2
2 A 4
3 A 6
4 A 8
5 A 10
6 B B 1
7 B 2
8 C C 1
9 C 2
10 C 3
11 C 4
[/pre]
SASJedi
SAS Super FREQ
You could do this with a RETAIN statement. Consider the following:

/* Create some data to test with */
data initial;
input x $ y;
datalines;
A 2
. 4
. 6
. 8
. 10
B 1
. 2
C 1
. 2
. 3
. 4
;
run;

/* Impute missing X values from previous valid values */
data imputed;
set initial;
retain LastX " ";
if X="" then X=LastX;
else LastX=X;
drop LastX;
run;
proc print data=imputed;
run;

/*** Results *******
Obs x y
1 A 2
2 A 4
3 A 6
4 A 8
5 A 10
6 B 1
7 B 2
8 C 1
9 C 2
10 C 3
11 C 4
***********************/
Check out my Jedi SAS Tricks for SAS Users

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 957 views
  • 0 likes
  • 7 in conversation