Hello, Let's say I have this dataset : data have;
input id $ var1 var2 var3 maxvar;
datalines;
id1 . . 999 3
id2 . 10 10 3 id3 10 . . 1
;
run; The . are what I called missing values. I want that, whenever I finda missing value in var1-var3, I go and take the first non-missing value to the right and split it by the amount of columns I went through to find this value. To help myself doing so, I implemented a column that gives the column in which I can find the last non missing value : maxvar If maxvar=2, it means that var2 is the last non missing value of all vars (so var3, var4,... will be . ) I guess it will be more clear with an example : using the data set I typed above, here is the result that I would like : id var1 var2 var3 maxvar id1 333 333 333 3 id2 5 5 10 3 id3 10 . . 1 I know that witch this example I could do something like "if var1=. and var2<>. then var1=var2/2 and var2=var2/2" etc, but the problem is that the number of var colums is variable and can go up to 100 colums ! I thought of using array statement and lag function in a data step, but I'm not sure how to manage the variable amount of columns I'll have to go through. Don't hesitate to ask if you want me to be more precise. Thanks !
... View more