Hi Astounding!

Your code is very helpful! It gets me very close to the solution I need, but I need some more help tweaking it to handle the missing values. I adapted your code to work with my data set like this:

DATA HGCB3;
  set HGCBdata;
  array HGC {1979:1988} HGC1979 - HGC1988;
  limit = mod(CYRB, 10000);
  if (1979 <= limit) then do _n_ = 1979 to min(limit, 1988);
    HGCB = max(hgcb, HGC{limit});
  end;
  drop limit;
RUN;

An example of the data set produced by the above code is:

ID  CYRB  HGC1979  HGC1980  HGC1981  HGC1982  HGC1983  HGC1984  HGC1986  HGC1988  HGCB
01  1979  12       12       12       12       12       12       12       12       12
02  1981  9        10       11       12       12       12       12       12       11
03  1976  8        9        10       10       10       10       .        .        .
04  1984  10       11       12       .        .        .        .        12       .
05  1987  .        .        13       13       13       13       13       13       .
n   1973  .        11       12       13       14       14       14       14       .

However, I would like to address the missing values so that the data set produced looks like this:

ID  CYRB  HGC1979  HGC1980  HGC1981  HGC1982  HGC1983  HGC1984  HGC1986  HGC1988  HGCB
01  1979  12       12       12       12       12       12       12       12       12
02  1981  9        10       11       12       12       12       12       12       11
03  1976  8        9        10       10       10       10       .        .        8*
04  1984  10       11       12       .        .        .        .        14       12*
05  1987  .        .        13       13       13       13       13       13       13*
n   1973  .        11       12       13       14       14       14       14       11*

As you can see from the HGCB values marked with an asterisk (*), I would like to avoid missing values and instead populate HGCB with the closest previous non-missing HGCyear value.

Also important (I was mistaken about this in my previous request): there are no participants born after 1988, meaning that there are no CYRB values > 1988. However, there are participants born before 1979, meaning that a number of participants have a CYRB value < 1979. Thus, the IF statement (which is currently if (1979 <= limit) then do _n_ = 1979 to min(limit, 2014);) needs to be updated, and I am not sure how to do that.

Thank you again for your help!

AMIHIC
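To make the rule in the desired table concrete, here is a rough sketch of the logic as I read it from the starred rows. It has not been run, and it rests on two assumptions that are not stated above: that HGCB does not already exist in HGCBdata (so it can be reset on every row), and that participants with CYRB < 1979, or with nothing non-missing at or before CYRB, should get the earliest non-missing HGCyear value.

```sas
data HGCB3;
  set HGCBdata;
  array HGC {1979:1988} HGC1979 - HGC1988;
  limit = mod(CYRB, 10000);

  /* Assumption: HGCB is a new variable, so start every row at missing */
  HGCB = .;

  /* Born 1979 or later: walk backward from the birth year (capped at 1988)
     and stop at the first non-missing value, i.e. the closest previous one */
  if limit >= 1979 then
    do _n_ = min(limit, 1988) to 1979 by -1 until (not missing(HGCB));
      HGCB = HGC{_n_};
    end;

  /* Assumption: born before 1979, or nothing found above:
     walk forward and take the earliest non-missing value */
  if missing(HGCB) then
    do _n_ = 1979 to 1988 until (not missing(HGCB));
      HGCB = HGC{_n_};
    end;

  drop limit;
run;
```

Tracing the example rows by hand, ID 04 walks back from 1984 to HGC1981 (12), ID 05 walks back from 1987 to HGC1986 (13), and IDs 03 and n fall through to the forward search (8 and 11). Because the array statement names HGC1979 - HGC1988, any year variables missing from the input (for example HGC1985 and HGC1987) would be created as all-missing and simply skipped.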