There seems to be an error: the only values of birthmonth in the output dataset are 2, 3, and 12. I think this is because the code uses only lag2 rather than every possible lag (lag3, lag4, and so on).
Focusing on scenarios 1 and 2 only for the moment, I'm trying the following code, which manually compares every age to the January age; if they are not equal, birth month is set based on that month and the loop should then exit. However, this code results in an infinite loop.
data want;
  set have;
  by id;
  retain _jan_age;
  if month = 1 then _jan_age = age;
  birthmonth = .;
  do while(birthmonth = .);
    if age NE _jan_age then birthmonth = month - 1;
    if month = 12 and age = _jan_age then birthmonth = 12;
  end;
run;
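A likely cause of the infinite loop: the DO WHILE runs inside a single pass of the DATA step, so month, age, and _jan_age never change while it loops. On any row where neither IF condition is true (for example the January row itself, where age equals _jan_age and month is not 12), birthmonth stays missing and the loop can never end. Below is a minimal, untested sketch of a loop-free alternative that keeps the same two rules. It assumes that have is sorted by id and month, has one row per id per month starting with January, and does not already contain a birthmonth variable; none of that is confirmed above.

```sas
data want;
  set have;
  by id;
  /* carry both values across the rows of one id */
  retain _jan_age birthmonth;

  /* reset the retained values at the start of each id */
  if first.id then do;
    _jan_age = .;
    birthmonth = .;
  end;

  if month = 1 then _jan_age = age;

  /* only decide once per id, and only after January has been seen */
  if birthmonth = . and _jan_age ne . then do;
    if age ne _jan_age then birthmonth = month - 1;
    else if month = 12 then birthmonth = 12;
  end;

  /* write one row per id, after all of its months are processed */
  if last.id then output;
  keep id birthmonth;
run;
```

The DATA step's own row-by-row iteration plays the role of the loop here: RETAIN keeps the first birthmonth that was found, and the check for a missing birthmonth stops later rows from overwriting it.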