The LAG() function sets up a queue, where an element is taken from the "top" and a new one inserted at the "bottom" whenever the function is called. So, calling the function conditionally is usually a bad idea, with confusing results.
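To make the queue behavior concrete, here is a minimal sketch on made-up data. The data set and variable names (demo, lag_demo, x, lag_always, lag_cond) are hypothetical and only serve the illustration.

```sas
/* Hypothetical sample data, for illustration only */
data demo;
  input x;
  datalines;
1
2
3
4
5
6
;
run;

data lag_demo;
  set demo;
  /* Executed on every observation: returns x from the previous observation */
  lag_always = lag(x);
  /* Executed only when x is even: this LAG call has its own queue, which
     holds the x from the last time this statement ran, not the previous row */
  if mod(x, 2) = 0 then lag_cond = lag(x);
run;

proc print data=lag_demo;
run;
```

Here lag_always holds the value from the previous row (missing, 1, 2, 3, 4, 5), while lag_cond is missing for x = 1, 2, 3 and 5, is 2 for x = 4, and is 4 for x = 6, because the conditional call only feeds the even values into its queue.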
You do call it unconditionally, which is good.
However, at the point where lag(category) is called, category is already "A" whenever the first IF was not true, and that "A" is what goes into the queue.
Use RETAIN and BY instead.
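data taggedData;
  set work.untaggedData;
  by id;
  retain category;
  /* Call LAG on every observation so its queue stays in step with the data */
  prev_year = lag(year);
  if first.id then category = "A";
  else if prev_year - year >= 2 then category = "B";
  drop prev_year;
run;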
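A quick way to check the RETAIN and BY pattern is to run it on a few made-up rows. The sketch below uses hypothetical data set names and values, and it assumes the data are sorted by id with years ascending within each id, so the gap is computed as year minus the previous year. If your data are sorted the other way, the subtraction needs to be reversed.

```sas
/* Hypothetical input: ids and years are made up for illustration */
data untagged_demo;
  input id year;
  datalines;
1 2010
1 2011
1 2014
1 2015
2 2012
2 2013
;
run;

data tagged_demo;
  set untagged_demo;
  by id;
  length category $1;
  retain category;
  /* LAG runs on every observation, so its queue always holds the previous row */
  prev_year = lag(year);
  if first.id then category = "A";
  /* Assumes ascending years within id, hence year - prev_year */
  else if year - prev_year >= 2 then category = "B";
run;

proc print data=tagged_demo;
run;
```

With these rows, id 1 gets "A" for 2010 and 2011 and then "B" for 2014 and 2015 (the value is retained after the gap), and id 2 gets "A" for both years. The prev_year value on the first row of id 2 comes from id 1, but it is never used because the first.id branch handles that row.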