09-11-2014 05:47 AM
I have this panel dataset:
I would like to create the
column "Wanted_ATP" so the last information I have on ATP (year 2001)
will be used for the rest of the years (from 2002 and forward). A condition is
that the person should not be older than 64 (ID 2). In that case the ATP_wanted should
be missing. In the example above this happens in 2003 för ID 2.
09-11-2014 06:30 AM
Something like this perhaps.
09-11-2014 07:35 AM
This seems like a pretty complex missing data question that very much requires an in-depth understanding of the actual data and what you want to do with it. I would suggest consulting an experienced statistician, who ideally understands the specific field of research, on whether e.g. something like multiple imputation, extrapolation with some model or some other approach could be appropriate.
09-11-2014 08:04 AM
create table want as
select a.id,year,age, a.atp,case when age<65 then coalesce(a.atp,b.atp)
else . end as wanted_atp
from atp a
left join (select id,atp from atp
where atp ne .
group by id
having age=max(age)) b on a.id=b.id
order by a.id,year;