Filling up missing values in panel data

Reply
Occasional Contributor MCB
Occasional Contributor
Posts: 15

Filling up missing values in panel data

Hello,

I have this panel dataset:

IDYearAgeATPWanted_ATP
1199957

5.2

5.2

12000585.05.0
12001596.26.2
1200260.6.2
1200361.6.2
1200462.6.2
1200563.6.2
1200664.6.2
21999615.45.4
22000623.23.2
22001635.05.0
2200264.5.0

2

200365..

I would like to create the
column "Wanted_ATP" so the last information I have on ATP (year 2001)
will be used for the rest of the years (from 2002 and forward). A condition is
that the person should not be older than 64 (ID 2). In that case the ATP_wanted should
be missing. In the example above this happens in 2003 för ID 2.

Thank you!

/Marcela

Respected Advisor
Posts: 3,799

Re: Filling up missing values in panel data

Something like this perhaps.

data atp;
   infile cards dsd firstobs=2;
  
input id year age atp;
   cards;
ID,Year,Age,ATP,Wanted_ATP
1,1999,57, 5.2,5.2
1,2000,58,5.0,5.0
1,2001,59,6.2,6.2
1,2002,60,.,6.2
1,2003,61,.,6.2
1,2004,62,.,6.2
1,2005,63,.,6.2
1,2006,64,.,6.2
2,1999,61,5.4,5.4
2,2000,62,3.2,3.2
2,2001,63,5.0,5.0
2,2002,64,.,5.0
2,2003,65,.,.
;;;;
   run;
data locf;
   set atp;
   by id;
   if first.id then want=.;
  
retain want;
   want = coalesce(atp,want);
  
if age ge 65 then want=.;
  
run;
9-11-2014 5-28-53 AM.png
proc print;
  
run;
Occasional Contributor MCB
Occasional Contributor
Posts: 15

Re: Filling up missing values in panel data

Posted in reply to data_null__

Thank you!

This works fine!

Contributor
Posts: 25

Re: Filling up missing values in panel data

This seems like a pretty complex missing data question that very much requires an in-depth understanding of the actual data and what you want to do with it. I would suggest consulting an experienced statistician, who ideally understands the specific field of research, on whether e.g. something like multiple imputation, extrapolation with some model or some other approach could be appropriate.

Trusted Advisor
Posts: 1,228

Re: Filling up missing values in panel data

proc sql;

create table want as

select a.id,year,age, a.atp,case when age<65 then coalesce(a.atp,b.atp)

else . end as wanted_atp

from atp a

left join (select id,atp from atp

where atp ne .

group by id

having age=max(age)) b on a.id=b.id

order by a.id,year;

quit;

Ask a Question
Discussion stats
  • 4 replies
  • 273 views
  • 3 likes
  • 4 in conversation