I have a panel dataset that is unbalanced. There are not currently observations for every subject in every year. I would like to create a balanced dataset that has a row for every subject at every year, but have missing observations for the other variables where the data is currently missing. This is easy to do in STATA using the tsfill command, but I can't find a way to easily do this in SAS. I'm assuming I need to create some kind of loop in a data step, but I'm not sure how to apply the loop only to time periods that are currently missing. My file has about 100,000 records before expanding to the balanced panel, so I need to be able to do this programmatically and not looking at each individual case. data have;
infile datalines ;
input id:$8. Year:8. Department:$8. Program_Code:$8.;
datalines;
PersonA 2000 DeptA Prog20
PersonA 2001 DeptA Prog20
PersonA 2002 DeptA Prog20
PersonA 2003 DeptA Prog20
PersonB 2002 DeptB Prog15
PersonB 2003 DeptB Prog15
PersonB 2004 DeptB Prog15
PersonC 2001 DeptC Prog05
PersonC 2002 DeptC Prog05
PersonC 2003 DeptC Prog05
PersonC 2004 DeptD Prog25
PersonC 2005 DeptD Prog25
;
run;
data want;
infile datalines ;
input id:$8. Year:8. Department:$8. Program_Code:$8.;
datalines;
PersonA 2000 DeptA Prog20
PersonA 2001 DeptA Prog20
PersonA 2002 DeptA Prog20
PersonA 2003 DeptA Prog20
PersonA 2004 . .
PersonA 2005 . .
PersonB 2000 . .
PersonB 2001 . .
PersonB 2002 DeptB Prog15
PersonB 2003 DeptB Prog15
PersonB 2004 DeptB Prog15
PersonB 2005 . .
PersonC 2000 . .
PersonC 2001 DeptC Prog05
PersonC 2002 DeptC Prog05
PersonC 2003 DeptC Prog05
PersonC 2004 DeptD Prog25
PersonC 2005 DeptD Prog25
;
run; The post at https://communities.sas.com/t5/General-SAS-Programming/How-to-balance-the-quot-unbalanced-data-quot-and-transpose-the/td-p/172063 said this can be done with PROC TRANSPOSE, but I need the data in the long format, and not the wide format. I have tried PROC Expand, but because it is character data I haven't been able to get that to work.
... View more