09-19-2016 06:33 PM
data have;
input id var1 var2;
cards;
1 3 .
1 5 .
1 . 4
1 . 7
2 4 2
2 8 .
2 . 8
2 9 .
2 . .
3 2 3
3 7 .
3 . 9
3 . 6
4 . 2
5 . 1
5 9 2
6 . 5
6 . 3
run;
I was thinking to generate two new variables, e.g., variable “count” to count by ID, and variable “total” represents total observations for each ID, use those two new variables as conditions for different missing value replacement. However, look-ahead replacement is out of my reach.
I also find some answers online, but not exactly what I want, like the following one is replacing all missing values with lag values.
data want;
set have;
n=_n_;
if missing(var1) then do;
do until (not missing(var1));
n=n-1;
set have(keep= var1) point=n; *second SET statement;
end;
end;
run;
Desired output:
Thanks!
09-19-2016 10:25 PM - edited 09-19-2016 11:59 PM
Double DO UNTIL() to the rescue:
data have;
input id var1 var2;
cards;
1 3 .
1 5 .
1 . 4
1 . 7
2 4 2
2 8 .
2 . 8
2 9 .
2 . .
3 2 3
3 7 .
3 . 9
3 . 6
4 . 2
5 . 1
5 9 2
6 . 5
6 . 3
;
data want;
array firstExist{2};
array lastExist{2};
array var{2};
do until(last.id);
set have; by id;
do i = 1 to dim(var);
if missing(firstExist{i}) then firstExist{i} = var{i};
end;
end;
do until(last.id);
set have; by id;
do i = 1 to dim(var);
if missing(var{i}) then var{i} = coalesce(lastExist{i}, firstExist{i}, 0);
else lastExist{i} = var{i};
end;
output;
end;
keep id var:;
run;
proc print data=want noobs; var id var1 var2; run;
09-19-2016 10:25 PM - edited 09-19-2016 11:59 PM
Double DO UNTIL() to the rescue:
data have;
input id var1 var2;
cards;
1 3 .
1 5 .
1 . 4
1 . 7
2 4 2
2 8 .
2 . 8
2 9 .
2 . .
3 2 3
3 7 .
3 . 9
3 . 6
4 . 2
5 . 1
5 9 2
6 . 5
6 . 3
;
data want;
array firstExist{2};
array lastExist{2};
array var{2};
do until(last.id);
set have; by id;
do i = 1 to dim(var);
if missing(firstExist{i}) then firstExist{i} = var{i};
end;
end;
do until(last.id);
set have; by id;
do i = 1 to dim(var);
if missing(var{i}) then var{i} = coalesce(lastExist{i}, firstExist{i}, 0);
else lastExist{i} = var{i};
end;
output;
end;
keep id var:;
run;
proc print data=want noobs; var id var1 var2; run;
09-19-2016 11:00 PM
data have; input id var1 var2; n+1; cards; 1 3 . 1 5 . 1 . 4 1 . 7 2 4 2 2 8 . 2 . 8 2 9 . 2 . . 3 2 3 3 7 . 3 . 9 3 . 6 4 . 2 5 . 1 5 9 2 6 . 5 6 . 3 ; run; data temp; set have; by id; retain v1 v2; if first.id then do;v1=.;v2=.;end; if not missing(var1) then v1=var1; if not missing(var2) then v2=var2; drop var1 var2; run; proc sort data=temp;by descending n;run; data temp; set temp; by id notsorted; retain var1 var2; if first.id then do;var1=.;var2=.;end; if not missing(v1) then var1=v1; if not missing(v2) then var2=v2; drop v1 v2; run; proc sort data=temp;by n;run; proc stdize data=temp out=want(drop=n) reponly missing=0; var var1 var2; run;
09-20-2016 05:47 PM
Hi @Ksharp, I think the better way to read a dataset backwards is with the point= option, as in:
data ssalc / view=ssalc;
do i = nobs to 1 by -1;
set sashelp.class nobs=nobs point=i;
output;
end;
stop;
run;
proc print data=ssalc; run;
It doesn't require the creation of an order variable (your n) and should be faster than a sort.
09-21-2016 12:48 AM
09-20-2016 04:48 PM
Thanks a lot PG and Ksharp! Both work great! I would like to choose both as solutions; but I can only choose one, so I choose the first response.
Need further help from the community? Please ask a new question.