I have a data set given below:
data have;
input x y z a b c;
datalines;
1 2 3 4 . .
5 8 9 1 2 3
1 4 7 8 5 6
4 5 6 7 8 9
. . . . 5 6
;
run;
Two values that belong to the first row (b and c) appear in the last row, and the four values that the first row does have (x, y, z, a) are missing from the last row.
I want to combine these into one correct row, either the first or the last, and remove the duplicate.
The output should look like this:
x y z a b c
1 2 3 4 5 6
5 8 9 1 2 3
1 4 7 8 5 6
4 5 6 7 8 9
Thanks in advance.
How do you know that the first and last rows are the ones to use when fixing the missing values? Is this a one-time manual process?
OK, then do it for both rows: fill in the correct values in both the first and the last row. I only want the correct data.
data have;
input x y z a b c;
datalines;
1 2 3 4 . .
5 8 9 1 2 3
1 4 7 8 5 6
4 5 6 7 8 9
. . . . 5 6
;
run;
data want;
set have end=last;
array arrayname _numeric_;
array arrayname2(*) x1 y2 z2 a2 b2 c2;
retain arrayname2;
if _n_=1 then do;
do i=1 to dim(arrayname);
arrayname2(i)=arrayname(i);
end;
end;
if last then do;
if b2=. then b2=b;
if c2=. then c2=c;
output;
end;
keep x1 y2 z2 a2 b2 c2;
run;
data want2;
set have;
if _n_=1 then set want;
array arrayname(*) x y z a b c;
array arrayname2(*) x1 y2 z2 a2 b2 c2;
do i=1 to dim(arrayname);
if arrayname(i)=. then arrayname(i)=arrayname2(i);
end;
keep x y z a b c;
run;
Is this what you want?
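Note that want2 fills in both partial rows, so its first and last rows come out identical (1 2 3 4 5 6). If only one copy is wanted, a minimal sketch follows. It assumes the duplicate is always the last observation, as in this sample, and want3 is just a hypothetical output name.

```sas
/* Assumes want2 was created by the steps above; want3 is a hypothetical name. */
data want3;
  set want2 end=last;
  if not last;   /* subsetting IF: drop the final, duplicated observation */
run;
```

If the duplicate could sit anywhere else in the data, this positional rule would not be safe and a key-based approach would be needed.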
How do you know what's correct? Why isn't the missing value filled in from the second row rather than the last?
And why is the last row filled in from the first? Is it because they're missing opposite variables? Is there an ID variable to identify these rows?
If the business logic applies only to these two rows, then do it manually:
data have;
input x y z a b c;
datalines;
1 2 3 4 . .
5 8 9 1 2 3
1 4 7 8 5 6
4 5 6 7 8 9
. . . . 5 6
;
run;
data want;
set have end=last;
if _n_=1 then do;
b=5; c=6;
end;
if not last then output;
run;
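If an ID variable did exist to tie the two partial rows together, the hard-coded values would not be needed. The sketch below assumes a hypothetical key variable named id (it is not in the posted data) and uses the UPDATE statement, which by default does not let missing transaction values overwrite existing ones. The data set names have_id and want_id are also made up for illustration.

```sas
/* Assumption: id is a hypothetical key; rows sharing an id describe one record. */
data have_id;
  input id x y z a b c;
  datalines;
1 1 2 3 4 . .
2 5 8 9 1 2 3
3 1 4 7 8 5 6
4 4 5 6 7 8 9
1 . . . . 5 6
;
run;

/* UPDATE requires the data to be sorted by the BY variable. */
proc sort data=have_id;
  by id;
run;

/* The master is an empty copy (obs=0); every row acts as a transaction.
   Missing values in a transaction do not replace values already held,
   so each BY group collapses to a single row of nonmissing values. */
data want_id;
  update have_id(obs=0) have_id;
  by id;
run;
```

With this sample, the two rows for id 1 collapse into one row with x through c equal to 1 2 3 4 5 6, and the other ids pass through unchanged.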
data want;
set have;
count=0;
array v _numeric_;
do over v;
if v=. then count+1;
end;
if count>0 then n=1;
else n=_n_;
run;
proc sql;
create table final as
select sum(x) as x,sum(y) as y, sum(z) as z, sum(a) as a,sum(b) as b, sum(c) as c
from want
group by n;
quit;
data have;
input x y z a b c;
datalines;
1 2 3 4 . .
5 8 9 1 2 3
1 4 7 8 5 6
4 5 6 7 8 9
. . . . 5 6
;
run;
data want;
set have end=last;
if _n_ eq 1 then do;
set have(rename=(x=_x y=_y z=_z a=_a b=_b c=_c)) nobs=nobs point=nobs;
x=coalesce(x,_x); y=coalesce(y,_y); z=coalesce(z,_z);
a=coalesce(a,_a); b=coalesce(b,_b); c=coalesce(c,_c);
end;
if not last;
drop _:;
run;
Xia Keshan