I have a data set given below:
data have;
input x y z a b c;
datalines;
1 2 3 4 . .
5 8 9 1 2 3
1 4 7 8 5 6
4 5 6 7 8 9
. . . . 5 6
run;
The two values missing from the first row appear in the last row, and the four values present in the first row are missing from the last row.
I want the missing values filled in so there is one complete row (either the first or the last), with the duplicate removed.
The output should look like this:
x y z a b c
1 2 3 4 5 6
5 8 9 1 2 3
1 4 7 8 5 6
4 5 6 7 8 9
Thanks in advance.
How do you know that the first and last rows are the ones to use to fix the missing values? Is this a one-time manual process?
OK, then do it for both rows: fill in the correct values in both the first and the last row. I only want correct data.
data have;
input x y z a b c;
datalines;
1 2 3 4 . .
5 8 9 1 2 3
1 4 7 8 5 6
4 5 6 7 8 9
. . . . 5 6
run;
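/* Step 1: build one complete row from the first and last observations */
data want;
  set have end=last;
  array arrayname _numeric_;
  array arrayname2(*) x1 y2 z2 a2 b2 c2;
  retain arrayname2;
  /* copy the first row into the retained variables */
  if _n_=1 then do;
    do i=1 to dim(arrayname);
      arrayname2(i)=arrayname(i);
    end;
  end;
  /* on the last row, fill b2 and c2 if they are still missing */
  if last then do;
    if b2=. then b2=b;
    if c2=. then c2=c;
    output;
  end;
  keep x1 y2 z2 a2 b2 c2;
run;
/* Step 2: use the complete row to fill missing values in every row of have */
data want2;
  set have;
  if _n_=1 then set want;
  array arrayname(*) x y z a b c;
  array arrayname2(*) x1 y2 z2 a2 b2 c2;
  do i=1 to dim(arrayname);
    if arrayname(i)=. then arrayname(i)=arrayname2(i);
  end;
  keep x y z a b c;
run;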
Is this what you want?
How do you know what's correct? Why isn't the missing value filled in from the second row rather than the last?
And why is the last row filled in from the first? Or is it because they're missing opposite variables? Is there an ID variable to identify these rows?
If the business logic applies only to these two rows, then do it manually:
data have;
input x y z a b c;
datalines;
1 2 3 4 . .
5 8 9 1 2 3
1 4 7 8 5 6
4 5 6 7 8 9
. . . . 5 6
;
run;
data want;
  set have end=last;
  if _n_=1 then do;
    b=5; c=6;
  end;
  if not last then output;
run;
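/* Give every row that has a missing value the same group key (n=1); */
/* complete rows keep their own row number as the key.               */
data want;
  set have;
  count=0;
  array v _numeric_;
  do over v;
    if v=. then count+1;
  end;
  if count>0 then n=1;
  else n=_n_;
run;
/* SUM ignores missing values, so the partial rows collapse into one complete row */
proc sql;
  create table final as
  select sum(x) as x, sum(y) as y, sum(z) as z, sum(a) as a, sum(b) as b, sum(c) as c
  from want
  group by n;
quit;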
data have;
input x y z a b c;
datalines;
1 2 3 4 . .
5 8 9 1 2 3
1 4 7 8 5 6
4 5 6 7 8 9
. . . . 5 6
;
run;
data want;
  set have end=last;
  if _n_ eq 1 then do;
    set have(rename=(x=_x y=_y z=_z a=_a b=_b c=_c)) nobs=nobs point=nobs;
    x=coalesce(x,_x);
    y=coalesce(y,_y);
    z=coalesce(z,_z);
    a=coalesce(a,_a);
    b=coalesce(b,_b);
    c=coalesce(c,_c);
  end;
  if not last;
  drop _:;
run;
Xia Keshan
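The COALESCE approach above could also be written with arrays, so the variables are listed only once instead of being renamed and coalesced one by one. This is only a minimal sketch, assuming the data set have from the question and assuming that the first and last rows are the pair to combine. The names lastvals, v, p, n and i are made up for illustration.

```sas
data want;
  array v(*) x y z a b c;
  /* temporary array: retained automatically and not written to the output */
  array lastvals(6) _temporary_;
  if _n_ = 1 then do;
    /* n is set at compile time to the number of rows; read that row directly */
    p = n;
    set have point=p nobs=n;
    do i = 1 to dim(v);
      lastvals(i) = v(i);
    end;
  end;
  /* normal sequential read; on the first pass this overwrites x..c with row 1 */
  set have end=last;
  if _n_ = 1 then do i = 1 to dim(v);
    v(i) = coalesce(v(i), lastvals(i));
  end;
  /* drop the last row, whose values are now merged into the first */
  if not last;
  drop i;
run;
```

With the sample data, this should leave four rows, with the first row reading 1 2 3 4 5 6. If more variables are added, only the array statement and the size of lastvals need to change.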