I’m doing analysis on a longitudinal study with up to 10 visits for some participants. Some data are missing. If data for a participant are not missing at a subsequent visit, then that subsequent visit data needs to replace the prior missing data. For example, participant A might have missed visit 1 but attended visits 2 through 10. I need the nonmissing data in vists 2-10 to concatenate left (or ‘shift’ to the left) so there are nonmissing data in vists 1-9 and one missing datum in visit 10. Of course, missing data could occur at any visit. All variable names have a suffix identifying visit number. Below is an example using numeric data (for example, positive or negative blood tests at each of ten visits); however, variables could be character or numeric. All help on this is appreciated - I wonder now if converting this wide data to long form is the first step...
Data Have
Obs x_v1 x_v2 x_v3 x_v4 x_v5 x_v6 x_v7 x_v8 x_v9 x_v10
1 . 1 0 1 0 1 0 1 0 1
2 0 . 1 0 0 . . 1 0 0
3 . 1 0 1 0 1 0 1 0 .
Data Want
Obs x_v1 x_v2 x_v3 x_v4 x_v5 x_v6 x_v7 x_v8 x_v9 x_v10
1 1 0 1 0 1 0 1 0 1 .
2 0 1 0 0 1 0 0 . . .
3 1 0 1 0 1 0 1 0 . .
data have;
input id x_v1 x_v2 x_v3 x_v4 x_v5 x_v6 x_v7 x_v8 x_v9 x_v10;
cards;
1 . 1 0 1 0 1 0 1 0 1
2 0 . 1 0 0 . . 1 0 0
3 . 1 0 1 0 1 0 1 0 .
;
data long(keep=id v);
set have;
array x(*) x_v1-x_v10;
do _n_=1 to dim(x);
v=x(_n_);
if not missing(v) then output;
end;
run;
data want(drop=v count);
retain x_v1-x_v10;
array x x_v1-x_v10;
set long;
by id notsorted;
if first.id then count=0;
count+1;
x(count)=v;
if last.id then do;
output;
call missing (of x(*));
end;
run;
proc print;run;
Linlin
data have;
input id x_v1 x_v2 x_v3 x_v4 x_v5 x_v6 x_v7 x_v8 x_v9 x_v10;
cards;
1 . 1 0 1 0 1 0 1 0 1
2 0 . 1 0 0 . . 1 0 0
3 . 1 0 1 0 1 0 1 0 .
;
data long(keep=id v);
set have;
array x(*) x_v1-x_v10;
do _n_=1 to dim(x);
v=x(_n_);
if not missing(v) then output;
end;
run;
data want(drop=v count);
retain x_v1-x_v10;
array x x_v1-x_v10;
set long;
by id notsorted;
if first.id then count=0;
count+1;
x(count)=v;
if last.id then do;
output;
call missing (of x(*));
end;
run;
proc print;run;
Linlin
Thanks Linlin! This worked quite well. Alan
Or you can try this:
data have;
input id x_v1 x_v2 x_v3 x_v4 x_v5 x_v6 x_v7 x_v8 x_v9 x_v10;
cards;
1 . 1 0 1 0 1 0 1 0 1
2 0 . 1 0 0 . . 1 0 0
3 . 1 0 1 0 1 0 1 0 .
;
data want (drop=_:);
set have;
length _t $50.;
array old x_v1-x_v10;
_t=catx(' ',of old(*));
_n=n(of old(*));
call missing(of old(*));
do _n_=1 to _n;
old(_n_)=scan(_t,_n_);
end;
run;
proc print;run;
Haikuo
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.