Hello,
I have this data set of weight and height measurements. Sometimes either the height or the weight is missing.
when on one the measurement is missing I want to pick the non missing measurement the closest date and that is in a 3 months window before or after the measurement date.
Thank you
data have;
infile datalines missover;
input id weight height vsdt;
format vsdt yymmdd10.;
datalines;
001 65 160 09MAR2017
001 65 . 06JUN2017
001 65 160 18JUL2017
002 55 158 13MAY2014
002 58 . 14SEP2014
002 58 158 15SEP2015
003 70 180 06AUG2017
003 70.5 01MAR2018
003 71 01FEB2019
run;
Try this one :
data have;
infile datalines missover;
input id weight height vsdt:date10.;
format vsdt yymmdd10.;
datalines;
001 65 160 09MAR2017
001 65 . 06JUN2017
001 65 160 18JUL2017
002 55 158 13MAY2014
002 58 . 14SEP2014
002 58 158 15SEP2015
003 70 180 06AUG2017
003 70.5 . 01MAR2018
003 71 . 01FEB2019
;
run;
proc sql;
select *,case when missing(height) then
(select min(height) from have where id=a.id and
vsdt between intnx('month',a.vsdt,-3,'s') and intnx('month',a.vsdt,3,'s') and
height is not missing
having abs(a.vsdt-vsdt) = min(abs(a.vsdt-vsdt))
)
else height end as new_height
from have as a;
quit;
why don't we retain
data have;
infile datalines missover;
input id weight height vsdt:date10.;
format vsdt yymmdd10.;
datalines;
001 65 160 09MAR2017
001 65 . 06JUN2017
001 65 160 18JUL2017
002 55 158 13MAY2014
002 58 . 14SEP2014
002 58 158 15SEP2015
003 70 180 06AUG2017
003 70.5 . 01MAR2018
003 71 . 01FEB2019
;
run;
proc sort data=have;
by id vsdt;
run;
data want;
set have;
by id vsdt;
retain height_;
if first.id then height_=.;
if height ne . then height_=height;
run;
The retain would not work because we want the closest measurement within a 3 week window so for this patient;
001 65 160 09MAR2017
001 65 . 06JUN2017
001 65 160 18JUL2017
for the record where height is missing (06JUN2017) we would want to pick the measurement on 18JUL2017 and not the one from 09MAR2017 because the 18JUL2017 is closer to 06JUNE2017 than 09MAR2017
Try this one :
data have;
infile datalines missover;
input id weight height vsdt:date10.;
format vsdt yymmdd10.;
datalines;
001 65 160 09MAR2017
001 65 . 06JUN2017
001 65 160 18JUL2017
002 55 158 13MAY2014
002 58 . 14SEP2014
002 58 158 15SEP2015
003 70 180 06AUG2017
003 70.5 . 01MAR2018
003 71 . 01FEB2019
;
run;
proc sql;
select *,case when missing(height) then
(select min(height) from have where id=a.id and
vsdt between intnx('month',a.vsdt,-3,'s') and intnx('month',a.vsdt,3,'s') and
height is not missing
having abs(a.vsdt-vsdt) = min(abs(a.vsdt-vsdt))
)
else height end as new_height
from have as a;
quit;
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.