Hi,
I have a dataset that looks like the below:
id enc_type enc_date mi _date daysdiff bp_sys bp_dias ht wt bmi
3669 CL 6/8/2011 6/8/2011 0 126 60
3669 CL 6/15/2011 6/8/2011 7
3669 PR 6/17/2011 6/8/2011 9 136 58
3669 PR 6/23/2011 6/8/2011 15 112 58 175.26 195 28.78 -> want this
3695 IM 5/14/2014 5/14/2014 0 125 80
3695 IM 5/21/2014 5/14/2014 7 125 70 160.02 -> want this
3744 PU 1/8/2018 12/14/2017 25 114 60 165.1 174 -> want this
3744 IM 1/8/2018 12/14/2017 25
572 PU 6/8/2011 6/8/2011 0 86 195 208 32.7
572 CL 4/25/2013 3/26/2013 30 124 86 195 207 31.4 -> want this
574 PR 4/3/2013 6/9/2013 63 140 88 172.72 224.5 34.1
575 PR 4/8/2013 3/26/2013 13 158 80 213 32.3 -> want this
I'd like to pull the most complete data on BP, Wt, Ht, and BMI from encounter date that is closest within +/30 from MI_date. Daysdiff column is the duration in days between the two dates. If an encounter closest to MI_date has incomplete data then it needs to search for the next record till a record with complete data is found.
Your help is greatly appreciated!
Please post the data in usable form, have look at How to convert datasets to data steps if you don't know how to post data. Are "encounters" identified by "id" alone?
Since you already have the "daysdiff", you could create another variable "non_missing_values", the function N can help to fill it.
Please post the data in usable form, have look at How to convert datasets to data steps if you don't know how to post data. Are "encounters" identified by "id" alone?
Since you already have the "daysdiff", you could create another variable "non_missing_values", the function N can help to fill it.
I don't know how your desired result looks like, so I assume you want as many observations as in your input data with the desired ht, wt, bmi for each obs.
data have;
input id enc_type $ (enc_date mi_date)(:mmddyy10.) daysdiff 30-32 bp_sys 33-35 bp_dias 37-38 ht 40-45 wt 47-52 bmi 54-58;
format enc_date mi_date mmddyy10.;
datalines;
3669 CL 6/8/2011 6/8/2011 0 126 60
3669 CL 6/15/2011 6/8/2011 7
3669 PR 6/17/2011 6/8/2011 9 136 58
3669 PR 6/23/2011 6/8/2011 15 112 58 175.26 195 28.78
3695 IM 5/14/2014 5/14/2014 0 125 80
3695 IM 5/21/2014 5/14/2014 7 125 70 160.02
3744 PU 1/8/2018 12/14/2017 25 114 60 165.1 174
3744 IM 1/8/2018 12/14/2017 25
572 PU 6/8/2011 6/8/2011 0 86 195 208 32.7
572 CL 4/25/2013 3/26/2013 30 124 86 195 207 31.4
574 PR 4/3/2013 6/9/2013 63 140 88 172.72 224.5 34.1
575 PR 4/8/2013 3/26/2013 13 158 80 213 32.3
;
data want;
do _N_ = 1 by 1 until (last.id);
set have;
by id notsorted;
if abs(enc_date-mi_date) <= 30 then do;
if n(ht, wt, bmi) > n then do;
n = n(ht, wt, bmi);
_ht = ht;
_wt = wt;
_bmi = bmi;
end;
end;
end;
ht = _ht;
wt = _wt;
bmi = _bmi;
do _N_ = 1 to _N_;
set have;
output;
end;
run;
The program works great! Thank so much!
Sorry one more thing - How would you change in the program so that it outputs only the encounters that have the most data of ht, wt, and BMI?
Thanks!
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.