I have patient encounter data where each observation is a visit from a patient, so 1 patient will have multiple lines. Some of the data is randomly missing. For example, dob might be missing at any one of the visits but is recorded at other visits. I have picked out the date of interest for each patient but because of the randomly missing data, I get missing values. How can I obtain the data (such as dob) from other visits? I am thinking use the coalesce function but not sure how to write that syntax. Do I need to turn it into wide data?
data pt_data;
input uniq_ID_B $ DOB:MMDDYY10. Visit_Date:MMDDYY10. Visit_Reason_Code $ Other_Demographic_Data $ ;
format DOB MMDDYY10. Visit_Date MMDDYY10.;
datalines;
2996 . 9/24/2016 6 xzxz
2996 9/25/1990 11/15/2016 6 xzxz
2996 9/25/1990 12/5/2016 4 xzxz
2996 . 12/15/2016 4 xzxz
5299 . 12/1/2016 2 xzxz
5299 . 12/27/2016 5 xzxz
5299 12/1/1984 2/21/2017 5 xzxz
5299 12/1/1984 3/14/2017 1 xzxz
5299 . 4/26/2017 6 xzxz
5299 12/1/1984 5/21/2017 5 xzxz
4027 6/3/1990 6/2/2021 1 xzxz
4027 . 7/5/2021 4 xzxz
4027 . 7/12/2021 1 xzxz
4027 8/3/1990 7/5/2021 4 xzxz
;
Patient 4027 has two different birth dates. What do you want to happen if a patient has multiple birth dates?
You could use logic like this to get patient birth dates:
proc sql;
create table want as
select uniq_ID_B
,max(DOB) as DOB_Max format = mmddyy10.
from pt_data
where not missing(DOB)
group by uniq_ID_B;
quit;
Patient 4027 has two different birth dates. What do you want to happen if a patient has multiple birth dates?
You could use logic like this to get patient birth dates:
proc sql;
create table want as
select uniq_ID_B
,max(DOB) as DOB_Max format = mmddyy10.
from pt_data
where not missing(DOB)
group by uniq_ID_B;
quit;
That works! It would never be the case where 1 patient has 2 birthdays. That was a typo. Of course, I can't post real patients' data here!
@axescot78 wrote:
That works! It would never be the case where 1 patient has 2 birthdays. That was a typo. Of course, I can't post real patients' data here!
Don't bet on that.
I have some medical testing data that I check for and routinely see multiple values for birth date (for tests on the same day some times), race, ethnicity and gender. Not to mention medical record systems that will report a birth date of 1 Jan 1900 when an actual date isn't entered.
I have seen entries where it was pretty obvious that two billings were combined for different patients on the same day but the tests were recorded with the same patient identification.
The project that I see this stuff in is not even that large, about 5,000 records a year.
Save $250 on SAS Innovate and get a free advance copy of the new SAS For Dummies book! Use the code "SASforDummies" to register. Don't miss out, May 6-9, in Orlando, Florida.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.