I am wondering what the best approach is for creating a data set that has only one row per patient, keeping the row with the most data for that patient. For example, I started with this:

proc sql;
    create table table_two as
    select distinct patientid, condition, diagnosis, DateofBirth, sex,
           address, city, zip, county
    from master_file;
quit;

I want one row for every patientid, but if one row is complete (or more complete) across the variables and another is missing, for example, all the demographics, I want to keep the row with the most information. Any guidance on the best approach is greatly appreciated.
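To make the goal concrete: SELECT DISTINCT only collapses rows that are identical in every selected column, so it cannot choose between a complete and an incomplete row for the same patient. One possible way to do that is sketched below, assuming "most data" means the fewest missing values among the listed variables. The intermediate data set `scored` and the variable `n_missing` are hypothetical names introduced for illustration.

```sas
/* Count missing values per row. CMISS accepts both character
   and numeric arguments, so mixed variable types are fine. */
data scored;
    set master_file;
    n_missing = cmiss(condition, diagnosis, DateofBirth, sex,
                      address, city, zip, county);
run;

/* Within each patient, put the most complete row first. */
proc sort data=scored;
    by patientid n_missing;
run;

/* Keep only the first (most complete) row per patient. */
data table_two;
    set scored;
    by patientid;
    if first.patientid;
    drop n_missing;
run;
```

When two rows for a patient have the same number of missing values, this keeps whichever one sorts first. It also keeps a single existing row as-is and does not combine values from different rows of the same patient.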