BookmarkSubscribeRSS Feed
Celina1
Calcite | Level 5

I am wondering what the best approach is for creating a data set that has only one row for each patient, but maintains the row for that patient with the most data. For example I started with this:

 

proc sql;
create table table_two
as select distinct  patientid condition diagnosis DateofBirth sex address city zip county from master_file
quit;
run;

 

I want one row for every patientid, but if one row is complete (or more complete) for all the variables and another is missing all the demographics for example, I would want to keep the row with the most information.

 

Any guidance on the best approach is greatly appreciated.

2 REPLIES 2
Reeza
Super User

Create a completeness measure for each row and then take the one with the highest level of completeness. 

This of course assumes you cannot 'fix' the data and fill in data that's missing from other rows to form a complete/more complete row instead.

 


@Celina1 wrote:

I am wondering what the best approach is for creating a data set that has only one row for each patient, but maintains the row for that patient with the most data. For example I started with this:

 

proc sql;
create table table_two
as select distinct  patientid condition diagnosis DateofBirth sex address city zip county from master_file
quit;
run;

 

I want one row for every patientid, but if one row is complete (or more complete) for all the variables and another is missing all the demographics for example, I would want to keep the row with the most information.

 

Any guidance on the best approach is greatly appreciated.


 

ballardw
Super User

Example data and a clear definition of "most data" or "more complete" would be extremely useful for a complete answer.

 

I can easily envision a data set with 100 variable and records with 95 or more of the variables with values but not complete because one or two specific variables are missing.

Or that have many variables with the code for "not actually recorded for some reason".

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 594 views
  • 0 likes
  • 3 in conversation