
Claims data: clean data first, or query out patients first?

Contributor
Posts: 22

Hi all, 

 

I've been given an algorithm to select patients who have a certain type of illness, based on a set of ICD-9 codes, inclusion procedure codes, and other criteria (age, region, etc.).

 

Generally with claims data (this is Truven) -- should I clean the entire set first, and then isolate my sample, or isolate my sample and then clean?

 

Thanks, 


All Replies
Super User
Posts: 20,252

Re: Claims data: clean data first, or query out patients first?


cdubs wrote:

Hi all, 

 

I've been given an algorithm to select patients who have a certain type of illness, based on a set of ICD-9 codes, inclusion procedure codes, and other criteria (age, region, etc.).

 

Generally with claims data (this is Truven) -- should I clean the entire set first, and then isolate my sample, or isolate my sample and then clean?

 

Thanks, 


It depends on your cleaning process. If the cleaning can change which records meet the selection criteria, then it needs to go first.
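
To illustrate how cleaning can affect selection, here is a minimal sketch in plain Python (not SAS, and not Truven's actual layout). The patient IDs, the inclusion list, and the `normalize_dx` helper are all made up for the example; the assumption is that diagnosis codes may be recorded with inconsistent formatting.

```python
def normalize_dx(code: str) -> str:
    """Strip whitespace and the decimal point, and uppercase the code."""
    return code.strip().replace(".", "").upper()


# Hypothetical inclusion list, stored without decimal points.
INCLUSION_DX = {"25000", "25001"}

# Hypothetical raw claims: (patient ID, diagnosis code as recorded).
claims = [
    ("P1", "25000"),
    ("P2", "250.00 "),  # same code as P1, formatted differently
    ("P3", "4019"),
]


def select_patients(rows, clean):
    """Return the set of patient IDs with a diagnosis in the inclusion list."""
    selected = set()
    for patient_id, dx in rows:
        key = normalize_dx(dx) if clean else dx
        if key in INCLUSION_DX:
            selected.add(patient_id)
    return selected


select_first = select_patients(claims, clean=False)
clean_first = select_patients(claims, clean=True)

print(sorted(select_first))  # ['P1']
print(sorted(clean_first))   # ['P1', 'P2']
```

In this toy case, selecting before cleaning silently drops P2, so that kind of cleaning has to happen before selection. Cleaning that only touches fields unrelated to the criteria could reasonably wait until after the sample is isolated.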

Solution
a week ago
Super User
Posts: 11,578

Re: Claims data: clean data first, or query out patients first?

I agree in general with @Reeza, but experience has taught me that if age is involved, you should always at least check it early in any process where it matters.

Records with a date of birth after the date a service was performed, or services that are inappropriate for the patient's age (not to mention gender), might be a concern.

 

You may also have to consider age at time of service vs. age at data extract, depending on your data systems. Many systems maintain demographics such as birth date separately from services, and may calculate an age based on the date of the extract for each record even though the services occurred on different dates.
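
As a rough illustration of these early age checks, here is a small sketch in plain Python rather than SAS. The field layout, the `age_on` and `check_claim` helpers, and all dates are hypothetical; it assumes each record carries a full birth date, a service date, and a stored age, which may not match your extract.

```python
from datetime import date


def age_on(dob: date, on: date) -> int:
    """Age in completed years on a given date."""
    had_birthday = (on.month, on.day) >= (dob.month, dob.day)
    return on.year - dob.year - (0 if had_birthday else 1)


def check_claim(dob: date, service_date: date, recorded_age: int) -> list:
    """Return a list of problems found for one claim record."""
    problems = []
    if dob > service_date:
        # Birth date falls after the service: the record is inconsistent.
        problems.append("dob_after_service")
    elif age_on(dob, service_date) != recorded_age:
        # Stored age is not the age at service (it may be age at extract).
        problems.append("age_not_at_service")
    return problems


EXTRACT_DATE = date(2015, 6, 30)  # hypothetical extract date
dob = date(1950, 3, 15)           # hypothetical birth date

ok = check_claim(dob, date(2010, 1, 10), recorded_age=59)
stale = check_claim(dob, date(2010, 1, 10), recorded_age=age_on(dob, EXTRACT_DATE))
impossible = check_claim(dob, date(1949, 12, 1), recorded_age=0)

print(ok)          # []
print(stale)       # ['age_not_at_service']
print(impossible)  # ['dob_after_service']
```

The second record shows the extract-date problem: the stored age is 65 (age at extract) while the patient was 59 at the time of service, which could flip an age-based inclusion criterion.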
