How to compare two datasets and exclude IDs that are not included in one of the file

Reply
Occasional Contributor
Posts: 5

How to compare two datasets and exclude IDs that are not included in one of the file

Hi. I am working on Medicare Claims data and want to ask a question about how to compare two datasets and exclude IDs that are not included in one of the file. There are some tricky components that I need to take into account, so I am so confused and don't know how to tackle this. Could anyone make any suggestions to approach this task or demonstrate codes to program this?

 

 

The details for my question are as follow: 


I have a master file that has a list of my study sample. My sample is a group of patients (var: bene_id)
who died of cancer in 2013-2014. This dataset has a variable of death date (var: deat_dt).

 

AND additionally

I have three datasets for each year 2012 (dataset name: Lst_pcs12), 2013 (Lst_pcs13), and 2014 (Lst_pcs14) and each dataset has the list of those who have received at least one primary care service by primary care physicians, along with
service date (service_dt) for each line of primary care service claim.


What I have to do is to identify and exclude from my master file anyone who does not have received primary care service by
PCP. I think I will need to apply a rule that everyone should have at least one primary care service during the last year (12 months)
prior to death. Could anyone help me out? What should I try? 

 

 

I am using SAS 9.3. I sincerely appreciate your help and attention to this inquiry! I am stuck after I created file for a subgroup so I apologize for not being able add my attempted code. Thank you so much!

 

 

 

Respected Advisor
Posts: 4,674

Re: How to compare two datasets and exclude IDs that are not included in one of the file

[ Edited ]

@hkim30

Append your 3 datasets into one. Keep only the last service date per patient.

Use a SQL inner join to combine your master with the service date DS and with a join condition along the line of below:

master.id=service.id and  0<=intck('month',service.dt_service,master.dt_death)<=12

 

Changed pseudo code as suggested by @mkeintz

Trusted Advisor
Posts: 1,312

Re: How to compare two datasets and exclude IDs that are not included in one of the file

@Patrick

 

I think you mean   "0<=intck ...",   not   "0>=intck ...".

Esteemed Advisor
Posts: 5,482

Re: How to compare two datasets and exclude IDs that are not included in one of the file

@Patrick's suggestion could be implemented as:

 

data lst_pcs;
set lst_pcs12 lst_pcs13 lst_pcs14;
run;

proc sql;
create table want as
select 
	a.*,
	b.*
from
	master as a inner join
	(	select * 
		from lst_pcs 
		group by bene_id 
		having service_dt = max(service_dt)) as b 
			on 	a.bene_id = b.bene_id and 
				intck("MONTH", b.service_dt, a.deat_dt) between 0 and 12;
quit;

(untested)

 

PG
Ask a Question
Discussion stats
  • 3 replies
  • 113 views
  • 0 likes
  • 4 in conversation