How to find the best unique one-to-one matches from a many-to-many source?

Accepted Solution Solved
Reply
New Contributor
Posts: 3
Accepted Solution

How to find the best unique one-to-one matches from a many-to-many source?

I’ve got two datasets, one identifying patients and dates they visited a doctor’s office and the second identifying the same patients and dates they picked up certain types of prescriptions.  What I need to do is turn this many-to-many relationship between the two tables into a third table identifying the best unique one-to-one matches between the data.  My criteria for the results is actually pretty simple:  a doctor’s visit can only map to a single prescription, neither a visit or a prescription may be reused once they have been matched, and the prescription must have been filled on or within seven days after the date of the visit to the doctor’s office.  So far I’ve been unable to even come really close to doing this in SAS and was hoping someone here might be able to help.  Right now I’m exporting my datasets to SQL Server and using a piece of VB.NET code I wrote to do the matching, but I’d really like to find a way to keep this in SAS if possible as there is some more work I’d like to do on the data.  

 

As an example, my doctor office visit data might look like:

Visit Row ID	Patient ID	Visit Date
1	xxxx0001	10/15/2015
2	xxxx0001	11/1/2015
3	xxxx0001	11/24/2015
4	xxxx0002	9/12/2015
5	xxxx0002	9/15/2015
6	xxxx0002	12/28/2015
7	xxxx0002	1/18/2016
8	xxxx0003	10/10/2015
9	xxxx0003	10/10/2015
10	xxxx0003	10/11/2015
11	xxxx0003	11/2/2015

My prescription data might be:

Rx Row ID	Patient ID	Rx Date
1	xxxx0001	10/14/2015
2	xxxx0001	10/15/2015
3	xxxx0001	11/4/2015
4	xxxx0001	12/3/2105
5	xxxx0002	9/13/2015
6	xxxx0002	9/13/2015
7	xxxx0002	9/15/2105
8	xxxx0002	9/20/2015
9	xxxx0002	12/3/2015
10	xxxx0002	1/20/2016
11	xxxx0003	10/10/2015
12	xxxx0003	10/11/2015
13	xxxx0003	10/15/2015
14	xxxx0003	11/3/2015

And what I'd like to get out is:

Visit Row ID	Rx Row ID
1	2
2	3
4	5
5	7
7	10
8	11
9	13
10	12
11	14

Thank you all very much in advance, any help would be greatly appreciated.

 

Chris Smith


Accepted Solutions
Solution
‎02-19-2016 09:48 AM
Super User
Posts: 10,028

Re: How to find the best unique one-to-one matches from a many-to-many source?

Posted in reply to Chris_Smith73
data visit;
	infile cards expandtabs;
	input (Visit_ID Patient_ID) (:$20.)	Visit_Date :mmddyy10.;
	format visit_date mmddyy10.;
	cards;
1	xxxx0001	10/15/2015
2	xxxx0001	11/1/2015
3	xxxx0001	11/24/2015
4	xxxx0002	9/12/2015
5	xxxx0002	9/15/2015
6	xxxx0002	12/28/2015
7	xxxx0002	1/18/2016
8	xxxx0003	10/10/2015
9	xxxx0003	10/10/2015
10	xxxx0003	10/11/2015
11	xxxx0003	11/2/2015
;

data Rx;
	infile cards expandtabs;
	input (Rx_ID Patient_ID) (:$20.)	Rx_Date :mmddyy10.;
	format Rx_date mmddyy10.;
	cards;
1	xxxx0001	10/14/2015
2	xxxx0001	10/15/2015
3	xxxx0001	11/4/2015
4	xxxx0001	12/3/2015
5	xxxx0002	9/13/2015
6	xxxx0002	9/13/2015
7	xxxx0002	9/15/2015
8	xxxx0002	9/20/2015
9	xxxx0002	12/3/2015
10	xxxx0002	1/20/2016
11	xxxx0003	10/10/2015
12	xxxx0003	10/11/2015
13	xxxx0003	10/15/2015
14	xxxx0003	11/3/2015
;
run;
data want;
 if _n_ eq 1 then do;
  if 0 then set Rx;
  declare hash h(dataset:'Rx',multidata:'y');
  h.definekey('Patient_ID','Rx_Date');
  h.definedata('Rx_ID');
  h.definedone();
 end;
set visit;
do i=Visit_Date to Visit_Date+7;
 if h.find(key:Patient_ID,key:i)=0 then do;
  h.removedup(key:Patient_ID,key:i);output;leave;
 end;
end;
keep Visit_ID Rx_ID;
run;

View solution in original post


All Replies
Super User
Posts: 5,429

Re: How to find the best unique one-to-one matches from a many-to-many source?

Posted in reply to Chris_Smith73
Here's a skeleton (untested - written om ny cell phone):
Proc sql;
Create table want as
Select patient_row_ID, visit_row_Id, rx_row_id
From visit, prescription
Where patient.patient_row_id = visit.patient_row_id and rx_date between visit_date and visit_date+7
Group by patient_row_Id, visit_row_id
Having min (prescription_date) = prescription_date
;
Quit;
Data never sleeps
Respected Advisor
Posts: 3,156

Re: How to find the best unique one-to-one matches from a many-to-many source?

[ Edited ]
Posted in reply to Chris_Smith73

This will be typical implementation for Hash Object.

 

data visit;
	infile cards dlm='09'x;
	input (Visit_ID Patient_ID) (:$20.)	Visit_Date :mmddyy10.;
	format visit_date mmddyy10.;
	cards;
1	xxxx0001	10/15/2015
2	xxxx0001	11/1/2015
3	xxxx0001	11/24/2015
4	xxxx0002	9/12/2015
5	xxxx0002	9/15/2015
6	xxxx0002	12/28/2015
7	xxxx0002	1/18/2016
8	xxxx0003	10/10/2015
9	xxxx0003	10/10/2015
10	xxxx0003	10/11/2015
11	xxxx0003	11/2/2015
;

data Rx;
	infile cards dlm='09'x;
	input (Rx_ID Patient_ID) (:$20.)	Rx_Date :mmddyy10.;
	format Rx_date mmddyy10.;
	cards;
1	xxxx0001	10/14/2015
2	xxxx0001	10/15/2015
3	xxxx0001	11/4/2015
4	xxxx0001	12/3/2015
5	xxxx0002	9/13/2015
6	xxxx0002	9/13/2015
7	xxxx0002	9/15/2015
8	xxxx0002	9/20/2015
9	xxxx0002	12/3/2015
10	xxxx0002	1/20/2016
11	xxxx0003	10/10/2015
12	xxxx0003	10/11/2015
13	xxxx0003	10/15/2015
14	xxxx0003	11/3/2015
;

data want;
	if _n_=1 then
		do;
			if 0 then
				set visit rx;
			declare hash rx(dataset:'rx', multidata:'y');
			rx.definekey('patient_id');
	rx.definedata(all:
			'y');
			rx.definedone();
		end;

	set visit;
 /*this loop is to go through patient_id matched items in the Hash and select one that fits the best*/
	if rx.find()=0 then
		do;
			do rc=0 by 0 while (rc=0);
				if visit_date <= rx_date <= visit_date+7 and rx_date > _rx_date then
					do;
						_rx_id=rx_id;
						_rx_date=rx_date;
					end;

				rc=rx.find_next();
			end;
		end;

	rc=rx.find();
/*second loop to remove those selected items from Hash*/
	do rc=0 by 0 while (rc=0);
		if _rx_id=rx_id then
			do;
				rc=rx.removedup(); /*Here is to remove*/
				leave;
			end;

		rc=rx.find_next();
	end;

	rx_id=_rx_id; rx_date=_rx_date;

	if not missing(rx_id);
	drop rc _:;
run;

BTW, your raw data has some typos and your outcome table is a little off (such as some pairs are not even from the same patient).

 

Solution
‎02-19-2016 09:48 AM
Super User
Posts: 10,028

Re: How to find the best unique one-to-one matches from a many-to-many source?

Posted in reply to Chris_Smith73
data visit;
	infile cards expandtabs;
	input (Visit_ID Patient_ID) (:$20.)	Visit_Date :mmddyy10.;
	format visit_date mmddyy10.;
	cards;
1	xxxx0001	10/15/2015
2	xxxx0001	11/1/2015
3	xxxx0001	11/24/2015
4	xxxx0002	9/12/2015
5	xxxx0002	9/15/2015
6	xxxx0002	12/28/2015
7	xxxx0002	1/18/2016
8	xxxx0003	10/10/2015
9	xxxx0003	10/10/2015
10	xxxx0003	10/11/2015
11	xxxx0003	11/2/2015
;

data Rx;
	infile cards expandtabs;
	input (Rx_ID Patient_ID) (:$20.)	Rx_Date :mmddyy10.;
	format Rx_date mmddyy10.;
	cards;
1	xxxx0001	10/14/2015
2	xxxx0001	10/15/2015
3	xxxx0001	11/4/2015
4	xxxx0001	12/3/2015
5	xxxx0002	9/13/2015
6	xxxx0002	9/13/2015
7	xxxx0002	9/15/2015
8	xxxx0002	9/20/2015
9	xxxx0002	12/3/2015
10	xxxx0002	1/20/2016
11	xxxx0003	10/10/2015
12	xxxx0003	10/11/2015
13	xxxx0003	10/15/2015
14	xxxx0003	11/3/2015
;
run;
data want;
 if _n_ eq 1 then do;
  if 0 then set Rx;
  declare hash h(dataset:'Rx',multidata&colon;'y');
  h.definekey('Patient_ID','Rx_Date');
  h.definedata('Rx_ID');
  h.definedone();
 end;
set visit;
do i=Visit_Date to Visit_Date+7;
 if h.find(key:Patient_ID,key:i)=0 then do;
  h.removedup(key:Patient_ID,key:i);output;leave;
 end;
end;
keep Visit_ID Rx_ID;
run;
New Contributor
Posts: 3

Re: How to find the best unique one-to-one matches from a many-to-many source?

Thank you Ksharp, that is exactly what I was looking for and got the matches I expected.

Super User
Posts: 11,343

Re: How to find the best unique one-to-one matches from a many-to-many source?

Posted in reply to Chris_Smith73

This is in the nature of devil's advocate coding review.

Does EVERY visit to the doctor generate a presecription?

 

Reason I ask:

Suppose you have visit data

Visit Row ID	Patient ID	Visit Date
1	xxxx0001	10/15/2015
2	xxxx0001	10/16/2015

And prescription data

Rx Row ID	Patient ID	Rx Date
2	xxxx0001	10/16/2015
3	xxxx0001	11/4/2015

Which record would be the match for the prescription on 10/16/2015?

 

I personally have been in a situation where I was seeing doctors nearly daily for a period, not all visits generating prescriptions, and a date-only comparison would have resulted in this type of data.

 

And could there be multiple doctor visits on the same date?

Or multiple prescriptions picked up on the same date?

 

Just something to think about that may need clarification.

New Contributor
Posts: 3

Re: How to find the best unique one-to-one matches from a many-to-many source?

The short answer is no, not every visit results in a prescrption.  In this project I am only examining certain types of doctor visits and specific medications.  There is in fact, with the data I have on hand, no way that we can say "Patient Z received prescription y from visit x".  What we are trying to do is notify physicians about potential high risk patients, and this piece of the process is only one of three flagging criteria we check for.  An example of what we are really attempting to tell our providers with this particular data is "Patient Z was seen at the ER 4 times in the last 6 months and each of those visits is followed closely by the patient filling a prescription for a controlled medication.  You may want to keep an eye on/chat with this patient".

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 6 replies
  • 482 views
  • 2 likes
  • 5 in conversation