Merging Issue

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 7
Accepted Solution

Merging Issue

I am working with Medicare claims data and am trying to merge two datasets (TRIGGERS and CLAIMS).  TRIGGERS has the variables bene_id, claim_id and date.  In TRIGGERS there is one claim (observation) per bene_id.  CLAIMS has bene_id, claim_id, date and diagnostic codes. In CLAIMS, each bene_id may have multiple claims (multiple observations).  I want to create a new dataset, LOOKBACK, that keeps only the bene_ids that are in both TRIGGERS and CLAIMS and keeps multiple claims (observations) for every bene_id.  I expected the following code to work, but I seemed to lose the claim from TRIGGER.  Any suggestions?

 

 proc sort data = out.triggers; by bene_id; run;

 proc sort data = work.claims; by bene_id; run;

data lookback;
merge out.triggers (in=a) work.claims (in=b);
by bene_id; if a and b;
run;


Accepted Solutions
Solution
‎03-29-2017 02:13 PM
Respected Advisor
Posts: 4,998

Re: Merging Issue

Your final data set can only have one variable named CLAIM_ID.  If it appears in both data sets and you want both, this is one possible change to handle it:

 

merge out.triggers (in=a rename=(claim_id=trigger_claim_id)) work.claims (in=b);

View solution in original post


All Replies
Super Contributor
Posts: 284

Re: Merging Issue

What does the log say? If none of these match you should be getting zero obs due to the "if a and b;" Have you compared the values for bene_id in both data sets to make sure they have any overlap at all?

Occasional Contributor
Posts: 7

Re: Merging Issue

There is overlap.  This is the log message:

 

NOTE: There were 711292 observations read from the data set TRIGGERS.
NOTE: There were 86214334 observations read from the data set CLAIMS.
NOTE: The data set WORK.LOOKBACK has 23448038 observations and 17 variables.

But, when I look at the data, the date from the TRIGGER observation has been lost.

Solution
‎03-29-2017 02:13 PM
Respected Advisor
Posts: 4,998

Re: Merging Issue

Your final data set can only have one variable named CLAIM_ID.  If it appears in both data sets and you want both, this is one possible change to handle it:

 

merge out.triggers (in=a rename=(claim_id=trigger_claim_id)) work.claims (in=b);

Occasional Contributor
Posts: 7

Re: Merging Issue

That did it!  Thank you!

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 4 replies
  • 147 views
  • 0 likes
  • 3 in conversation