BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
moreilly
Calcite | Level 5

I am working with Medicare claims data and am trying to merge two datasets (TRIGGERS and CLAIMS).  TRIGGERS has the variables bene_id, claim_id and date.  In TRIGGERS there is one claim (observation) per bene_id.  CLAIMS has bene_id, claim_id, date and diagnostic codes. In CLAIMS, each bene_id may have multiple claims (multiple observations).  I want to create a new dataset, LOOKBACK, that keeps only the bene_ids that are in both TRIGGERS and CLAIMS and keeps multiple claims (observations) for every bene_id.  I expected the following code to work, but I seemed to lose the claim from TRIGGER.  Any suggestions?

 

 proc sort data = out.triggers; by bene_id; run;

 proc sort data = work.claims; by bene_id; run;

data lookback;
merge out.triggers (in=a) work.claims (in=b);
by bene_id; if a and b;
run;

1 ACCEPTED SOLUTION

Accepted Solutions
Astounding
PROC Star

Your final data set can only have one variable named CLAIM_ID.  If it appears in both data sets and you want both, this is one possible change to handle it:

 

merge out.triggers (in=a rename=(claim_id=trigger_claim_id)) work.claims (in=b);

View solution in original post

4 REPLIES 4
collinelliot
Barite | Level 11

What does the log say? If none of these match you should be getting zero obs due to the "if a and b;" Have you compared the values for bene_id in both data sets to make sure they have any overlap at all?

moreilly
Calcite | Level 5

There is overlap.  This is the log message:

 

NOTE: There were 711292 observations read from the data set TRIGGERS.
NOTE: There were 86214334 observations read from the data set CLAIMS.
NOTE: The data set WORK.LOOKBACK has 23448038 observations and 17 variables.

But, when I look at the data, the date from the TRIGGER observation has been lost.

Astounding
PROC Star

Your final data set can only have one variable named CLAIM_ID.  If it appears in both data sets and you want both, this is one possible change to handle it:

 

merge out.triggers (in=a rename=(claim_id=trigger_claim_id)) work.claims (in=b);

moreilly
Calcite | Level 5

That did it!  Thank you!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 1505 views
  • 0 likes
  • 3 in conversation