BookmarkSubscribeRSS Feed
Wolverine
Pyrite | Level 9

I'm working with medical claims data, and I'm trying to identify records based on a specific list of procedure and revenue codes.  That specific list is contained in an Excel file.  Matches can be made on procedure OR revenue codes.

 

When I first ran the syntax, I got 14,581 matches.  Then I realized that the revenue codes in the Excel file were listed as 045x rather than separate entries for 0451, 0452, etc (revenue codes are listed as 0451, 0452, etc in the claims data).  As a result, matches were being made on procedure codes only.  So I made expanded the list of revenue codes in the Excel file and ran the syntax again, expecting to add at least a few records on revenue codes that weren't matching on procedure codes. 

 

In fact, I only got 9230 matches.  Everything else in the syntax is identical.  The MHI_Table2 file was imported from Excel, and it contains exactly the information it should.  So it doesn't look like the importation from Excel is the problem.  I don't understand how I could get fewer matches after I increased the number of potential matching codes.  This is the syntax I'm using to match: 

PROC SQL;
	Create table temp.IMG_MHI_qual_ED_match1
		as Select *
	From temp.IMG_MHI_qual_DX_flag_recs_T2
	Where PRCDR_CODE in (select proc_code_MHI_Table2 from temp.MHI_Table2)
		or REVENUE_CODE in (select rev_code_MHI_Table2 from temp.MHI_Table2);
QUIT;

 I'm sure you'll want additional portions of syntax or the files themselves, but I'm not sure what exactly to provide.

1 REPLY 1
ballardw
Super User

First question: How are you bringing the data from Excel into SAS?

 

Assuming your answer to that question is either "Proc Import" or an import wizard then expect continuing problems from varying variable types, lengths and possibly even variable names.

 

Both of those methods invoke algorithms to guess the content of a file and often very few records of the file. So if the first 20 rows of data for a variable only contain 4 characters then that could be the length set for a field and the remainder of the data, i.e. below the 20 rows, is truncated to that.

If this is to be an ongoing process with multiple files of the same layout then you might want to address the reading the data issue now before you start writing lots of fixit code. In the long run, saving data that starts as Excel to CSV and then using  DATA step code to read a consistent layout is much more consistent and repeatable.

 

It isn't clear exactly what change you mad by "expanded the list of revenue codes" or even which table in the syntax you show would represent that file.

Best would be to provide two small input sets, indicating which one is similar to which in the syntax you show (we are not going to have your libraries mapped for instance) as data step code showing just enough variables and values to demonstrate what you have and then an example of the desired result using those two input sets.

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 1 reply
  • 1044 views
  • 0 likes
  • 2 in conversation