SAS Data Integration Studio, DataFlux Data Management Studio, SAS/ACCESS, SAS Data Loader for Hadoop and others

Ambiguous ICD-9 (procedure & diagnosis) coding scheme in my database leads to double-matching?

Accepted Solution Solved
Reply
Contributor
Posts: 50
Accepted Solution

Ambiguous ICD-9 (procedure & diagnosis) coding scheme in my database leads to double-matching?

I'm using a database which contains ICD-9 codes but in a format without decimal places.  I'm trying to import a list of ICD-9 codes that does contain decimal places and match it up with the one that doesn't.  However I keep running into a problem of ambiguous dx codes in my database that could refer to two ICD-9s (one procedure and one diagnosis.  This is a well known and widely used database so I feel like I must be overlooking something obvious.


IN MY IMPORTED LIST: normal ICD-9 codes with appropriate leading zeros and decimal.

IN MY DATABASE: everything is left justified and decimals are implied at certain places in the string.  Leading zeros are inserted before the number to ensure that the decimal place is implied at the correct location.

However, the rules differ for ICD-9 procedure codes and ICD-9 diagnosis codes (and as a result a single code in my database could refer to a Dx AND a Proc)

  • ICD-9 procedure codes: the decimal is implied after the second digit
    • For example
      • 00.31 would be coded as 0031
      • 01.5 would be coded as 015
      • 01.50 would be coded as 0150
      • 15.5 would be coded as 155
  • ICD-9 diagnosis codes: the decimal is implied after the third digit
    • For example
      • 014 would be coded as 014
      • 014.1 would be coded as 0141
      • 014.11 would be coded as 01411
      • 012.0 would be coded as 0120
      • 001.30 would be coded as 00130
      • 140 would be coded as 140

As far as I can tell, this still leaves ambiguity since ICD procedure and diagnosis codes can take on the same value in the database. Table 1 (pasted below and also attached) breaks down the many ways “14” can be coded in the data.  A number of them result in ambiguous codes (outlined in RED) in my database.

Since not all of these “14” codes actually exist, I provided examples of violations in the far right column.

Table 1. Ambiguous code combinations

PROC vs. DX

ICD-9 CODE

MarketScan code

Examples of ICD-9 codes actually in use

What happens if we just drop the decimal in SAS?

PROC

  1. 00.14

0014

0014

PROC

  1. 01.4

014

  1. 07.5

014

DX

014

014

075

014

PROC

  1. 01.40

0140

  1. 01.20

0140

DX

  1. 014.0

0140

  1. 012.0

0140

DX

  1. 001.4

0014

0014

DX

  1. 001.40

00140

00140

DX

  1. 014.00

01400

01400

PROC

14

14

14

PROC

  1. 14.0

140

  1. 14.0

140

DX

140

140

140

140

PROC

  1. 14.00

1400

  1. 14.00

1400

DX

  1. 140.0

1400

  1. 140.0

1400

DX

  1. 140.00

14000

14000

Thank you for any help/advice/guidance in advance.  Don't know what I'd do with out you guys...


Accepted Solutions
Solution
‎11-09-2013 06:19 PM
Super User
Super User
Posts: 7,074

Re: Ambiguous ICD-9 (procedure & diagnosis) coding scheme in my database leads to double-matching?

I do not understand why you are having trouble.

Normally your patient data would have had DX codes and PROC codes collected into different variables.

Normally your ICD9 look-up table would separate tables for DX and PROC codes (or at least another variable to indicate which type it is).

Is it possible that someone has stored the values without periods into a numeric variable instead of a character variable? (perhaps by accidentally running it through Excel instead of a real database)  If so that will mess up any code that starts with zero.

View solution in original post


All Replies
Contributor
Posts: 50

Re: Ambiguous ICD-9 (procedure & diagnosis) coding scheme in my database leads to double-matching?

JUST NOTICED THAT THE TABLE PASTED WRONG INTO THE SAS DISCUSSION FORUM.  PLEASE REFER TO ATTACHED PDF INSTEAD!

THANKS AGAIN.

Respected Advisor
Posts: 4,173

Re: Ambiguous ICD-9 (procedure & diagnosis) coding scheme in my database leads to double-matching?

I don't have any subject matter expertise here but on a general level from what you describe and show us: Yes, removing the dot leads to ambiguity so in order to match the two data sources you need an additional field giving you the information whether a code is "DX" or "PROC".

Solution
‎11-09-2013 06:19 PM
Super User
Super User
Posts: 7,074

Re: Ambiguous ICD-9 (procedure & diagnosis) coding scheme in my database leads to double-matching?

I do not understand why you are having trouble.

Normally your patient data would have had DX codes and PROC codes collected into different variables.

Normally your ICD9 look-up table would separate tables for DX and PROC codes (or at least another variable to indicate which type it is).

Is it possible that someone has stored the values without periods into a numeric variable instead of a character variable? (perhaps by accidentally running it through Excel instead of a real database)  If so that will mess up any code that starts with zero.

Contributor
Posts: 50

Re: Ambiguous ICD-9 (procedure & diagnosis) coding scheme in my database leads to double-matching?

I do have separate fields for procedures that are identified using CPT codes.  However, ICD-9 consists of both ICD-9 diagnosis codes and ICD-9 procedure codes.  So as a result, the "diagnosis" field in my data can actually contain codes that refer to procedures, not diagnoses.  I see what you mean about Excel but the coding scheme is designed by the compilers of the data set, not the result of accidentally changing the formatting. 

Sorry for the time you've all spent thinking about my problem.  I'll get to the bottom of this in the next few days and will be sure to post what I learn on this discussion thread.

Occasional Contributor
Posts: 17

Re: Ambiguous ICD-9 (procedure & diagnosis) coding scheme in my database leads to double-matching?

Hi

I'm curious about your question since it would appear that we have all of the components that we need to accurately record each of the values.  We have a field which tells us whether this is a diagnosis or procedure, we have the rule indicating whether or not we should position the decimal after the third or second character.

I assume you are trying to match the icd-9 column with the market scan value?  why not drop the decimal in your transform then use the positional logic based on proc vs. dx to recode the variable?

If i missed something critical i apologize.

Happy to work up an example if that is really what you need.

-greg

Contributor
Posts: 50

Re: Ambiguous ICD-9 (procedure & diagnosis) coding scheme in my database leads to double-matching?

Posted in reply to gregorysnelson

Thanks for your response.  You surmise correctly. Each claim I have refers to a single procedure/service with a number of associated diagnosis codes attached.  As far as I know, I do not have a field which tells me whether any specific diagnosis field (ICD-9) is describing a procedure or a diagnosis and so do not have the information to determine when to apply my two digit decimal rule vs. my three digit decimal rule.

Occasional Contributor
Posts: 17

Re: Ambiguous ICD-9 (procedure & diagnosis) coding scheme in my database leads to double-matching?

Got it, i saw your table with the first column and thot that was the field you could use.  it would be pretty tough without knowing whether it was a dx or not.  In the market scan data, we usually have dx1 -- dx15 and similar fields for procs. I would go back to the data owner and see if there is another option to understanding logic that you should use as others have suggested.

-greg

Super User
Super User
Posts: 7,074

Re: Ambiguous ICD-9 (procedure & diagnosis) coding scheme in my database leads to double-matching?

If you are really getting data from MarketScan then the DX fields should have diagnosis codes and the PROC fields will have procedure codes.

Contributor
Posts: 50

Re: Ambiguous ICD-9 (procedure & diagnosis) coding scheme in my database leads to double-matching?

My apologies to all.  I have finally figured this out and realize many of you were giving me the right answer all along but I didn't know enough to recognize it.  As was pointed out (repeatedly) procedure codes go in the procedure field, diagnosis codes in the diagnosis field.  I assumed all ICD-9 codes appeared in the same field.  Rather, my diagnosis field can only hold ICD-9 diagnosis codes.  My procedure field can contain ICD-9 procedure codes, CPT codes, or HCPCS codes.  I have an additional variable which refers to which type of code is being used in the procedure field.  Because CPT and HCPCS codes always have 5 digits and ICD-9 procedure codes always have four, the additional indicator variable isn't even required to differentiate between the two.

Thanks to everyone who was willing to bear with me on this one.  You guys are amazing.

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 9 replies
  • 3361 views
  • 6 likes
  • 4 in conversation