RoseRahman
Calcite | Level 5

I have ICD9 codes stored as character variables. I would like to create new numeric variables based on the values in the ICD code variables. I do not want to type out every code there is, and I don't know them all.

ICD1 codes = 462, 464, 465, 466, 480 thru 488; 507.0; 997.31 >> Respiratory = 1, else = 0;

ICD1 codes = 599.0 thru 599.5 >> Infectious = 1; else = 0;

ICD1 codes = 787.20 thru 787.29 >> Dysphagia = 1

ICD1 codes = 308 and 310 >> Agitation = 1

ICD1 codes = 707.00 thru 707.9 >> SkinUlcer = 1

Thanks!!

10 REPLIES
Astounding
PROC Star

In a DATA step, try:

icd1_numeric = input(icd1, 8.);

At least for ICD9 codes, there are some that contain letters.  This won't work unless the incoming value is purely numeric.

Good luck.

RoseRahman
Calcite | Level 5

They do contain letters. In fact, I have 20 ICD code variables. So I can't change the data type to numeric as you suggested.

RW9
Diamond | Level 26

I would suggest going to the ICD web page and downloading the latest code list. It is provided in txt and Excel formats, so you can easily import it and then use it as the lookup list to merge against:

ICD-9-CM Diagnosis and Procedure Codes: Abbreviated and Full Code Titles - Centers for Medicare & Me...

As for your categorisation, 462 = respiratory, I do not know how you come to that conclusion, as there doesn't appear to be that classification in ICD? 

RoseRahman
Calcite | Level 5

I don't quite understand what you are suggesting. I pulled the xls file, but the codes didn't look like anything I have (no decimals). My problem is not finding the ICD codes for the categories; it is listing codes that fall in a sequence without having to type them all (because I couldn't possibly know each one of them). I wanted to see if I can do something like this, which works for numeric variables:

     if ICD1 in (462, 464, 465, 466, 480-488, 507.0, 997.31) then Respiratory = 1; else Respiratory = 0;

Any ideas how to deal with character variables?

BecomingSASsy
Calcite | Level 5

I'm not 100% clear on what you're trying to do, but I'm assuming you've got a file with a code and its value (description), and you want to assign your own value based on some kind of logic.

I would suggest importing your original list into a dataset, then using proc format to read from it and applying your logic to assign whatever values you want.

I know this isn't a very elaborate answer, but it's all I can suggest based on my understanding.

Good luck.

Astounding
PROC Star

First a word of warning ... if you don't know all the values you are looking for, you can never be sure that any solution is correct.

There are ways to abbreviate character searches.  Consider this:

if icd1 =: '599.' then infectious=1; else infectious=0;

For any icd1 that begins with the characters "599.", the statement assigns 1.  That would include other values that perhaps should not be assigned in that way, such as "599.9" and "599.V".  So there are risks.

You can use this technique with a list of values:

if icd1 in: ('466.', '488.') then respiratory=1; else respiratory=0;

Of course, you'll have to expand the list, but you don't need to know all the "488." series of values. 

Finally, when using :, make sure you code the decimal point.  If you were to code without a decimal point you are taking risks:

if icd1 in: ('466', '488') then ...

This would also give you a match for 4-digit codes that begin with 466 or 488, such as "4662.1".  I'm not saying that this is a valid code (I don't really know), but you never know what will actually appear in your data.
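The exact-match and prefix-match ideas above can be combined. The following is only a rough sketch: the dataset names `have` and `flagged` are assumptions, the code lists come from the original question and are deliberately partial, and it assumes icd1 is long enough to hold the longest code (otherwise the colon comparison truncates the literal instead).

```sas
data flagged;
    set have;   /* "have" is a hypothetical input dataset with character variable icd1 */

    /* 599.0 thru 599.5: one prefix per fourth digit, so "599.9" is not flagged */
    infectious = 0;
    if icd1 in: ('599.0', '599.1', '599.2', '599.3', '599.4', '599.5')
        then infectious = 1;

    /* Partial list only: exact match for the bare 3-digit code, plus a   */
    /* prefix match that includes the decimal point for its subcodes, so  */
    /* a value such as "4662.1" is not flagged                            */
    respiratory = 0;
    if icd1 in ('466', '488') or icd1 in: ('466.', '488.')
        then respiratory = 1;
run;
```

Setting each flag to 0 first and only ever switching it to 1 keeps the logic easy to extend to more code lists.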

Good luck.

RoseRahman
Calcite | Level 5

I think you understood what I'm trying to do. Yes, I do not know all the exact values down to 2 decimals. But I'll give your code a try. Thank you.

ballardw
Super User

If your source document has "like" codes together Proc format may be a way to go. But without seeing a source it is a bit difficult to make specific suggestions.

Did someone provide a document of which codes get which assignment? If you could share that, we might have a few ideas.
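As an illustration of the PROC FORMAT idea, here is a minimal sketch under stated assumptions. The format name `$icdgrp`, the dataset names `have` and `flagged`, and the group labels are all hypothetical, and the ranges are copied from the original question. Character ranges are compared as text, character by character, so the boundaries need checking against the codes that actually occur in the data.

```sas
proc format;
    /* $icdgrp is a hypothetical format name; ranges are compared as text */
    value $icdgrp
        '599.0'  -< '599.6'  = 'Infectious'   /* 599.0 up to, not including, 599.6 */
        '787.20' -  '787.29' = 'Dysphagia'
        other                = 'Other';
run;

data flagged;
    set have;   /* "have" is a hypothetical input dataset with character variable icd1 */
    infectious = (put(icd1, $icdgrp.) = 'Infectious');   /* 1 if true, 0 if false */
    dysphagia  = (put(icd1, $icdgrp.) = 'Dysphagia');
run;
```

The advantage of this layout is that the code lists live in one place and can be changed without touching the DATA step.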

RoseRahman
Calcite | Level 5

They are stated in my initial posting. The only difference is that I have ICD1 through ICD20, so I need to recode all of these ICD variables as below. The range values are the ones I have problems with, since character values don't work that way.

array _icdvar ICD1--ICD20;

     do over _icdvar;

          if _icdvar in ('462', '464', '465', '466', '480--488', '507.0', '997.31') then Respiratory = 1; else Respiratory = 0;

          if _icdvar in ('599.0--599.5') then Infectious = 1; else Infectious = 0;

          if _icdvar in ('787.20'--'787.29') then Dysphagia = 1; else Dysphagia = 0;

          if _icdvar in ('308', '310') then Agitation = 1; else Agitation = 0;

          if _icdvar in ('707.00'--'707.9') then SkinUlcer = 1; else SkinUlcer = 0;

     end;

RW9
Diamond | Level 26

Just convert the character value to a number, then:

array _icdvar ICD1--ICD20;

     do over _icdvar;

          if input(_icdvar,best.) in (462,464,465,466,507,997.31) or (480 <= input(_icdvar,best.) <= 488) then Respiratory = 1; else Respiratory = 0;

...

     end;
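A slightly fuller sketch of the same numeric-conversion idea follows, with three assumptions that are not part of the reply above. First, the dataset names `have` and `flagged` and the helper variable `code` are hypothetical. Second, the `??` modifier is added so that codes containing letters convert quietly to missing instead of writing errors to the log. Third, the upper bound is written as `< 489` on the guess that "480 thru 488" is meant to include subcodes such as 488.1. The flag is set to 0 once before the loop, so a match in any of the 20 variables is not overwritten by a later non-match.

```sas
data flagged;
    set have;                        /* "have" is a hypothetical input dataset */
    array icdvar {*} icd1-icd20;     /* existing character variables */

    Respiratory = 0;                 /* initialise once, outside the loop */
    do i = 1 to dim(icdvar);
        /* ?? suppresses log errors; letter codes become missing */
        code = input(icdvar{i}, ?? best12.);
        if code in (462, 464, 465, 466, 507, 997.31)
           or (480 <= code < 489) then Respiratory = 1;
    end;
    drop i code;
run;
```

A missing `code` fails both tests, so codes with letters simply leave the flag unchanged.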
