How can I list a range of character values without having to type each one of them?

Occasional Contributor
Posts: 7

I have ICD9 codes stored as character data. I would like to create new numeric variables based on the values of the ICD code variables. I do not want to type every code there is, and I don't know them all.

ICD1 codes = 462, 464, 465, 466, 480 thru 488; 507.0; 997.31 >> Respiratory = 1, else = 0;

ICD1 codes = 599.0 thru 599.5 >> Infectious = 1; else = 0;

ICD1 codes = 787.20 thru 787.29 >> Dysphagia = 1

ICD1 codes = 308 and 310 >> Agitation = 1

ICD1 codes = 707.00 thru 707.9 >> SkinUlcer = 1

Thanks!!

Super User
Posts: 5,516

Re: How can I list a range of character values without having to type each one of them?

Posted in reply to RoseRahman

In a DATA step, try:

icd1_numeric = input(icd1, 8.);

Be aware that some ICD9 codes contain letters. This won't work unless the incoming value is purely numeric.
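As a minimal sketch of that conversion when some codes do contain letters: the `??` modifier on the INPUT function suppresses the invalid-data notes in the log and returns a missing value instead. The dataset name `have` and the sample values are made up for illustration.

```sas
/* Hypothetical sample data, for illustration only */
data have;
    input icd1 $;
    datalines;
462
599.0
V10.3
;
run;

data want;
    set have;
    /* ?? suppresses the invalid-data note and keeps _ERROR_ at 0;
       a code that contains a letter comes back as missing (.) */
    icd1_numeric = input(icd1, ?? 8.);
run;
```

Codes that come back missing would still need to be handled as character values.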

Good luck.

Occasional Contributor
Posts: 7

Re: How can I list a range of character values without having to type each one of them?

Posted in reply to Astounding

They do contain letters. In fact, I have 20 ICD code variables. So I can't change the data type to numeric as you suggested.

Super User
Posts: 7,977

Re: How can I list a range of character values without having to type each one of them?

Posted in reply to RoseRahman

I would suggest going to the ICD web page and downloading the latest code list. It is provided in txt and Excel formats, so you can easily import it and then merge your data against that list:

ICD-9-CM Diagnosis and Procedure Codes: Abbreviated and Full Code Titles - Centers for Medicare & Me...

As for your categorisation (462 = respiratory), I do not know how you came to that conclusion, as that classification doesn't appear to exist in ICD.

Occasional Contributor
Posts: 7

Re: How can I list a range of character values without having to type each one of them?

I don't quite understand what your suggestions are. I pulled the xls file, but the codes didn't look like anything I have (no decimals). My problem is not finding the ICD codes for the categories but listing the codes that are in sequence without having to type them all (because I couldn't possibly know each one of them). I wanted to see if I can do something like this, which works for numeric variables:

     if ICD1 in (462, 464, 465, 466, 480-488, 507.0, 997.31) then Respiratory = 1; else Respiratory = 0;

Any ideas how to deal with character variables?

Occasional Contributor
Posts: 9

Re: How can I list a range of character values without having to type each one of them?

Posted in reply to RoseRahman

I'm not 100% clear on what you're trying to do, but I'm assuming you've got a file with a code and its value (description), and you want to assign your own value based on some kind of logic.

I would suggest importing your original list into a dataset, then using PROC FORMAT to read from it and applying your logic to assign whatever values you want.

I know this isn't a very elaborate answer, but it's all I can suggest based on my understanding.

Good luck.

Super User
Posts: 5,516

Re: How can I list a range of character values without having to type each one of them?

Posted in reply to RoseRahman

First a word of warning ... if you don't know all the values you are looking for, you can never be sure that any solution is correct.

There are ways to abbreviate character searches.  Consider this:

if icd1 =: '599.' then infectious=1; else infectious=0;

For any icd1 that begins with the characters "599.", the statement assigns 1.  That would include other values that perhaps should not be assigned in that way, such as "599.9" and "599.V".  So there are risks.
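A small sketch of how the `=:` comparison behaves, using made-up test values. The comparison only looks at as many characters as the shorter operand has, here the four characters of '599.'.

```sas
data _null_;
    length icd1 $8;
    /* Made-up values: two real-looking codes and two without the decimal point */
    do icd1 = '599.0', '599.9', '599', '5990';
        /* 1 when the first four characters of icd1 are '599.', otherwise 0 */
        infectious = (icd1 =: '599.');
        put icd1= infectious=;
    end;
run;
```

Under these assumptions, the first two values should be flagged and the last two should not, which also shows why '599.9' is picked up even though it lies outside 599.0 thru 599.5.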

You can use this technique with a list of values:

if icd1 in: ('466.', '488.') then respiratory=1; else respiratory=0;

Of course, you'll have to expand the list, but you don't need to know all the "488." series of values. 

Finally, when using :, make sure you code the decimal point.  If you were to code without a decimal point you are taking risks:

if icd1 in: ('466', '488') then ...

This would also give you a match for 4-digit codes that begin with 466 or 488, such as "4662.1".  I'm not saying that this is a valid code (I don't really know), but you never know what will actually appear in your data.
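Putting the prefix technique together for several ICD variables might look like the sketch below. It assumes a hypothetical dataset `have` with character variables ICD1-ICD20 that are at least 6 characters long (with a shorter variable, the prefix comparison would be cut to the variable's length and could match too much). The flags are set to 0 once per row and only switched on inside the loop, so a later non-matching variable cannot reset an earlier match.

```sas
/* Sketch only: HAVE is a hypothetical dataset with character variables ICD1-ICD20 */
data want;
    set have;
    array _icd {*} icd1-icd20;   /* variables already exist as character in HAVE */

    /* Start each flag at 0 once per row, outside the loop */
    Infectious = 0;
    Dysphagia  = 0;
    SkinUlcer  = 0;

    do i = 1 to dim(_icd);
        /* 599.0 thru 599.5: one prefix per first decimal digit */
        if _icd{i} in: ('599.0', '599.1', '599.2', '599.3', '599.4', '599.5')
            then Infectious = 1;
        /* 787.20 thru 787.29: every code that starts with 787.2 */
        if _icd{i} =: '787.2' then Dysphagia = 1;
        /* 707.00 thru 707.9: every code that starts with 707. */
        if _icd{i} =: '707.' then SkinUlcer = 1;
    end;
    drop i;
run;
```

Listing the prefixes '599.0' through '599.5' individually keeps '599.9' out of the Infectious group, at the cost of a slightly longer list.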

Good luck.

Occasional Contributor
Posts: 7

Re: How can I list a range of character values without having to type each one of them?

Posted in reply to Astounding

I think you understood what I'm trying to do. Yes, I do not know all the exact values down to 2 decimals. But I'll give your code a try. Thank you.

Super User
Posts: 11,343

Re: How can I list a range of character values without having to type each one of them?

Posted in reply to RoseRahman

If your source document groups "like" codes together, PROC FORMAT may be a way to go. But without seeing the source, it is a bit difficult to make specific suggestions.

Did someone provide a document of which codes get which assignment? If you could share that, we might have a few ideas.
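For reference, a PROC FORMAT approach could be sketched as follows, using the groupings from the original question. The format name `$icdgrp`, the dataset `have`, and the upper bound '488.99' (which assumes "thru 488" is meant to include 488.x codes) are assumptions. Character ranges are compared as text, so a malformed value such as '4801' would also fall inside '480'-'488.99'.

```sas
proc format;
    /* Character ranges are compared as text, character by character */
    value $icdgrp
        '462', '464', '465', '466',
        '480'-'488.99', '507.0', '997.31' = 'Respiratory'
        '599.0'-'599.5'                   = 'Infectious'
        '787.20'-'787.29'                 = 'Dysphagia'
        '308', '310'                      = 'Agitation'
        '707.00'-'707.9'                  = 'SkinUlcer'
        other                             = 'None';
run;

/* Sketch only: HAVE is a hypothetical dataset with character variables ICD1-ICD20 */
data want;
    set have;
    array _icd {*} icd1-icd20;

    /* Flags start at 0 and are switched on when any of the 20 codes matches */
    Respiratory = 0; Infectious = 0; Dysphagia = 0; Agitation = 0; SkinUlcer = 0;

    do i = 1 to dim(_icd);
        select (put(_icd{i}, $icdgrp.));
            when ('Respiratory') Respiratory = 1;
            when ('Infectious')  Infectious  = 1;
            when ('Dysphagia')   Dysphagia   = 1;
            when ('Agitation')   Agitation   = 1;
            when ('SkinUlcer')   SkinUlcer   = 1;
            otherwise;   /* 'None': leave the flags unchanged */
        end;
    end;
    drop i;
run;
```

Keeping the ranges in a format means the code groupings live in one place and can be changed without touching the DATA step.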

Occasional Contributor
Posts: 7

Re: How can I list a range of character values without having to type each one of them?

They are stated in my initial posting. The only difference is that I have ICD1 through ICD20, so I need to recode all of these ICD variables as below. The highlighted values are the ones I have problems with, since character values don't work that way.

array _icdvar ICD1--ICD20;

     do over _icdvar;

          if _icdvar in ('462', '464', '465', '466', '480--488', '507.0', '997.31') then Respiratory = 1; else Respiratory = 0;

          if _icdvar in ('599.0--599.5') then Infectious = 1; else Infectious = 0;

          if _icdvar in ('787.20'--'787.29') then Dysphagia = 1; else Dysphagia = 0;

          if _icdvar in ('308', '310') then Agitation = 1; else Agitation = 0;

          if _icdvar in ('707.00'--'707.9') then SkinUlcer = 1; else SkinUlcer = 0;

     end;

Super User
Posts: 7,977

Re: How can I list a range of character values without having to type each one of them?

Posted in reply to RoseRahman

Then just use the character-to-number version:

array _icdvar ICD1--ICD20;

     do over _icdvar;

          if input(_icdvar,best.) in (462,464,465,466,507,997.31) or (480 <= input(_icdvar,best.) <= 488) then Respiratory = 1; else Respiratory = 0;

...

     end;
