Solved: Re: drop or add 1 or 2 trailing zeros in ICD-9 codes

jsmall · Posted 06-25-2019 10:36 AM

Hi everyone,

I'm working with ICD-9 codes that were entered as text, which means clinicians could have coded certain disorders 2 ways ( i.e. 307.2 or 307.20) or 3 ways (i.e. 303, 303.0, 303.00). FYI- These are formatted as character, not numeric.

In order to tabulate total number of specific diagnoses, I need these to all be in the same format. How do I add trailing zeros since only 1 is missing in some fields or 2 could be missing (meaning, I can't just multiply by 10 or 100)

I've looked through the community questions here but can't find any that work in ths situation.

Thanks for any help!

Jen

art297 · Posted 06-25-2019 02:02 PM

You could try something like:

data have;
  input have $;
  cards;
300
300.0
300.000
300.1
300.10
V71.09
V71.1
;

data want(drop=leftpart rightpart x);
  length leftpart $8;
  length rightpart $3;
  length want $8;
  set have;
  x=index(have,'.');
  if x gt 0 then do;
    leftpart=substr(have,1,x-1);
    if x lt length(have) then rightpart=catt('.',substr(have,x+1),'00');
    else rightpart='.00';
  end;
  else do;
    leftpart=have;
    rightpart='.00';
  end;
  want=catt(leftpart,rightpart);
run;

HTH,

Art, CEO, AnalystFinder.com

View solution in original post

koyelghosh · Posted 06-25-2019 12:15 PM

I would have done it like this. I have assumed this is a DATA step and you have a file that you are reading with INFILE Statement. If this assumption is not correct, then you can skip this post.

Step 1: In the INFILE statement add the option "PAD".

Step 2: Give the appropriate width of the character (using @ and $w).

Step 2: Format the Column (Say 'Dis_Index') to numeric format with appropriate width and 2 decimals ("FORMAT DIs_Index 8.2;" <--- should be fine in many cases).

Something similar was discussed here (https://communities.sas.com/t5/New-SAS-User/Basic-SAS-program/m-p/568180#M11634) but the option of PAD was advised against for obvious reasons. However in this case, your requirement is exactly to PAD everything to 2 decimals, so may be this is what you wanted. If not, I have read the posted problem as wrong.

jsmall · Posted 06-25-2019 01:06 PM

Hi,

Thanks for your answer- I'm importing a large SAS dataset and not using the infile statement just setting it.. I tried using infile and SAS gave me a hard time. And yes, I'm trying to code this in a data step.

The problem with your solution is that I have V codes, like V71.09, in this variable as well (should have mentioned that earlier) so I can't change these to numeric.

So, I'm trying to figure out a way to add either no zeros, 1 or 2 zeros to character variables.

Sorry, I don't have any code written yet, because I'm not sure where to start..

Thanks for any help!

ballardw · Posted 06-25-2019 01:48 PM

First thing, is the current length of the variable holding the ICD code long enough so that if you need to add one (or two) characters to it that the value will not get truncated?

If the defined length of the code is 6 characters and you have a value like V70.91 and need to add anything to it you can't because that value is already 6 characters long.

Second, what is the rule , or rules, for determining if zero, one or two zeroes need to be added?

I know that ICD codes have variable lengths because each number represents some level of code, and the extra periods introduce even more subcode possibilities. So there is some complexity in determining exactly how many characters are involved.

You can pull apart an ICD code to get everything before the last . and the bit after it, if any with this code:

data example;
   code='V71.09';
   stem = substr(code,1,findc(code,'.','b')-1);
   numchar= scan(code,-1,'.');
 
run;

However some issues remain. If there is no period this doesn't work because there is no piece after the period. So one case to address.

Another is if the last character is a period then the "numchar" will equal the stem. Another case to consider.

Consider means: test for this condition and decide what to do with your value.

art297 · Posted 06-25-2019 02:02 PM

You could try something like:

data have;
  input have $;
  cards;
300
300.0
300.000
300.1
300.10
V71.09
V71.1
;

data want(drop=leftpart rightpart x);
  length leftpart $8;
  length rightpart $3;
  length want $8;
  set have;
  x=index(have,'.');
  if x gt 0 then do;
    leftpart=substr(have,1,x-1);
    if x lt length(have) then rightpart=catt('.',substr(have,x+1),'00');
    else rightpart='.00';
  end;
  else do;
    leftpart=have;
    rightpart='.00';
  end;
  want=catt(leftpart,rightpart);
run;

HTH,

Art, CEO, AnalystFinder.com

jsmall · Posted 06-25-2019 02:10 PM

I don't know what's going on in those steps, but it worked!

Thank you!! 🙂

jsmall · Posted 06-25-2019 02:40 PM

Can I ask one quick question? I tried to make an array to do this across 5 different variables. My code is below. The only output I'm getting is for the last variable (ax3) . Any suggestions? Thank you!!

data want;
length leftpart $8;
length rightpart $3;
length ax1 ax12 axsa ax2 ax3 $8;
set n6;

array sub {5} axis_i_mh axis_i_mh2 axis_i_sa axis_ii axis_iii ;
array x {5} x1 x2 x3 x4 x5 ;
do i=1 to 5;
x {5}=index(sub {5},'.');
if x {5} gt 0 then do;
leftpart=substr(sub {5},1,x {5}-1);
if x {5} lt length(sub {5}) then rightpart=catt('.',substr(sub{5},x{5}+1),'00');
else rightpart='.00';
end;
else do;
leftpart=sub{5};
rightpart='.00';
end;

array sub2 {5} ax1 ax12 axsa ax2 ax3 ;
do i=1 to 5;
sub2 {5}=catt(leftpart,rightpart);
end;
end;

drop leftpart rightpart x1 x2 x3 x4 x5 ;
run;

art297 · Posted 06-25-2019 03:11 PM

Try something like:

data have;
  infile cards truncover;
  input (axis_i_mh axis_i_mh2 axis_i_sa axis_ii axis_iii)  ($);
  cards;
300 20.5 200.5
300.0 200 400.1 500.10
300.000 400.000 50.000 60.000 70.000
300.1
300.10
V71.09
V71.1
;

data want(drop=leftpart rightpart x);
  length leftpart $8;
  length rightpart $3;
  length want1-want5 $8;
  set have;
  array have(5) $ axis_i_mh axis_i_mh2 axis_i_sa axis_ii axis_iii;
  array want(5) $;
  do i=1 to 5;
    if not missing(have(i)) then do;
      x=index(have(i),'.');
      if x gt 0 then do;
        leftpart=substr(have(i),1,x-1);
        if x lt length(have(i)) then rightpart=catt('.',substr(have(i),x+1),'00');
        else rightpart='.00';
      end;
      else do;
        leftpart=have(i);
        rightpart='.00';
      end;
      want(i)=catt(leftpart,rightpart);
    end;
  end;
run;

If you can be more specific regarding what you don't understand it should be easy to explain those part(s) of the code.

Art, CEO, AnalystFinder.com

jsmall · Posted 06-25-2019 03:27 PM

Hi, thanks again! Works perfectly!

Ready to join fellow brilliant minds for the SAS Hackathon?

Classroom Training Available!