BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
jsmall
Calcite | Level 5

Hi everyone,


I'm working with ICD-9 codes that were entered as text, which means clinicians could have coded certain disorders 2 ways ( i.e. 307.2 or 307.20) or 3 ways (i.e. 303, 303.0, 303.00). FYI- These are formatted as character, not numeric. 

In order to tabulate total number of specific diagnoses, I need these to all be in the same format. How do I add trailing zeros since only 1 is missing in some fields or 2 could be missing (meaning, I can't just multiply by 10 or 100)

 

I've looked through the community questions here but can't find any that work in ths situation. 

Thanks for any help! 

Jen 

1 ACCEPTED SOLUTION

Accepted Solutions
art297
Opal | Level 21

You could try something like:

data have;
  input have $;
  cards;
300
300.0
300.000
300.1
300.10
V71.09
V71.1
;

data want(drop=leftpart rightpart x);
  length leftpart $8;
  length rightpart $3;
  length want $8;
  set have;
  x=index(have,'.');
  if x gt 0 then do;
    leftpart=substr(have,1,x-1);
    if x lt length(have) then rightpart=catt('.',substr(have,x+1),'00');
    else rightpart='.00';
  end;
  else do;
    leftpart=have;
    rightpart='.00';
  end;
  want=catt(leftpart,rightpart);
run;

HTH,

Art, CEO, AnalystFinder.com

 

View solution in original post

8 REPLIES 8
koyelghosh
Lapis Lazuli | Level 10

I would have done it like this. I have assumed this is a DATA step and you have a file that you are reading with INFILE Statement. If this assumption is not correct, then you can skip this post.

 

Step 1: In the INFILE statement add the option "PAD".

Step 2: Give the appropriate width of the character (using @ and $w).

Step 2: Format the Column (Say 'Dis_Index') to numeric format with appropriate width and 2 decimals ("FORMAT DIs_Index 8.2;" <--- should be fine in many cases).

 

Something similar was discussed here (https://communities.sas.com/t5/New-SAS-User/Basic-SAS-program/m-p/568180#M11634) but the option of PAD was advised against for obvious reasons. However in this case, your requirement is exactly to PAD everything to 2 decimals, so may be this is what you wanted. If not, I have read the posted problem as wrong.

jsmall
Calcite | Level 5

Hi,

 

Thanks for your answer- I'm importing a large SAS dataset and not using the infile statement just setting it.. I tried using infile and SAS gave me a hard time.  And yes, I'm trying to code this in a data step.

The problem with your solution is that I have V codes, like V71.09, in this variable as well (should have mentioned that earlier) so I can't change these to numeric.  

So, I'm trying to figure out a way to add either no zeros, 1 or 2 zeros to character variables. 

Sorry, I don't have any code written yet, because I'm not sure where to start..

 

Thanks for any help! 

ballardw
Super User

First thing, is the current length of the variable holding the ICD code long enough so that if you need to add one (or two) characters to it that the value will not get truncated?

If the defined length of the code is 6 characters and you have a value like V70.91 and need to add anything to it you can't because that value is already 6 characters long.

 

Second, what is the rule , or rules, for determining if zero, one or two zeroes need to be added?

I know that ICD codes have variable lengths because each number represents some level of code, and the extra periods introduce even more subcode possibilities. So there is some complexity in determining exactly how many characters are involved.

 

You can pull apart an ICD code to get everything before the last . and the bit after it, if any with this code:

data example;
   code='V71.09';
   stem = substr(code,1,findc(code,'.','b')-1);
   numchar= scan(code,-1,'.');
 
run;

However some issues remain. If there is no period this doesn't work because there is no piece after the period. So one case to address.

Another is if the last character is a period then the "numchar" will equal the stem. Another case to consider.

 

Consider means: test for this condition and decide what to do with your value.

 

art297
Opal | Level 21

You could try something like:

data have;
  input have $;
  cards;
300
300.0
300.000
300.1
300.10
V71.09
V71.1
;

data want(drop=leftpart rightpart x);
  length leftpart $8;
  length rightpart $3;
  length want $8;
  set have;
  x=index(have,'.');
  if x gt 0 then do;
    leftpart=substr(have,1,x-1);
    if x lt length(have) then rightpart=catt('.',substr(have,x+1),'00');
    else rightpart='.00';
  end;
  else do;
    leftpart=have;
    rightpart='.00';
  end;
  want=catt(leftpart,rightpart);
run;

HTH,

Art, CEO, AnalystFinder.com

 

jsmall
Calcite | Level 5

I don't know what's going on in those steps, but it worked! 

Thank you!!  🙂 

 

 

jsmall
Calcite | Level 5

Can I ask one quick question? I tried to make an array to do this across 5 different variables. My code is below. The only output I'm getting is for the last variable (ax3) . Any suggestions? Thank you!!

 

 

data want;
length leftpart $8;
length rightpart $3;
length ax1 ax12 axsa ax2 ax3 $8;
set n6;


array sub {5} axis_i_mh axis_i_mh2 axis_i_sa axis_ii axis_iii ;
array x {5} x1 x2 x3 x4 x5 ;
do i=1 to 5;
x {5}=index(sub {5},'.');
if x {5} gt 0 then do;
leftpart=substr(sub {5},1,x {5}-1);
if x {5} lt length(sub {5}) then rightpart=catt('.',substr(sub{5},x{5}+1),'00');
else rightpart='.00';
end;
else do;
leftpart=sub{5};
rightpart='.00';
end; 

 

array sub2 {5} ax1 ax12 axsa ax2 ax3 ;
do i=1 to 5;
sub2 {5}=catt(leftpart,rightpart);
end;
end;


drop leftpart rightpart x1 x2 x3 x4 x5 ;
run;

art297
Opal | Level 21

Try something like:

data have;
  infile cards truncover;
  input (axis_i_mh axis_i_mh2 axis_i_sa axis_ii axis_iii)  ($);
  cards;
300 20.5 200.5
300.0 200 400.1 500.10
300.000 400.000 50.000 60.000 70.000
300.1
300.10
V71.09
V71.1
;

data want(drop=leftpart rightpart x);
  length leftpart $8;
  length rightpart $3;
  length want1-want5 $8;
  set have;
  array have(5) $ axis_i_mh axis_i_mh2 axis_i_sa axis_ii axis_iii;
  array want(5) $;
  do i=1 to 5;
    if not missing(have(i)) then do;
      x=index(have(i),'.');
      if x gt 0 then do;
        leftpart=substr(have(i),1,x-1);
        if x lt length(have(i)) then rightpart=catt('.',substr(have(i),x+1),'00');
        else rightpart='.00';
      end;
      else do;
        leftpart=have(i);
        rightpart='.00';
      end;
      want(i)=catt(leftpart,rightpart);
    end;
  end;
run;

If you can be more specific regarding what you don't understand it should be easy to explain those part(s) of the code.

 

Art, CEO, AnalystFinder.com

 

jsmall
Calcite | Level 5

Hi, thanks again! Works perfectly! 

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 8 replies
  • 1865 views
  • 0 likes
  • 4 in conversation