Bal23
Lapis Lazuli | Level 10

For records that contain either ICD-9 or ICD-10 codes, I want to develop a program with a check like this:

if upcase(substr(icd, 1, 1)) in ('E') and substr(icd, 4, 1) in ('.') then

Also, apart from a leading "V" or "E", the ICD-9 value should be numeric, and I need to check for that.

 

Any advice on how to do it?

Thanks.

Accepted Solution
Cynthia_sas
Diamond | Level 26

Hi:
  To work with your IF statements, somebody has to
1) make up some FAKE data and those who do not know ICD9 or ICD10 will make up the wrong type of FAKE data -- so you could help by posting an example of some data.
 
2) after they make up the FAKE data, then they have to make up a program -- you could help by supplying more of the program than just IF statements.

SAS has several functions that might help you here:
ANYDIGIT and ANYALPHA, as well as NOTDIGIT and NOTALPHA.
Check out the examples in the documentation:
http://support.sas.com/documentation/cdl/en/ds2ref/68052/HTML/default/viewer.htm#p1wmqu81yx6jwjn1tdh...
Using ANYALPHA and ANYDIGIT for testing might be easier than what you're doing.

For example, if you extract a substring of 4 characters like '2345', it will NOT match the single digits in your IN list, which is probably why your IF is not working.

cynthia

 

Here's a program to try that illustrates a better use of ANYALPHA and ANYDIGIT:

data wrongtest;
  got_e_v = 'n';
  got_number = 'n';
  code='E12345';
  firstchar = substr(code,1,1);
  if firstchar  in ('E', 'V') then got_e_v = 'y';

  next4 = substr(code,2,4);
  if next4 in ('0','1', '2', '3', '4', '5', '6', '7', '8', '9') then got_number = 'y';
run;
 
proc print data=wrongtest;
run;
  
data righttest;
  got_e_v = 'n';
  got_number = 'n';
  code='E12345';
  firstchar = substr(code,1,1);
  if firstchar  in ('E', 'V') then got_e_v = 'y';
 
  next4 = substr(code,2,4);
  ** anyalpha = 0 means there are no alphabetic characters in the string;
  ** anydigit gt 0 means the string contains at least one digit;
  ** together they rule out letters but not punctuation such as a decimal point;
  if anydigit(next4) gt 0 and anyalpha(next4) =0 then got_number = 'y';
run;

proc print data=righttest;
run;


7 REPLIES
Bal23
Lapis Lazuli | Level 10
if upcase(substr(icd, 1, 1)) in ( 'E' ) and substr(icd, 2,4) in ( '0', '1', '2', '3', '4', '5', '6', '7', '8', '9') then i9E="Y"; 

 if upcase(substr(icd, 1, 1)) in ( 'V' ) and substr(icd, 2,4) in ( '0','1', '2', '3', '4', '5', '6', '7', '8', '9') then i9E="Y";
else i9v="N";
if substr(icd, 1,3) in ( '0', '1', '2', '3', '4', '5', '6', '7', '8', '9') then i9n="Y";else i9n="N";

My code does not work. Can anybody tell me what the problem is? Thanks.

 

ballardw
Super User

You are comparing the 4-character value returned by substr(icd,2,4) with single characters, so the IN comparison never matches.

 

Try

if notdigit(strip(substr(icd,2,4))) = 0 then i9e='Y';
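
As a minimal sketch of that suggestion, the step below runs the NOTDIGIT test over a few made-up values. The data set name and the sample codes are hypothetical, not taken from the original post. NOTDIGIT returns the position of the first character that is not a digit, or 0 when it finds none, so a result of 0 after STRIP means the extracted piece holds only digits.

```sas
data check_digits;                 /* hypothetical data set name */
  length icd $8 i9e $1;
  input icd $;
  i9e = 'N';
  /* STRIP removes the trailing blanks so they are not counted as non-digits. */
  /* The LENGTHN guard skips values that have nothing after the first letter. */
  if upcase(substr(icd, 1, 1)) = 'E'
     and lengthn(icd) > 1
     and notdigit(strip(substr(icd, 2, 4))) = 0 then i9e = 'Y';
  datalines;
E1234
E12A4
V1234
;
run;

proc print data=check_digits;
run;
```

Only E1234 should come out with i9e = 'Y': E12A4 has a letter in position 4, and V1234 does not start with E. A value with a decimal point, such as E12.4, would also fail this test, because the period is not a digit. Note that the STRIP call is nested inside NOTDIGIT on purpose; assigning the stripped value to a variable first would pad it with blanks again.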

 


Tom
Super User

What are you trying to do?

Are you attempting to check if a particular string has the valid format for an ICD code?

Perhaps regular expressions are the way to go.
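
One way to sketch the regular-expression idea is with PRXMATCH. The pattern below is an assumption about the layout: a V plus two digits, an E plus three digits, or three digits, each optionally followed by a decimal point and one or two digits. That layout was not given in the thread, so adjust it to the codes in your data. The data set name and the sample values are made up.

```sas
data check_format;                 /* hypothetical data set name */
  length icd $8 valid $1;
  input icd $;
  /* ^ and $ anchor the whole value; \s* tolerates the trailing blanks SAS pads with. */
  /* The trailing i makes the match case-insensitive.                                */
  if prxmatch('/^(V\d{2}|E\d{3}|\d{3})(\.\d{1,2})?\s*$/i', icd) > 0 then valid = 'Y';
  else valid = 'N';
  datalines;
V10.1
E800.1
250.00
25A
E12A4
;
run;

proc print data=check_format;
run;
```

With this pattern the first three values are flagged 'Y' and the last two 'N'. Keeping the whole layout in one pattern means a change to the rules (for example, making the decimal point mandatory) only touches one line.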

Astounding
PROC Star

Here's a short way to check for "E" or "V":

 

if upcase(icd) in : ('E', 'V')

 

To check for a number:

 

if input(icd, ??8.) > 0

 

Combined, they would be:

 

if upcase(icd) in : ('E', 'V') or input(icd, ??8.) > . then ....

 

Do you also need to check the fourth character for a decimal point?  If so, would that apply in every case or just for the E/V or just for the non-E/V values?

Bal23
Lapis Lazuli | Level 10

if input(icd, ??8.) > 0

What does ?? stand for? Should I replace it with something?

Yes, I want to check for a decimal point. For a code that starts with "E", the fourth position is where a "." might appear; for a code that starts with "V", it is the third position.

There might also be no "."; in that case the value just ends there. An ICD-9 code can be 3 to 5 characters long, and characters 2 through 5 are numeric.

Astounding
PROC Star

No substitution is needed for ??. Use it exactly as written.

 

When the INPUT function attempts to read something as numeric, it would issue a message when a non-numeric is found.  The ?? suppresses dozens and dozens of messages about finding an invalid numeric value.
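
A small illustration with made-up values (the data set and variable names are hypothetical): without ??, the second observation would write an invalid-argument note to the log and set _ERROR_ to 1; with ??, the result is simply a missing value.

```sas
data demo_qq;                      /* hypothetical data set name */
  length icd $8;
  input icd $;
  /* ?? suppresses the invalid-data note and keeps _ERROR_ at 0 */
  num = input(icd, ??8.);
  /* 1 when the value could be read as a number, 0 when INPUT returned missing */
  is_numeric = (num > .);
  datalines;
250.1
V10.1
;
run;

proc print data=demo_qq;
run;
```

Here 250.1 is read as a number, so is_numeric is 1, while V10.1 cannot be read and ends up with a missing num and is_numeric equal to 0.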

 

This should be getting closer, if not all the way there:

 

if   ((upcase(icd) =: 'V' and (length(icd) < 4 or substr(icd,4, 1)='.'))
or   (upcase(icd) =: 'E' and (length(icd) < 3 or substr(icd,3, 1)='.')))
and   (input(substr(icd,2), ??5.) > .) then .....
