how can I find the "." from those values

Super Contributor
Posts: 345
For values that contain either ICD-9 or ICD-10 codes, I want to develop a program that does something like this:

if upcase(substr(icd, 1, 1)) in ('E') and substr(icd, 4, 1) in ('.') then

Apart from a leading "V" or "E", I also need to check that the rest of the ICD-9 value is numeric.

Any advice on how to do it?

Thanks.


Accepted Solutions
Solution
‎07-18-2016 05:36 PM
SAS Super FREQ
Posts: 8,868

Re: how can I find the "." from those values

Hi:
  To work with your IF statements, somebody has to
1) make up some FAKE data, and those who do not know ICD9 or ICD10 will make up the wrong type of FAKE data -- so you could help by posting an example of some data.
2) after making up the FAKE data, make up a program around it -- so you could help by supplying more of the program than just IF statements.

SAS has several functions that might help you here:
ANYDIGIT and ANYALPHA, and also NOTDIGIT and NOTALPHA.
Check out the examples in the documentation:
http://support.sas.com/documentation/cdl/en/ds2ref/68052/HTML/default/viewer.htm#p1wmqu81yx6jwjn1tdh...
Using ANYALPHA and ANYDIGIT for testing might be easier than what you're doing now.

For example, if you extract a substring of 4 characters like '2345', it will NOT match any of the single digits in your IN list, which is probably why your IF is not working.

cynthia

 

Here's a program to try that illustrates a better use of ANYALPHA and ANYDIGIT:

data wrongtest;
  got_e_v = 'n';
  got_number = 'n';
  code='E12345';
  firstchar = substr(code,1,1);
  if firstchar  in ('E', 'V') then got_e_v = 'y';

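  next4 = substr(code,2,4);
  ** next4 is the 4-character string 1234, which is not equal to any single digit, so got_number stays n;
  if next4 in ('0','1', '2', '3', '4', '5', '6', '7', '8', '9') then got_number = 'y';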
run;
 
proc print data=wrongtest;
run;
  
data righttest;
  got_e_v = 'n';
  got_number = 'n';
  code='E12345';
  firstchar = substr(code,1,1);
  if firstchar  in ('E', 'V') then got_e_v = 'y';
 
  next4 = substr(code,2,4);
  ** anyalpha = 0 means the string has no alphabetic characters;
  ** anydigit gt 0 means the string has at least one digit;
  ** note that punctuation such as a decimal point would still pass both tests;
  if anydigit(next4) gt 0 and anyalpha(next4) =0 then got_number = 'y';
run;

proc print data=righttest;
run;

All Replies
Super Contributor
Posts: 345

Re: how can I find the "." from those values

if upcase(substr(icd, 1, 1)) in ( 'E' ) and substr(icd, 2,4) in ( '0', '1', '2', '3', '4', '5', '6', '7', '8', '9') then i9E="Y"; 

 if upcase(substr(icd, 1, 1)) in ( 'V' ) and substr(icd, 2,4) in ( '0','1', '2', '3', '4', '5', '6', '7', '8', '9') then i9E="Y";
else i9v="N";
if substr(icd, 1,3) in ( '0', '1', '2', '3', '4', '5', '6', '7', '8', '9') then i9n="Y";else i9n="N";

My code does not work. Can anybody tell me what the problem is? Thanks.

 

Super User
Posts: 11,343

Re: how can I find the "." from those values

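You are comparing the 4-character value returned by substr(icd,2,4) with single characters, so the IN test fails whenever more than one digit follows the letter.

Try

if upcase(substr(icd, 1, 1)) = 'E' and notdigit(strip(substr(icd, 2, 4))) = 0 then i9E = 'Y';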
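
A minimal sketch of the NOTDIGIT suggestion as a complete step. It assumes a character variable icd of length 6; the data set name notdigit_demo and the sample values are made up for illustration and are not from the original post.

```sas
data notdigit_demo;
  length icd $6 i9e $1;
  input icd $;
  /* STRIP removes the trailing blanks that SUBSTR returns for short codes, */
  /* so NOTDIGIT = 0 means every remaining character is a digit.            */
  if upcase(substr(icd, 1, 1)) = 'E'
     and notdigit(strip(substr(icd, 2, 4))) = 0 then i9e = 'Y';
  else i9e = 'N';
  datalines;
E8120
E812
E81A0
V1234
;
run;

proc print data=notdigit_demo;
run;
```

With these values, E8120 and E812 should come out as Y, E81A0 as N because NOTDIGIT finds the letter A, and V1234 as N because the first character is not E.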

 

Super User
Posts: 7,078

Re: how can I find the "." from those values

What are you trying to do?

Are you attempting to check if a particular string has the valid format for an ICD code?

Perhaps regular expressions are the way to go.
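
As a rough sketch of the regular-expression route, the step below uses PRXMATCH to test the whole value in one pass. The pattern is an assumption based on the usual ICD-9-CM layout (three digits, or V plus two digits, or E plus three digits, each optionally followed by a decimal point and more digits); the positions described elsewhere in this thread may differ, so adjust the pattern to your own data. The data set name icd9_check and the sample values are hypothetical.

```sas
data icd9_check;
  length icd $8 valid $1;
  input icd $;
  /* Assumed layouts: 999[.99], V99[.99], E999[.9]; the decimal point is optional. */
  /* STRIP removes trailing blanks so that $ anchors at the end of the code.       */
  if prxmatch('/^(\d{3}(\.?\d{1,2})?|V\d{2}(\.?\d{1,2})?|E\d{3}(\.?\d)?)$/i',
              strip(icd)) > 0 then valid = 'Y';
  else valid = 'N';
  datalines;
401.9
4019
V10.1
E812.0
V1A
40
;
run;

proc print data=icd9_check;
run;
```

Under the assumed pattern, the first four values should be flagged Y, while V1A (a letter where a digit is required) and 40 (too short) should be flagged N.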

Super User
Posts: 5,518

Re: how can I find the "." from those values

Here's a short way to check for "E" or "V":

 

if upcase(icd) in : ('E', 'V')

 

To check for a number:

 

if input(icd, ??8.) > 0

 

Combined, they would be:

 

if upcase(icd) in : ('E', 'V') or input(icd, ??8.) > . then ....

 

Do you also need to check the fourth character for a decimal point?  If so, would that apply in every case or just for the E/V or just for the non-E/V values?

Super Contributor
Posts: 345

Re: how can I find the "." from those values

Posted in reply to Astounding

if input(icd, ??8.) > 0

What does ?? stand for? Should I replace it with something?

Yes, I want to check for a decimal point. For a code that starts with "E", the "." can only be in the fourth position; for a code that starts with "V", it can only be in the third position.

There might also be no "." at all, in which case the value simply ends. An ICD-9 code can be 3-5 characters long, and characters 2-5 are numeric.

Super User
Posts: 5,518

Re: how can I find the "." from those values

No substitution is needed; type the ?? exactly as shown.

When the INPUT function attempts to read something as numeric, it issues a message whenever it finds a non-numeric value. The ?? suppresses those messages, which could otherwise run to dozens and dozens of notes about invalid numeric values.
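
To see the effect of ??, here is a small sketch that converts a few made-up values and flags the ones that read as numbers. The data set name input_demo, the variable names num and is_numeric, and the sample values are hypothetical.

```sas
data input_demo;
  length icd $6;
  input icd $;
  /* ?? suppresses the invalid-data messages and keeps _ERROR_ from being set to 1. */
  num = input(icd, ?? 8.);
  /* A value that cannot be read as a number leaves num missing. */
  if num > . then is_numeric = 'Y';
  else is_numeric = 'N';
  datalines;
4019
401.9
V10
;
run;

proc print data=input_demo;
run;
```

Here 4019 and 401.9 should convert cleanly, while V10 should leave num missing without writing an invalid-data note to the log. Removing the ?? and rerunning is a quick way to compare the log output.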

 

This should be getting closer, if not all the way there:

 

if   ((upcase(icd) =: 'V' and (length(icd) < 4 or substr(icd,4, 1)='.'))

or   (upcase(icd) =: 'E' and (length(icd) < 3 or substr(icd,3, 1)='.')))

and   (input(substr(icd,2), ??5.) > .) then .....
