I have values that contain either ICD-9 or ICD-10 codes, and I want to develop a program that checks them, for example:
if upcase(substr(icd, 1, 1)) in ('E') and substr(icd, 4, 1) in ('.') then
I also need to check that, other than a leading "V" or "E", the value for ICD-9 is numeric.
Any advice on how to do it?
Thanks.
Hi:
To work with your IF statements, somebody has to
1) make up some FAKE data and those who do not know ICD9 or ICD10 will make up the wrong type of FAKE data -- so you could help by posting an example of some data.
2) after they make up the FAKE data, then they have to make up a program -- you could help by supplying more of the program than just IF statements.
SAS has several functions that might help you here:
ANYDIGIT and ANYALPHA, and also NOTDIGIT and NOTALPHA.
Check out the examples in the documentation:
http://support.sas.com/documentation/cdl/en/ds2ref/68052/HTML/default/viewer.htm#p1wmqu81yx6jwjn1tdh...
Using ANYALPHA and ANYDIGIT for testing might be easier than doing what you're doing.
For example, if you extract a 4-character substring such as '2345', it will NOT match any of the single digits in an IN list like the one you show, which is probably why your IF is not working.
cynthia
Here's a program to try that illustrates a better use of ANYALPHA and ANYDIGIT:
data wrongtest;
got_e_v = 'n';
got_number = 'n';
code='E12345';
firstchar = substr(code,1,1);
if firstchar in ('E', 'V') then got_e_v = 'y';
next4 = substr(code,2,4);
if next4 in ('0','1', '2', '3', '4', '5', '6', '7', '8', '9') then got_number = 'y';
run;
proc print data=wrongtest;
run;
data righttest;
got_e_v = 'n';
got_number = 'n';
code='E12345';
firstchar = substr(code,1,1);
if firstchar in ('E', 'V') then got_e_v = 'y';
next4 = substr(code,2,4);
** anyalpha = 0 means the string contains no alphabetic characters;
** anydigit gt 0 means the string contains at least one digit;
** together they rule out letters but not punctuation such as a decimal point;
if anydigit(next4) gt 0 and anyalpha(next4) =0 then got_number = 'y';
run;
proc print data=righttest;
run;
if upcase(substr(icd, 1, 1)) in ( 'E' ) and substr(icd, 2,4) in ( '0', '1', '2', '3', '4', '5', '6', '7', '8', '9') then i9E="Y";
if upcase(substr(icd, 1, 1)) in ( 'V' ) and substr(icd, 2,4) in ( '0','1', '2', '3', '4', '5', '6', '7', '8', '9') then i9E="Y";
else i9v="N";
if substr(icd, 1,3) in ( '0', '1', '2', '3', '4', '5', '6', '7', '8', '9') then i9n="Y";else i9n="N";
My code does not work. Can anybody tell me what the problem is? Thanks.
You are comparing the 4-character value returned by substr(icd,2,4) with a list of single characters, so the IN test never matches.
Try
if upcase(substr(icd, 1, 1)) = 'E' and notdigit(strip(substr(icd,2,4))) = 0 then i9e='Y';
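To see what these functions return, here is a minimal sketch you could run and check in the log. The variable names and test strings are made up for illustration. Each function returns the position of the first character that satisfies its test, or 0 if no character does.

```sas
data _null_;
   a = notdigit('1234');   /* 0: no non-digit found, every character is a digit */
   b = notdigit('12A4');   /* 3: position of the first non-digit */
   c = anydigit('E123');   /* 2: position of the first digit */
   d = anyalpha('1234');   /* 0: no alphabetic character found */
   put a= b= c= d=;
run;
```

So notdigit(...) = 0 is the test that says "digits only", provided trailing blanks have been stripped first.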
What are you trying to do?
Are you attempting to check if a particular string has the valid format for an ICD code?
Perhaps regular expressions are the way to go.
Here's a short way to check for "E" or "V":
if upcase(icd) in : ('E', 'V')
To check for a number:
if input(icd, ??8.) > 0
Combined, they would be:
if upcase(icd) in : ('E', 'V') or input(icd, ??8.) > . then ....
Do you also need to check the fourth character for a decimal point? If so, would that apply in every case or just for the E/V or just for the non-E/V values?
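As a rough sketch of the regular-expression idea, the step below uses PRXMATCH on a few made-up values. The pattern is only an assumption about the layout (V followed by two digits, E followed by three digits, or three digits alone, each optionally followed by a decimal point and one or two digits), so adjust it to whatever your data actually requires. The data set name, variable names, and sample values are all hypothetical.

```sas
data icd_check;
   length icd $8 valid $1;
   input icd $;
   /* Assumed layout: Vnn, Ennn or nnn, then an optional .n or .nn      */
   /* The trailing \s* allows for the blanks that pad a character value */
   /* The i suffix makes the match case-insensitive                     */
   if prxmatch('/^(V\d{2}|E\d{3}|\d{3})(\.\d{1,2})?\s*$/i', icd) then valid = 'Y';
   else valid = 'N';
   datalines;
V10.1
E8001
250.00
25A
;
run;

proc print data=icd_check;
run;
```

PRXMATCH returns the position where the pattern matches, or 0 when it does not, so it can be used directly as the IF condition.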
if input(icd, ??8.) > 0
What does ?? stand for? Should I replace it with something?
Yes, I want to check for a decimal point. For a code that starts with "E", the "." would possibly be in the fourth position; for one that starts with "V", it would be in the third position.
There might also be no "." at all, in which case the value stops there. An ICD-9 code can be 3-5 characters long, and characters 2-5 are numeric.
No substitutions needed in ??
When the INPUT function attempts to read something as numeric, it would issue a message when a non-numeric is found. The ?? suppresses dozens and dozens of messages about finding an invalid numeric value.
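Here is a small sketch you could run to see the effect. The variable names and test values are made up for illustration.

```sas
data _null_;
   good = input('250', ?? 8.);   /* readable as a number */
   bad  = input('V10', ?? 8.);   /* not numeric, so the result is missing */
   put good= bad=;
run;
```

The log should show good=250 bad=. with no invalid-data note for the second value. Remove the ?? and the same step writes a note about invalid data to the log.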
This should be getting closer, if not all the way there:
if ((upcase(icd) =: 'V' and (length(icd) < 4 or substr(icd,4, 1)='.'))
or (upcase(icd) =: 'E' and (length(icd) < 3 or substr(icd,3, 1)='.')))
and (input(substr(icd,2), ??5.) > .) then .....