BookmarkSubscribeRSS Feed
femiajumobi1
Quartz | Level 8

I will like to search for any of these alphanumeric codes ( J45.10, J45.20,J45.30) in a column titled (Diagnoses) which has text including different alphanumeric codes, and then assign their existence to categorical variable (AI). But my codes return errors. 

For  example, a cell in column "diagnoses" have  text such as "J45.10 throat ache, uncomplicated ; S81.011A Laceration without foreign body, right knee, initial encounter ; S50.311A Abrasion of right elbow, initial encounter "

Here is my code:

 

Data want;

if diagnoses in ('J45.10', 'J45.20', 'J45.30') then AI=1;

else AI=0;

run;

 

Thanks for your help.

5 REPLIES 5
antonbcristina
SAS Employee

Unfortunately using the IN operator here won't work since SAS interprets this statement:  

if diagnoses in ('J45.10', 'J45.20', 'J45.30') then AI=1;

 as:

if diagnoses ='J45.10' or diagnoses ='J45.20' or diagnoses ='J45.30' then AI=1;

Based on the example you provided for the values of Diagnoses, none of these conditions would ever result to True since they are much longer strings. 

 

What you could use instead is the FIND function which would allow you to search for a substring of characters within the variable Diagnoses like so: 

if find(diagnoses,"J45.10",'i')>0 then AI=1;

Since the FIND fuction will return the starting position of the substring within the string, you're looking for a value >0 to confirm whether it is present. 

femiajumobi1
Quartz | Level 8

@antonbcristina your suggestion returned missing for all observations.  The code did not work.

 

J45 codes are in a range J45.10 - J45.90 and I would like to find them in each cell within column "diagnoses" and output them as AI=1.

 

Thanks.

Patrick
Opal | Level 21

You can use the IN operator with the colon modifier to search for a list of substrings like J45.10 in your string stored in variable Diagnoses. You will need to list all the substrings explicitly.

 

Alternatively use the findw() function and search for J45. This of course will only work if there aren't any cases where you have J45 in the string but don't want to return true - let's say you wouldn't want true (AI=1) for J45.991

 

If below two coding alternatives don't return what you need then please provide sample data that include cases where things aren't working yet (=amend below data have step), explain the required logic and provide the desired outcome when using the sample data.

data have;
  infile datalines truncover;
  input diagnoses $200.;
  datalines4;
J45.10 throat ache, uncomplicated ; S81.011A Laceration without foreign body, right knee, initial encounter ; S50.311A Abrasion of right elbow, initial encounter 
X45.10 throat ache, uncomplicated ; S81.011A Laceration without foreign body, right knee, initial encounter ; S50.311A Abrasion of right elbow, initial encounter 
;;;;

data want;
  set have;
  if diagnoses in: ('J45.10', 'J45.20', 'J45.30') then AI=1;
  else AI=0;
run;

proc print data=want(keep=ai);
run;

data want2;
  set have;
  A1= findw(diagnoses,'J45');
run;

proc print data=want2;
run;
femiajumobi1
Quartz | Level 8

@Patrick 

 

This found the variables but ascribed zero to all of them 

data want;
  set have;
  if diagnoses in: ('J45.10', 'J45.20', 'J45.30') then AI=1;
  else AI=0;
run;

 

I switched it with AI =0 if any of the variables are found (see below) and it correctly ascribed 1 to all of them. It's weird! I want to ascribe AI=1 in any cell it finds 'J45.10', 'J45.20', 'J45.30' and AI=0 otherwise

 

I have over 14,000 observations with  a column 'diagnoses' which has text information beyond these alphanumeric variables: 'J45.10', 'J45.20', 'J45.30' and I wish to find where this exist in each diagnoses cell. Thanks

data want;
  set have;
  if diagnoses in: ('J45.10', 'J45.20', 'J45.30') then AI=0;
  else AI=1;
run;

 

Patrick
Opal | Level 21

@femiajumobi1 wrote:

@Patrick 

 

This found the variables but ascribed zero to all of them 

data want;
  set have;
  if diagnoses in: ('J45.10', 'J45.20', 'J45.30') then AI=1;
  else AI=0;
run;

 

I switched it with AI =0 if any of the variables are found (see below) and it correctly ascribed 1 to all of them. It's weird! I want to ascribe AI=1 in any cell it finds 'J45.10', 'J45.20', 'J45.30' and AI=0 otherwise

 


@femiajumobi1 

I can't replicate what you describe. Please provide representative sample data!

Just extend below HAVE table with additional sample data where things aren't working as expected and show us the desired result.

 

Below code return AI=1 for strings where one of the substring in the IN clause are found.

data have;
  infile datalines truncover;
  input diagnoses $200.;
  datalines4;
J45.10 throat ache, uncomplicated ; S81.011A Laceration without foreign body, right knee, initial encounter ; S50.311A Abrasion of right elbow, initial encounter 
X45.10 throat ache, uncomplicated ; S81.011A Laceration without foreign body, right knee, initial encounter ; S50.311A Abrasion of right elbow, initial encounter 
;;;;

data want;
  set have;
  if diagnoses in: ('J45.10', 'J45.20', 'J45.30') then AI=1;
  else AI=0;
run;

proc print data=want(keep=ai);
run;

Patrick_0-1697324969048.png

 

I have over 14,000 observations with  a column 'diagnoses' which has text information beyond these alphanumeric variables: 'J45.10', 'J45.20', 'J45.30' and I wish to find

If the number of ICD codes you want to search for becomes too big then some approach using a lookup table might be better suited. But we can't know if you don't tell us.

 

 

 

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 5 replies
  • 690 views
  • 0 likes
  • 3 in conversation