BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
RAGC
Obsidian | Level 7

Hi, 

 

I am working on a large dataset that lists all follow-up conditions in a range of columns using ICD10 codes. ICD10 codes thankfully have a similar pattern where there is a letter and then a range of number for similar disorders (e.g. Cancer is C000 to C969). I want to go through each column and create a new variable if the person has had one of those types of disorders. I have provided an example below. Disease_X_0 is the column with the ICD10 code, and Cancer is the new variable I would like to create.   

 

IDDisease_1_0Disease_2_0Disease_3_0Disease_4_0Cancer
1C001B005  1
2D55 C97 1
3 C97  1
4K00   0
5 D57  0

 

I feel like there is an easy way to do this but cannot seem to find it with all my googling! Thank you so much in advance for your time! 

1 ACCEPTED SOLUTION

Accepted Solutions
ed_sas_member
Meteorite | Level 14

you’re welcome !
You can mention a range as follows in the prxmatch function:
/C[0-500]/

If the ICD code must have 3 digits following the C letter, you can add this check :
AND prxmatch(‘/C\d\d\d/‘ ...)

if prxmatch('/C\d\d\d/',_Disease(i))>0 and prxmatch('/C[0-500]/',_Disease(i))>0  then cancer=1;

View solution in original post

10 REPLIES 10
RAGC
Obsidian | Level 7

Hi, 

 

I am working on a large dataset that lists all follow-up conditions in a range of columns using ICD10 codes. ICD10 codes thankfully have a similar pattern where there is a letter and then a range of number for similar disorders (e.g. Cancer is C000 to C969). I want to go through each column and create a new variable if the person has had one of those types of disorders. I have provided an example below. Disease_X_0 is the column with the ICD10 code, and Cancer is the new variable I would like to create.   

 

IDDisease_1_0Disease_2_0Disease_3_0Disease_4_0Cancer
1C001B005  1
2D55 C97 1
3 C97  1
4K00   0
5 D57  0

 

I feel like there is an easy way to do this but cannot seem to find it with all my googling! Thank you so much in advance for your time! 

ed_sas_member
Meteorite | Level 14

Hi @RAGC 

You can try the below code. Please avoid duplicate posts 😉

The array function enables to perform the same manipulation on multiple variables.

Hope this helps!

Best,

data have;
	infile datalines dlm="," dsd missover;
	input ID Disease_1_0 $ Disease_2_0 $ Disease_3_0 $ Disease_4_0 $;
	datalines;
1,C001,B005,,
2,D55, ,C97,
3,,C97,,,
4,K00,,,,
5,,D57,,,
;
run;

data want;
	set have;
	array _Disease (*) Disease_:;
	cancer=0;
	do i=1 to dim(_Disease);
		if prxmatch('/C\d+/',_Disease(i))>0 then cancer=1; /* \d+ mean 1 digit or more */
	end;
	drop i;
run;

 

 

ed_sas_member
Meteorite | Level 14

Hi @RAGC 

You can try the below code.

The array function enables to perform the same manipulation on multiple variables.

Hope this helps!

Best,

data have;
	infile datalines dlm="," dsd missover;
	input ID Disease_1_0 $ Disease_2_0 $ Disease_3_0 $ Disease_4_0 $;
	datalines;
1,C001,B005,,
2,D55, ,C97,
3,,C97,,,
4,K00,,,,
5,,D57,,,
;
run;

data want;
	set have;
	array _Disease (*) Disease_:;
	cancer=0;
	do i=1 to dim(_Disease);
		if prxmatch('/C\d+/',_Disease(i))>0 then cancer=1; /* \d+ mean 1 digit or more */
	end;
	drop i;
run;

Capture d’écran 2020-05-02 à 20.34.14.png

RAGC
Obsidian | Level 7

Hi @ed_sas_member,

 

My sincerest apologies for the duplicate posting! I realized I posted in the wrong section, and thought I had deleted it before posting it in here (which I hope is the appropriate section). 

 

Your code worked super well! Thank you so much! If you would indulge me for one step further that I am struggling with. I see that you mentioned the d+ is for anything greater than 1 digit. If there a way to specify a range of the values (e.g. C001 to C500) even though there are up to C969 available characters?.

 

Best,
Rebecca

ed_sas_member
Meteorite | Level 14

you’re welcome !
You can mention a range as follows in the prxmatch function:
/C[0-500]/

If the ICD code must have 3 digits following the C letter, you can add this check :
AND prxmatch(‘/C\d\d\d/‘ ...)

if prxmatch('/C\d\d\d/',_Disease(i))>0 and prxmatch('/C[0-500]/',_Disease(i))>0  then cancer=1;
RAGC
Obsidian | Level 7

Thanks @ed_sas_member  for all your help!

 

Thanks @Kurt_Bremser  for the merge.

 

Best,
Rebecca

ed_sas_member
Meteorite | Level 14
You’re welcome Rebecca!
Have a wonderful day,
Best,
novinosrin
Tourmaline | Level 20

Sorry @RAGC  Late to the party. Anyways, for what it's worth


data have;
	infile datalines dlm="," dsd missover;
	input ID Disease_1_0 $ Disease_2_0 $ Disease_3_0 $ Disease_4_0 $;
	datalines;
1,C001,B005,,
2,D55, ,C97,
3,,C97,,,
4,K00,,,,
5,,D57,,,
;
run;


data want;
 set have;
 array c(969) _temporary_  (1:969);
 array t Disease_1_0--Disease_4_0;
 do _n_=1 to dim(t) until(cancer=1);
  cancer= first(t(_n_))='C' and input(compress(t(_n_),,'kd'),best.) in c;
 end;
run;

proc print noobs;run;

ID Disease_1_0 Disease_2_0 Disease_3_0 Disease_4_0 cancer
1 C001 B005     1
2 D55   C97   1
3   C97     1
4 K00       0
5   D57     0
novinosrin
Tourmaline | Level 20

Oops!, I took the list as 1:969 where I missed to start from 0 i.e. 0:969. 

Correction:


data have;
	infile datalines dlm="," dsd missover;
	input ID Disease_1_0 $ Disease_2_0 $ Disease_3_0 $ Disease_4_0 $;
	datalines;
1,C000,B005,,
2,D55, ,C97,
3,,C97,,,
4,K00,,,,
5,,D57,,,
;
run;


data want;
 set have;
 array c(000:969) _temporary_  (0:969);
 array t Disease_1_0--Disease_4_0;
 do _n_=1 to dim(t) until(cancer=1);
  cancer= first(t(_n_))='C' and input(compress(t(_n_),,'kd'),best.) in c;
 end;
run;

proc print noobs;run;

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 10 replies
  • 2849 views
  • 2 likes
  • 4 in conversation