SAS Programming

DATA Step, Macro, Functions and more
BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
RAGC
Obsidian | Level 7

Hi, 

 

I am working on a large dataset that lists all follow-up conditions in a range of columns using ICD10 codes. ICD10 codes thankfully have a similar pattern where there is a letter and then a range of number for similar disorders (e.g. Cancer is C000 to C969). I want to go through each column and create a new variable if the person has had one of those types of disorders. I have provided an example below. Disease_X_0 is the column with the ICD10 code, and Cancer is the new variable I would like to create.   

 

IDDisease_1_0Disease_2_0Disease_3_0Disease_4_0Cancer
1C001B005  1
2D55 C97 1
3 C97  1
4K00   0
5 D57  0

 

I feel like there is an easy way to do this but cannot seem to find it with all my googling! Thank you so much in advance for your time! 

1 ACCEPTED SOLUTION

Accepted Solutions
ed_sas_member
Meteorite | Level 14

you’re welcome !
You can mention a range as follows in the prxmatch function:
/C[0-500]/

If the ICD code must have 3 digits following the C letter, you can add this check :
AND prxmatch(‘/C\d\d\d/‘ ...)

if prxmatch('/C\d\d\d/',_Disease(i))>0 and prxmatch('/C[0-500]/',_Disease(i))>0  then cancer=1;

View solution in original post

10 REPLIES 10
RAGC
Obsidian | Level 7

Hi, 

 

I am working on a large dataset that lists all follow-up conditions in a range of columns using ICD10 codes. ICD10 codes thankfully have a similar pattern where there is a letter and then a range of number for similar disorders (e.g. Cancer is C000 to C969). I want to go through each column and create a new variable if the person has had one of those types of disorders. I have provided an example below. Disease_X_0 is the column with the ICD10 code, and Cancer is the new variable I would like to create.   

 

IDDisease_1_0Disease_2_0Disease_3_0Disease_4_0Cancer
1C001B005  1
2D55 C97 1
3 C97  1
4K00   0
5 D57  0

 

I feel like there is an easy way to do this but cannot seem to find it with all my googling! Thank you so much in advance for your time! 

ed_sas_member
Meteorite | Level 14

Hi @RAGC 

You can try the below code. Please avoid duplicate posts 😉

The array function enables to perform the same manipulation on multiple variables.

Hope this helps!

Best,

data have;
	infile datalines dlm="," dsd missover;
	input ID Disease_1_0 $ Disease_2_0 $ Disease_3_0 $ Disease_4_0 $;
	datalines;
1,C001,B005,,
2,D55, ,C97,
3,,C97,,,
4,K00,,,,
5,,D57,,,
;
run;

data want;
	set have;
	array _Disease (*) Disease_:;
	cancer=0;
	do i=1 to dim(_Disease);
		if prxmatch('/C\d+/',_Disease(i))>0 then cancer=1; /* \d+ mean 1 digit or more */
	end;
	drop i;
run;

 

 

ed_sas_member
Meteorite | Level 14

Hi @RAGC 

You can try the below code.

The array function enables to perform the same manipulation on multiple variables.

Hope this helps!

Best,

data have;
	infile datalines dlm="," dsd missover;
	input ID Disease_1_0 $ Disease_2_0 $ Disease_3_0 $ Disease_4_0 $;
	datalines;
1,C001,B005,,
2,D55, ,C97,
3,,C97,,,
4,K00,,,,
5,,D57,,,
;
run;

data want;
	set have;
	array _Disease (*) Disease_:;
	cancer=0;
	do i=1 to dim(_Disease);
		if prxmatch('/C\d+/',_Disease(i))>0 then cancer=1; /* \d+ mean 1 digit or more */
	end;
	drop i;
run;

Capture d’écran 2020-05-02 à 20.34.14.png

RAGC
Obsidian | Level 7

Hi @ed_sas_member,

 

My sincerest apologies for the duplicate posting! I realized I posted in the wrong section, and thought I had deleted it before posting it in here (which I hope is the appropriate section). 

 

Your code worked super well! Thank you so much! If you would indulge me for one step further that I am struggling with. I see that you mentioned the d+ is for anything greater than 1 digit. If there a way to specify a range of the values (e.g. C001 to C500) even though there are up to C969 available characters?.

 

Best,
Rebecca

ed_sas_member
Meteorite | Level 14

you’re welcome !
You can mention a range as follows in the prxmatch function:
/C[0-500]/

If the ICD code must have 3 digits following the C letter, you can add this check :
AND prxmatch(‘/C\d\d\d/‘ ...)

if prxmatch('/C\d\d\d/',_Disease(i))>0 and prxmatch('/C[0-500]/',_Disease(i))>0  then cancer=1;
RAGC
Obsidian | Level 7

Thanks @ed_sas_member  for all your help!

 

Thanks @Kurt_Bremser  for the merge.

 

Best,
Rebecca

ed_sas_member
Meteorite | Level 14
You’re welcome Rebecca!
Have a wonderful day,
Best,
novinosrin
Tourmaline | Level 20

Sorry @RAGC  Late to the party. Anyways, for what it's worth


data have;
	infile datalines dlm="," dsd missover;
	input ID Disease_1_0 $ Disease_2_0 $ Disease_3_0 $ Disease_4_0 $;
	datalines;
1,C001,B005,,
2,D55, ,C97,
3,,C97,,,
4,K00,,,,
5,,D57,,,
;
run;


data want;
 set have;
 array c(969) _temporary_  (1:969);
 array t Disease_1_0--Disease_4_0;
 do _n_=1 to dim(t) until(cancer=1);
  cancer= first(t(_n_))='C' and input(compress(t(_n_),,'kd'),best.) in c;
 end;
run;

proc print noobs;run;

ID Disease_1_0 Disease_2_0 Disease_3_0 Disease_4_0 cancer
1 C001 B005     1
2 D55   C97   1
3   C97     1
4 K00       0
5   D57     0
novinosrin
Tourmaline | Level 20

Oops!, I took the list as 1:969 where I missed to start from 0 i.e. 0:969. 

Correction:


data have;
	infile datalines dlm="," dsd missover;
	input ID Disease_1_0 $ Disease_2_0 $ Disease_3_0 $ Disease_4_0 $;
	datalines;
1,C000,B005,,
2,D55, ,C97,
3,,C97,,,
4,K00,,,,
5,,D57,,,
;
run;


data want;
 set have;
 array c(000:969) _temporary_  (0:969);
 array t Disease_1_0--Disease_4_0;
 do _n_=1 to dim(t) until(cancer=1);
  cancer= first(t(_n_))='C' and input(compress(t(_n_),,'kd'),best.) in c;
 end;
run;

proc print noobs;run;

sas-innovate-white.png

Special offer for SAS Communities members

Save $250 on SAS Innovate and get a free advance copy of the new SAS For Dummies book! Use the code "SASforDummies" to register. Don't miss out, May 6-9, in Orlando, Florida.

 

View the full agenda.

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 10 replies
  • 2340 views
  • 2 likes
  • 4 in conversation