BookmarkSubscribeRSS Feed
sks521
Quartz | Level 8

Hi folks,

 

I have variable with ICD-10 codes. I want to group them as below;

 

'A000'-'A099'='Intestinal_infection'
'A150'-'A199'='TB'
'A200'-'A280'='Zoonotic'
'A300'-'A499'='Other_bacterial'
'A500'-'A640'='STI'
'A650'-'A699'='Spirochaetal'
'A800'-'A890'='Viral_CNS'
'A900'-'A990'='Viral-fever'
'B000'-'B099'='Viral_skin'
'B150'-'B199'='Viral_hep'
'B200'-'B240'='HIV'
'B250'-'B349'='Other_viral'
'B350'-'B490'='Mycoses'
'B500'-'B640'='Protozoa'
'B650'-'B839'='Helminthiases'
'B850'-'B890'='Other_infestation'
'B900'-'B949'='Sequelae'
'B950'-'B990'= 'Other'
'C000'-'C970'= 'Neoplasms'
'D000'-'D489'='Other_neoplasms'
'D500'-'D899'='Blood_disorder'
'E000'-'E899'='Endocrine'
'F000'-'F990'='Mental_disorder'
'G000'-'G998'='Nervous'
'H000'-'H599'='Eye_adnexa'
'H600'-'H959'='Ear_mastoid'
'1000'-'1990'='Circulatory'
'J000'-'J998'='Respiratory'
'K000'-'K938'='Digestive'
'L000'-'L998'='Subcutanous'
'M000'-'M999'='MSK'
'N000'-'N999'='Genitourinary'
'O000'-'O998'='Pregnancy'
'P000'-'P969'='Perinatal'
'Q000'-'Q999'='Congenital'
'R000'-'R999'='Clinical_laboratory'
'S0000'-'S9999'='Injuries'
'T0000'-'T983'='Multiple_injuries'
'V010'-'V990'='Transport_accident'
'W000'-'W990'='Other_accidents'
'X000'-'X840'='Other_injuries'
'Y100'-'Y989'='Sequelae'
'Z000'-'Z999'='Potential_hazard'
other='Ungrouped'

 

What's the best way of indexing them in these groups? I have used 'Format' syntax but this is not how I want it to be.

 

Ta

12 REPLIES 12
PeterClemmensen
Tourmaline | Level 20

Just to be clear, you do not want to do this with formats?

sks521
Quartz | Level 8

No, because when I do it with formats, I can't create new variables using these categories. 

 For e.g. I used this syntax and tried to create another variable 'Regroup' from old variable 'New_code' in the next data step, 'Regroup prints out blank.

*indexing IDC-10 codes/grouping them in main categories;
proc format;
value $ICD
'A000'-'A099'='Intestinal_infection'
'A150'-'A199'='TB'
'A200'-'A280'='Zoonotic'
'A300'-'A499'='Other_bacterial'
'A500'-'A640'='STI'
'A650'-'A699'='Spirochaetal'
'A800'-'A890'='Viral_CNS'
'A900'-'A990'='Viral-fever'
'B000'-'B099'='Viral_skin'
'B150'-'B199'='Viral_hep'
'B200'-'B240'='HIV'
'B250'-'B349'='Other_viral'
'B350'-'B490'='Mycoses'
'B500'-'B640'='Protozoa'
'B650'-'B839'='Helminthiases'
'B850'-'B890'='Other_infestation'
'B900'-'B949'='Sequelae'
'B950'-'B990'= 'Other'
'C000'-'C970'= 'Neoplasms'
'D000'-'D489'='Other_neoplasms'
'D500'-'D899'='Blood_disorder'
'E000'-'E899'='Endocrine'
'F000'-'F990'='Mental_disorder'
'G000'-'G998'='Nervous'
'H000'-'H599'='Eye_adnexa'
'H600'-'H959'='Ear_mastoid'
'1000'-'1990'='Circulatory'
'J000'-'J998'='Respiratory'
'K000'-'K938'='Digestive'
'L000'-'L998'='Subcutanous'
'M000'-'M999'='MSK'
'N000'-'N999'='Genitourinary'
'O000'-'O998'='Pregnancy'
'P000'-'P969'='Perinatal'
'Q000'-'Q999'='Congenital'
'R000'-'R999'='Clinical_laboratory'
'S0000'-'S9999'='Injuries'
'T0000'-'T983'='Multiple_injuries'
'V010'-'V990'='Transport_accident'
'W000'-'W990'='Other_accidents'
'X000'-'X840'='Other_injuries'
'Y100'-'Y989'='Sequelae'
'Z000'-'Z999'='Potential_hazard'
other='Ungrouped'
;
run;
data ICD10;
set specialty;
New_code=diagnosisprimaryicd;
format New_code ICD.;
run;
*Regrouping ICD codes into main groups on WHO website here https://icd.who.int/browse10/2016/en;
Data recode_ICD;
set ICD10;
if New_code=' ' then Regroup=' ';
else if New_code in ('Intestinal_infection','TB','Zoonotic','Other_bacterial','STI','Spirochaetal','Viral_CNS','Viral-fever','Viral_skin','Viral_hep','HIV','Other_viral',
'Mycoses','Protozoa','Helminthiases','Other_infestation','Sequelae', 'Other') then Regroup='Infections';
else if New_code in ('Neoplasms','Other_neoplasms') then Regroup='Neoplasm';
else if New_code in ('Blood_disorder') then Regroup='Blood_dis';
else if New_code in ('Endocrine') then Regroup='Endocrine';
else if New_code in ('Mental_disorder') then Regroup='Mental_dis';
else if New_code in ('Nervous') then Regroup='Nervous';
else if New_code in ('Eye_adnexa') then Regroup='Adnexa';
else if New_code in ('Ear_mastoid') then Regroup='Ear_mast';
else if New_code in ('Circulatory') then Regroup='Circulatory';
else if New_code in ('Respiratory') then Regroup='Respiration';
else if New_code in ('Digestive') then Regroup='Digestive';
else if New_code in ('Subcutanous') then Regroup='Subcutanous';
else if New_code in ('MSK') then Regroup='MSK';
else if New_code in ('Genitourinary') then Regroup='Genitourinary';
else if New_code in ('Pregnancy') then Regroup='Pregnancy';
else if New_code in ('Perinatal') then Regroup='Perinatal';
else if New_code in ('Congenital') then Regroup='Congenital';
else if New_code in ('Clinical_laboratory') then Regroup='Clinical_laboratory';
else if New_code in ('Injuries','Multiple_injuries','Transport_accident','Other_accidents','Other_injuries','Sequelae') then Regroup='Injuries';
else if New_code in ('Potential_hazard','Ungrouped') then Regroup='Other';
run;
unison
Lapis Lazuli | Level 10

Formatting new_code will just change the appearance of diagnosisprimaryicd but the underlying value is still the same. So when you try to regroup, nothing happens because the underlying value is still there. Instead, you can define new_code in the following way and your logic should work.

 

Try

new_code=put(diagnosisprimaryicd,$icd.);

 

-unison
sks521
Quartz | Level 8

Whereabouts in the code do I put this please?

 

unison
Lapis Lazuli | Level 10
data ICD10;
	set specialty;
	New_code=put(diagnosisprimaryicd, $icd.);

	if New_code=' ' then
		Regroup=' ';
	else if New_code in ('Intestinal_infection', 'TB', 'Zoonotic', 
		'Other_bacterial', 'STI', 'Spirochaetal', 'Viral_CNS', 'Viral-fever', 
		'Viral_skin', 'Viral_hep', 'HIV', 'Other_viral', 'Mycoses', 'Protozoa', 
		'Helminthiases', 'Other_infestation', 'Sequelae', 'Other') then
			Regroup='Infections';
	else if New_code in ('Neoplasms', 'Other_neoplasms') then
		Regroup='Neoplasm';
	else if New_code in ('Blood_disorder') then
		Regroup='Blood_dis';
	else if New_code in ('Endocrine') then
		Regroup='Endocrine';
	else if New_code in ('Mental_disorder') then
		Regroup='Mental_dis';
	else if New_code in ('Nervous') then
		Regroup='Nervous';
	else if New_code in ('Eye_adnexa') then
		Regroup='Adnexa';
	else if New_code in ('Ear_mastoid') then
		Regroup='Ear_mast';
	else if New_code in ('Circulatory') then
		Regroup='Circulatory';
	else if New_code in ('Respiratory') then
		Regroup='Respiration';
	else if New_code in ('Digestive') then
		Regroup='Digestive';
	else if New_code in ('Subcutanous') then
		Regroup='Subcutanous';
	else if New_code in ('MSK') then
		Regroup='MSK';
	else if New_code in ('Genitourinary') then
		Regroup='Genitourinary';
	else if New_code in ('Pregnancy') then
		Regroup='Pregnancy';
	else if New_code in ('Perinatal') then
		Regroup='Perinatal';
	else if New_code in ('Congenital') then
		Regroup='Congenital';
	else if New_code in ('Clinical_laboratory') then
		Regroup='Clinical_laboratory';
	else if New_code in ('Injuries', 'Multiple_injuries', 'Transport_accident', 
		'Other_accidents', 'Other_injuries', 'Sequelae') then
			Regroup='Injuries';
	else if New_code in ('Potential_hazard', 'Ungrouped') then
		Regroup='Other';
run;
-unison
sks521
Quartz | Level 8

Still prints out blank!

unison
Lapis Lazuli | Level 10

Without more info about what you're data looks like, I can't really help you. The reasons it could be printing blank are uncountable. What I've done here works. See if you can adapt to how you need.

proc format cntlout=icdgrpfmt(keep=fmtname start end label hlo 
		where=(fmtname='ICDGRP'));
	value $icdgrp 'A000'-'A099'='Intestinal_infection' 'A150'-'A199'='TB' 
		'A200'-'A280'='Zoonotic' 'A300'-'A499'='Other_bacterial' 'A500'-'A640'='STI' 
		'A650'-'A699'='Spirochaetal' 'A800'-'A890'='Viral_CNS' 
		'A900'-'A990'='Viral-fever' 'B000'-'B099'='Viral_skin' 
		'B150'-'B199'='Viral_hep' 'B200'-'B240'='HIV' 'B250'-'B349'='Other_viral' 
		'B350'-'B490'='Mycoses' 'B500'-'B640'='Protozoa' 
		'B650'-'B839'='Helminthiases' 'B850'-'B890'='Other_infestation' 
		'B900'-'B949', 'Y100'-'Y989'='Sequelae' 'B950'-'B990'='Other' 
		'C000'-'C970'='Neoplasms' 'D000'-'D489'='Other_neoplasms' 
		'D500'-'D899'='Blood_disorder' 'E000'-'E899'='Endocrine' 
		'F000'-'F990'='Mental_disorder' 'G000'-'G998'='Nervous' 
		'H000'-'H599'='Eye_adnexa' 'H600'-'H959'='Ear_mastoid' 
		'1000'-'1990'='Circulatory' 'J000'-'J998'='Respiratory' 
		'K000'-'K938'='Digestive' 'L000'-'L998'='Subcutanous' 'M000'-'M999'='MSK' 
		'N000'-'N999'='Genitourinary' 'O000'-'O998'='Pregnancy' 
		'P000'-'P969'='Perinatal' 'Q000'-'Q999'='Congenital' 
		'R000'-'R999'='Clinical_laboratory' 'S0000'-'S9999'='Injuries' 
		'T0000'-'T983'='Multiple_injuries' 'V010'-'V990'='Transport_accident' 
		'W000'-'W990'='Other_accidents' 'X000'-'X840'='Other_injuries' 
		'Z000'-'Z999'='Potential_hazard' other='Ungrouped';
run;

data icdregrp;
	retain fmtname '$rgrp' type 'c';
	set icdgrpfmt(drop=start end fmtname rename=label=old_label);
	start=old_label;
	length label $50.;

	if start=' ' then
		label=' ';
	else if start in ('Intestinal_infection', 'TB', 'Zoonotic', 'Other_bacterial', 
		'STI', 'Spirochaetal', 'Viral_CNS', 'Viral-fever', 'Viral_skin', 'Viral_hep', 
		'HIV', 'Other_viral', 'Mycoses', 'Protozoa', 'Helminthiases', 
		'Other_infestation', 'Sequelae', 'Other') then
			label='Infections';
	else if start in ('Neoplasms', 'Other_neoplasms') then
		label='Neoplasm';
	else if start in ('Blood_disorder') then
		label='Blood_dis';
	else if start in ('Endocrine') then
		label='Endocrine';
	else if start in ('Mental_disorder') then
		label='Mental_dis';
	else if start in ('Nervous') then
		label='Nervous';
	else if start in ('Eye_adnexa') then
		label='Adnexa';
	else if start in ('Ear_mastoid') then
		label='Ear_mast';
	else if start in ('Circulatory') then
		label='Circulatory';
	else if start in ('Respiratory') then
		label='Respiration';
	else if start in ('Digestive') then
		label='Digestive';
	else if start in ('Subcutanous') then
		label='Subcutanous';
	else if start in ('MSK') then
		label='MSK';
	else if start in ('Genitourinary') then
		label='Genitourinary';
	else if start in ('Pregnancy') then
		label='Pregnancy';
	else if start in ('Perinatal') then
		label='Perinatal';
	else if start in ('Congenital') then
		label='Congenital';
	else if start in ('Clinical_laboratory') then
		label='Clinical_laboratory';
	else if start in ('Injuries', 'Multiple_injuries', 'Transport_accident', 
		'Other_accidents', 'Other_injuries') then
			label='Injuries';
	else if start in ('Potential_hazard', 'Ungrouped') then
		label='Other';
	drop old_label;
run;

proc sort data=icdregrp nodupkey out=regrp;
	by start;
run;

proc format cntlin=regrp;
run;

data have;
	input icd_cd $;
	datalines;
A096
B955
;
run;

data want;
	set have;
	icd_grp_fmt=put(icd_cd, $icdgrp.);
	icd_regrp_fmt=put(icd_grp_fmt, $rgrp.);
run;
-unison
ballardw
Super User

@sks521 wrote:

No, because when I do it with formats, I can't create new variables using these categories. 

 


 

Incorrect:

 

Newvar = put(variable, formatname.);

will create a new variable assuming the existence of the variable and the format.

 

Note: Comparisons with character values and ranges is an extremely problematic activity in general.

 

Example:

data example;
  x = 'A12';
  if 'A100' le x le 'A150' then put x= " is between A100 and A150";
run;

Run the code, read the log and decide if that is the desired result. The character comparisons stop at the end the shortest variable

Reeza
Super User

@sks521 wrote:

No, because when I do it with formats, I can't create new variables using these categories. 

 For e.g. I used this syntax and tried to create another variable 'Regroup' from old variable 'New_code' in the next data step, 'Regroup prints out blank.

You're not using the formats correctly.

To apply a format to a new variable and access the formatted value you need to use PUT().

New_ICD_Code = put(diagnosisPrimaryICD, $ICD.);

Then use your IF/THEN statements from your last step. Or create a second format that does the groupings as well and then you have:

Regroup = put(New_ICD_Code, $regroup_fmt.);

 

unison
Lapis Lazuli | Level 10

It's not clear what you mean by 'this is not how I want it to be'..

 

Proc format is definitely the way to go. Now if you'd rather have an associated index (like 2) instead or an associated character label like 'Zoonotic', then you can try the following:

proc format cntlout=icdgrpfmt(keep=start end label hlo);
	value $icdgrp 'A000'-'A099'='Intestinal_infection' 'A150'-'A199'='TB' 
		'A200'-'A280'='Zoonotic' 'A300'-'A499'='Other_bacterial' 'A500'-'A640'='STI' 
		'A650'-'A699'='Spirochaetal' 'A800'-'A890'='Viral_CNS' 
		'A900'-'A990'='Viral-fever' 'B000'-'B099'='Viral_skin' 
		'B150'-'B199'='Viral_hep' 'B200'-'B240'='HIV' 'B250'-'B349'='Other_viral' 
		'B350'-'B490'='Mycoses' 'B500'-'B640'='Protozoa' 
		'B650'-'B839'='Helminthiases' 'B850'-'B890'='Other_infestation' 
		'B900'-'B949'='Sequelae' 'B950'-'B990'='Other' 'C000'-'C970'='Neoplasms' 
		'D000'-'D489'='Other_neoplasms' 'D500'-'D899'='Blood_disorder' 
		'E000'-'E899'='Endocrine' 'F000'-'F990'='Mental_disorder' 
		'G000'-'G998'='Nervous' 'H000'-'H599'='Eye_adnexa' 
		'H600'-'H959'='Ear_mastoid' '1000'-'1990'='Circulatory' 
		'J000'-'J998'='Respiratory' 'K000'-'K938'='Digestive' 
		'L000'-'L998'='Subcutanous' 'M000'-'M999'='MSK' 'N000'-'N999'='Genitourinary' 
		'O000'-'O998'='Pregnancy' 'P000'-'P969'='Perinatal' 
		'Q000'-'Q999'='Congenital' 'R000'-'R999'='Clinical_laboratory' 
		'S0000'-'S9999'='Injuries' 'T0000'-'T983'='Multiple_injuries' 
		'V010'-'V990'='Transport_accident' 'W000'-'W990'='Other_accidents' 
		'X000'-'X840'='Other_injuries' 'Y100'-'Y989'='Sequelae' 
		'Z000'-'Z999'='Potential_hazard' other='Ungrouped';
run;

data icdgrpidx;
	retain fmtname 'icdidx' type 'i';
	set icdgrpfmt;
	label=_n_;
run;

proc format cntlin=icdgrpidx;
run;

data have;
	input icd_cd $;
	datalines;
A096
B955
;
run;

data want;
	set have;
	icd_grp_fmt=put(icd_cd, $icdgrp.);
	icd_idx_fmt=input(icd_cd, icdidx.);
run;
-unison
sks521
Quartz | Level 8

Because when I try to create another variable for example if I want to combine all those as one category 'Infections', I am not able to do that or do you have any suggestion?

'A000'-'A099'='Intestinal_infection' 'A150'-'A199'='TB' 
		'A200'-'A280'='Zoonotic' 'A300'-'A499'='Other_bacterial' 'A500'-'A640'='STI' 
		'A650'-'A699'='Spirochaetal' 'A800'-'A890'='Viral_CNS' 
		'A900'-'A990'='Viral-fever' 'B000'-'B099'='Viral_skin' 
		'B150'-'B199'='Viral_hep' 'B200'-'B240'='HIV' 'B250'-'B349'='Other_viral' 
		'B350'-'B490'='Mycoses' 'B500'-'B640'='Protozoa' 
		'B650'-'B839'='Helminthiases' 'B850'-'B890'='Other_infestation'  

 

ballardw
Super User

@sks521 wrote:

Because when I try to create another variable for example if I want to combine all those as one category 'Infections', I am not able to do that or do you have any suggestion?

'A000'-'A099'='Intestinal_infection' 'A150'-'A199'='TB' 
		'A200'-'A280'='Zoonotic' 'A300'-'A499'='Other_bacterial' 'A500'-'A640'='STI' 
		'A650'-'A699'='Spirochaetal' 'A800'-'A890'='Viral_CNS' 
		'A900'-'A990'='Viral-fever' 'B000'-'B099'='Viral_skin' 
		'B150'-'B199'='Viral_hep' 'B200'-'B240'='HIV' 'B250'-'B349'='Other_viral' 
		'B350'-'B490'='Mycoses' 'B500'-'B640'='Protozoa' 
		'B650'-'B839'='Helminthiases' 'B850'-'B890'='Other_infestation'  

 


You really need to provide some example data and the desired result.

If you want a result of INFECTIONS then you want a different format.

The same values can be mapped to more than value using multiple formats.

For example I have formats that assign ZIP code to various service regions besides the more typical City, State or County. Change the format to the one I want for a specific purpose.

 

And for some reports using the procedures that support MULTILABEL Formats I have formats that will use a Site identifier to produce summaries by Site as well as group of sites (region, agency  or what have you).

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 12 replies
  • 1358 views
  • 0 likes
  • 5 in conversation