BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
bretthouston
Obsidian | Level 7

Hello,

 

I am trying to compute a new variable based on the character values in the 6th and 7th position of an existing column. For example, I'd like to try to compute the column Newcode from CCIcode:

 

PatientID    CCIcode     Newcode

1                1GR89QB    1GR89O

2                1AB76EA     1AB76E

3                1CG57KZ    1CG57O

4                1CG57KS    1CG57E

 

Essentially, if position 6 & 7 of the variable CCIcode are AA-KS, then I'd like position 6 of Newcode to be "E". If position 6 & 7 of the variable CCIcode are KZ-XY (there is nothing inbetween KS to KZ) then I'd like position 6 of Newcode to be "O".

 

I've tried a few variations of substr code, but I don't know how to specify the character ranges without having to type the letters individually. Any help here would be much much appreciated!

 

Thanks!

1 ACCEPTED SOLUTION

Accepted Solutions
4 REPLIES 4
PeterClemmensen
Tourmaline | Level 20

How is the range AA-KS defined?

novinosrin
Tourmaline | Level 20

data have;
input PatientID   CCIcode :$15.;
cards;
1                1GR89QB    1GR89O
2                1AB76EA     1AB76E
3                1CG57KZ    1CG57O
4                1CG57KS    1CG57E
;

data want;
set have;
k=substr(CCIcode,6,2);
if  'AA'<=k<='KS' then newcode=cats(substr(CCIcode,1,5),'E');
else if 'KZ'<=k<='XY' then newcode=cats(substr(CCIcode,1,5),'O');
drop k;
run;
bretthouston
Obsidian | Level 7

Thank-you so much - this is exactly what I was looking for! Works perfectly!

ed_sas_member
Meteorite | Level 14

Hi @bretthouston 

 

You can also use a PRX function to look for specific patterns:

data want;
	set have;
	if prxmatch('/^.{5}[A-J][A-Z]$|^.{5}K[A-S]$/',strip(CCIcode))
		then Newcode = prxchange('s/[A-J][A-Z]$|K[A-S]$/E/',1,strip(CCIcode));
	else if prxmatch('/^.{5}KZ$|^.{5}[L-W][A-Z]$|^.{5}X[A-Y]$/',strip(CCIcode))
		then Newcode = prxchange('s/KZ$|[L-W][A-Z]$|X[A-Y]$/O/',1,strip(CCIcode));
run;