Hello,
I am trying to compute a new variable based on the character values in the 6th and 7th position of an existing column. For example, I'd like to try to compute the column Newcode from CCIcode:
PatientID CCIcode Newcode
1 1GR89QB 1GR89O
2 1AB76EA 1AB76E
3 1CG57KZ 1CG57O
4 1CG57KS 1CG57E
Essentially, if position 6 & 7 of the variable CCIcode are AA-KS, then I'd like position 6 of Newcode to be "E". If position 6 & 7 of the variable CCIcode are KZ-XY (there is nothing inbetween KS to KZ) then I'd like position 6 of Newcode to be "O".
I've tried a few variations of substr code, but I don't know how to specify the character ranges without having to type the letters individually. Any help here would be much much appreciated!
Thanks!
How is the range AA-KS defined?
How is the range AA-KS defined?
data have;
input PatientID CCIcode :$15.;
cards;
1 1GR89QB 1GR89O
2 1AB76EA 1AB76E
3 1CG57KZ 1CG57O
4 1CG57KS 1CG57E
;
data want;
set have;
k=substr(CCIcode,6,2);
if 'AA'<=k<='KS' then newcode=cats(substr(CCIcode,1,5),'E');
else if 'KZ'<=k<='XY' then newcode=cats(substr(CCIcode,1,5),'O');
drop k;
run;
Thank-you so much - this is exactly what I was looking for! Works perfectly!
You can also use a PRX function to look for specific patterns:
data want;
set have;
if prxmatch('/^.{5}[A-J][A-Z]$|^.{5}K[A-S]$/',strip(CCIcode))
then Newcode = prxchange('s/[A-J][A-Z]$|K[A-S]$/E/',1,strip(CCIcode));
else if prxmatch('/^.{5}KZ$|^.{5}[L-W][A-Z]$|^.{5}X[A-Y]$/',strip(CCIcode))
then Newcode = prxchange('s/KZ$|[L-W][A-Z]$|X[A-Y]$/O/',1,strip(CCIcode));
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.