Hello,
I am trying to compute a new variable based on the character values in the 6th and 7th position of an existing column. For example, I'd like to try to compute the column Newcode from CCIcode:
PatientID CCIcode Newcode
1 1GR89QB 1GR89O
2 1AB76EA 1AB76E
3 1CG57KZ 1CG57O
4 1CG57KS 1CG57E
Essentially, if position 6 & 7 of the variable CCIcode are AA-KS, then I'd like position 6 of Newcode to be "E". If position 6 & 7 of the variable CCIcode are KZ-XY (there is nothing inbetween KS to KZ) then I'd like position 6 of Newcode to be "O".
I've tried a few variations of substr code, but I don't know how to specify the character ranges without having to type the letters individually. Any help here would be much much appreciated!
Thanks!
How is the range AA-KS defined?
How is the range AA-KS defined?
data have;
input PatientID CCIcode :$15.;
cards;
1 1GR89QB 1GR89O
2 1AB76EA 1AB76E
3 1CG57KZ 1CG57O
4 1CG57KS 1CG57E
;
data want;
set have;
k=substr(CCIcode,6,2);
if 'AA'<=k<='KS' then newcode=cats(substr(CCIcode,1,5),'E');
else if 'KZ'<=k<='XY' then newcode=cats(substr(CCIcode,1,5),'O');
drop k;
run;
Thank-you so much - this is exactly what I was looking for! Works perfectly!
You can also use a PRX function to look for specific patterns:
data want;
set have;
if prxmatch('/^.{5}[A-J][A-Z]$|^.{5}K[A-S]$/',strip(CCIcode))
then Newcode = prxchange('s/[A-J][A-Z]$|K[A-S]$/E/',1,strip(CCIcode));
else if prxmatch('/^.{5}KZ$|^.{5}[L-W][A-Z]$|^.{5}X[A-Y]$/',strip(CCIcode))
then Newcode = prxchange('s/KZ$|[L-W][A-Z]$|X[A-Y]$/O/',1,strip(CCIcode));
run;
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.