I have a patient unique ID, say "CT00001A" and I want to separate the prefix "CT" from the serial number "00001" and the suffix "A". Here is a program I used
DATA TEST;
SET sasuser.CIHP_RADET;
B = COMPRESS (A,"","A");
C = COMPRESS (A,B); RUN;
PROC PRINT DATA = TEST;
RUN;
and it returned 00001 in one column and CTA in another. Now I want to separate CT from A in 2 separate columns
Please try this:
data new;
infile datalines;
input x $15.;
k1 = anydigit (x);
A = substr (x, 1, k1-1);
temp = substr(x,k1);
k2 = anyalpha (temp);
Num = substr(temp, 1, k2-1);
B = substr(temp, k2);
keep x A Num B;
datalines;
CT00010A
AMOPI00066
CT00069DV
CTPM00069B
OPEI00079DV
;
What are the rules?
Why can't you just substring the first two characters using SUBSTR()?
Obviously this will need to be used on multiple observations so is it always the first two characters? Or first set of letters? How do you know A needs to be removed? What if it was "CT000B1A" what would be kept or is that not possible?
Alright, here is a better explanation.
I have attached 2 SAS OUTPUT pictures and 2 Excel pictures of the dataset, the variables "prefix", " serial number" and "suffix" were meant to concatenate as "Patient Unique ID", but a data clerk erroneously copied the wrong thing.
The A or B after "CT00001A" is the suffix.
Thank you.
Your rules apparently expect the original value to have a alphabetic prefix, a numerical middle and an alphabetic suffix, in which each component can have a varying length. You can use SCAN to get the prefix and suffix, when you declare all digits as delimitrs. And the middle is nothing but the original string with all alphabetic characters removed (i.e. compressed out):
data _null_;
string='CTX00001A';
str_prefix=scan(string,1,,'d');
str_middle=compress(string,,'A');
str_suffix=scan(string,2,,'d');
put (str:) (=);
run;
Many thanks, but I am actually trying this on a variable with several observations. Here is a short copy of it:
The SAS System |
Obs | A |
10 | CT00010A |
54 | AMOPI00066 |
55 | CT00069DV |
56 | CTPM00069B |
67 | OPEI00079DV |
68 | PM00080A |
69 | PM00080C |
70 | TI00080A |
71 | PM00080b |
72 | CT00081B |
73 | CT00081A |
75 | PM00084A |
76 | TI00084B |
77 | CT00085A |
78 | CT00085B |
79 | PM00086A |
80 | PM00086B |
81 | PM00088A |
82 | PM00088B |
88 | PM00092A |
89 | PM00092B |
101 | TI00117B |
102 | TI00117A |
179 | PM02414 |
200 | PI03049A |
201 | PI03049B |
@Ken_nuga wrote:
Many thanks, but I am actually trying this on a variable with several observations. Here is a short copy of it:
It doesn't matter how many observations, the code is the same. You need to reference your data set and your variables, but the idea is correct.
Please try this:
data new;
infile datalines;
input x $15.;
k1 = anydigit (x);
A = substr (x, 1, k1-1);
temp = substr(x,k1);
k2 = anyalpha (temp);
Num = substr(temp, 1, k2-1);
B = substr(temp, k2);
keep x A Num B;
datalines;
CT00010A
AMOPI00066
CT00069DV
CTPM00069B
OPEI00079DV
;
Alright, it is done! Here is what I did and a partial print out.
proc import out = sasuser.CIHP_RADET datafile = 'C:\Users\ONUGHA\Desktop\A.xlsx' dbms=xlsx replace;
run;
DATA TEST;
SET sasuser.CIHP_RADET;
k1 = anydigit (A);
PUID_PREFIX = substr (A, 1, k1-1);
temp = substr(A,k1);
k2 = anyalpha (temp);
PUID = substr(temp, 1, k2-1);
PUID_SUFFIX = substr(temp, k2);
keep A PUID_PREFIX PUID PUID_SUFFIX;
RUN;
PROC PRINT DATA = TEST; RUN;
The SAS System |
Obs | A | PUID_PREFIX | PUID | PUID_SUFFIX |
1 | PM00003DV | PM | 00003 | DV |
2 | CT00004MSG | CT | 00004 | MSG |
3 | OPI00004AMOPI | OPI | 00004 | AMOPI |
4 | CT/PI00005 | CT/PI | 00005 |
|
5 | CCP/PM00006 | CCP/PM | 00006 |
|
6 | CT00007 | CT | 00007 |
|
7 | PM00008 | PM | 00008 |
|
8 | OPI00008 | OPI | 00008 |
|
9 | PM00009 | PM | 00009 |
|
10 | CT00010A | CT | 00010 | A |
11 | PM00010 | PM | 00010 |
|
Great, thank you!
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.