Hello,
I have dataset that looks like this. Each line contains an individual and codes attached to that individual. The codes are stored with spaces in one variable.
data have;
length id $10 dcode $48;
input id$ dcode$ &;
datalines;
1 MCB10 PC001 AAA30
2 AC003 PA000 TAC25
3 QC000 CAB50 FCE10
;
run;
I would like to select every code that begins with tree alphabetic characters and place each code in a new variable.
Data want;
set have;
id$ dcode1$ dcode2$ dcode3$
1 MCB10 AAA30
2 TAC25
3 CAB50 FCE10
1. Use COUNTW() to count number of words
2. Extract each word
3. Check if first 3 characters are alphabetic (NOTALPHA())
4. Output if valid record
5. Transpose to a wide structure as desired
data have;
length id $10 dcode $48;
input id$ dcode$ &;
datalines;
1 MCB10 PC001 AAA30
2 AC003 PA000 TAC25
3 QC000 CAB50 FCE10
;
run;
data long;
set have;
nwords=countw(dcode); /*1*/
do i=1 to nwords;
word=scan(dcode, i); /*2*/
if not notalpha(substr(word, 1, 3)) then /*3*/
output; /*4*/
end;
run;
/*5*/
proc transpose data=long out=want prefix=WORD_;
by ID;
var word;
run;
@Chris_LK_87 wrote:
Hello,
I have dataset that looks like this. Each line contains an individual and codes attached to that individual. The codes are stored with spaces in one variable.
data have;
length id $10 dcode $48;
input id$ dcode$ &;
datalines;
1 MCB10 PC001 AAA30
2 AC003 PA000 TAC25
3 QC000 CAB50 FCE10
;
run;
I would like to select every code that begins with tree alphabetic characters and place each code in a new variable.
Data want;
set have;
id$ dcode1$ dcode2$ dcode3$
1 MCB10 AAA30
2 TAC25
3 CAB50 FCE10
1. Use COUNTW() to count number of words
2. Extract each word
3. Check if first 3 characters are alphabetic (NOTALPHA())
4. Output if valid record
5. Transpose to a wide structure as desired
data have;
length id $10 dcode $48;
input id$ dcode$ &;
datalines;
1 MCB10 PC001 AAA30
2 AC003 PA000 TAC25
3 QC000 CAB50 FCE10
;
run;
data long;
set have;
nwords=countw(dcode); /*1*/
do i=1 to nwords;
word=scan(dcode, i); /*2*/
if not notalpha(substr(word, 1, 3)) then /*3*/
output; /*4*/
end;
run;
/*5*/
proc transpose data=long out=want prefix=WORD_;
by ID;
var word;
run;
@Chris_LK_87 wrote:
Hello,
I have dataset that looks like this. Each line contains an individual and codes attached to that individual. The codes are stored with spaces in one variable.
data have;
length id $10 dcode $48;
input id$ dcode$ &;
datalines;
1 MCB10 PC001 AAA30
2 AC003 PA000 TAC25
3 QC000 CAB50 FCE10
;
run;
I would like to select every code that begins with tree alphabetic characters and place each code in a new variable.
Data want;
set have;
id$ dcode1$ dcode2$ dcode3$
1 MCB10 AAA30
2 TAC25
3 CAB50 FCE10
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9.
Early bird rate extended! Save $200 when you sign up by March 31.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.