Hello,
I have dataset that looks like this. Each line contains an individual and codes attached to that individual. The codes are stored with spaces in one variable.
data have;
length id $10 dcode $48;
input id$ dcode$ &;
datalines;
1 MCB10 PC001 AAA30
2 AC003 PA000 TAC25
3 QC000 CAB50 FCE10
;
run;
I would like to select every code that begins with tree alphabetic characters and place each code in a new variable.
Data want;
set have;
id$ dcode1$ dcode2$ dcode3$
1 MCB10 AAA30
2 TAC25
3 CAB50 FCE10
1. Use COUNTW() to count number of words
2. Extract each word
3. Check if first 3 characters are alphabetic (NOTALPHA())
4. Output if valid record
5. Transpose to a wide structure as desired
data have;
length id $10 dcode $48;
input id$ dcode$ &;
datalines;
1 MCB10 PC001 AAA30
2 AC003 PA000 TAC25
3 QC000 CAB50 FCE10
;
run;
data long;
set have;
nwords=countw(dcode); /*1*/
do i=1 to nwords;
word=scan(dcode, i); /*2*/
if not notalpha(substr(word, 1, 3)) then /*3*/
output; /*4*/
end;
run;
/*5*/
proc transpose data=long out=want prefix=WORD_;
by ID;
var word;
run;
@Chris_LK_87 wrote:
Hello,
I have dataset that looks like this. Each line contains an individual and codes attached to that individual. The codes are stored with spaces in one variable.
data have;
length id $10 dcode $48;
input id$ dcode$ &;
datalines;
1 MCB10 PC001 AAA30
2 AC003 PA000 TAC25
3 QC000 CAB50 FCE10
;
run;
I would like to select every code that begins with tree alphabetic characters and place each code in a new variable.
Data want;
set have;
id$ dcode1$ dcode2$ dcode3$
1 MCB10 AAA30
2 TAC25
3 CAB50 FCE10
1. Use COUNTW() to count number of words
2. Extract each word
3. Check if first 3 characters are alphabetic (NOTALPHA())
4. Output if valid record
5. Transpose to a wide structure as desired
data have;
length id $10 dcode $48;
input id$ dcode$ &;
datalines;
1 MCB10 PC001 AAA30
2 AC003 PA000 TAC25
3 QC000 CAB50 FCE10
;
run;
data long;
set have;
nwords=countw(dcode); /*1*/
do i=1 to nwords;
word=scan(dcode, i); /*2*/
if not notalpha(substr(word, 1, 3)) then /*3*/
output; /*4*/
end;
run;
/*5*/
proc transpose data=long out=want prefix=WORD_;
by ID;
var word;
run;
@Chris_LK_87 wrote:
Hello,
I have dataset that looks like this. Each line contains an individual and codes attached to that individual. The codes are stored with spaces in one variable.
data have;
length id $10 dcode $48;
input id$ dcode$ &;
datalines;
1 MCB10 PC001 AAA30
2 AC003 PA000 TAC25
3 QC000 CAB50 FCE10
;
run;
I would like to select every code that begins with tree alphabetic characters and place each code in a new variable.
Data want;
set have;
id$ dcode1$ dcode2$ dcode3$
1 MCB10 AAA30
2 TAC25
3 CAB50 FCE10
Don’t miss the livestream kicking off May 7. It’s free. It’s easy. And it’s the best seat in the house.
Join us virtually with our complimentary SAS Innovate Digital Pass. Watch live or on-demand in multiple languages, with translations available to help you get the most out of every session.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.