Quartz | Level 8

## Select codes that begin with three alphabetic characters

Hello,

I have a dataset that looks like this. Each row represents an individual and the codes attached to that individual. The codes are stored in a single variable, separated by spaces.

data have;
length id $10 dcode $48;
input id $ dcode $ &;
datalines;
1 MCB10 PC001 AAA30
2  AC003 PA000 TAC25
3  QC000 CAB50 FCE10
;
run;

I would like to select every code that begins with three alphabetic characters and place each selected code in a new variable.

Data want;

set have;

id$  dcode1$  dcode2$  dcode3$

1    MCB10             AAA30

2                      TAC25

3             CAB50    FCE10

Accepted Solutions
Super User

## Re: Select codes that begin with three alphabetic characters

1. Use COUNTW() to count the number of words (codes) in dcode

2. Extract each word with SCAN()

3. Check whether the first 3 characters are alphabetic (NOTALPHA() returns 0 when they all are)

4. Output the record if the word qualifies

5. Transpose to a wide structure as desired

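```sas
data have;
    length id $10 dcode $48;
    input id $ dcode $ &;
    datalines;
1 MCB10 PC001 AAA30
2  AC003 PA000 TAC25
3  QC000 CAB50 FCE10
;
run;

data long;
    set have;
    nwords = countw(dcode); /*1: number of codes in the string*/

    do i = 1 to nwords;
        word = scan(dcode, i); /*2: extract the i-th code*/

        /*3: NOTALPHA returns 0 when all three characters are letters*/
        if not notalpha(substr(word, 1, 3)) then
            output; /*4: keep only qualifying codes*/
    end;
run;

/*5: one row per ID, qualifying codes in WORD_1, WORD_2, ...*/
proc transpose data=long out=want prefix=WORD_;
    by ID;
    var word;
run;
```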
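
Note that PROC TRANSPOSE packs the qualifying codes to the left (WORD_1, WORD_2, ...), whereas the layout in the question keeps each code in its original position. If position matters, one possible variation is to write directly into an array instead of transposing. This is only a sketch under assumptions that are not stated in the thread: at most 3 codes per row, each code at most 5 characters long, and a hypothetical output name `want_positional`. The variable names dcode1-dcode3 are taken from the question.

```sas
/* Sketch: keep each qualifying code in its original slot.        */
/* Assumptions: at most 3 codes per row, codes are <= 5 chars.    */
data want_positional;                      /* hypothetical data set name */
    set have;
    array dcodes[3] $ 5 dcode1-dcode3;     /* new variables, blank on every row */

    /* never index past the end of the array */
    do i = 1 to min(countw(dcode, ' '), dim(dcodes));
        word = scan(dcode, i, ' ');

        /* NOTALPHA returns 0 when all examined characters are letters; */
        /* the LENGTH check excludes words shorter than 3 characters    */
        if length(word) >= 3 and notalpha(substr(word, 1, 3)) = 0 then
            dcodes[i] = word;              /* slot i matches the word's position */
    end;

    keep id dcode1-dcode3;
run;
```

With the sample data this should leave dcode2 blank for id 1 and dcode1 and dcode2 blank for id 2, matching the table in the question. If rows can hold more than 3 codes, the array dimension would need to be raised accordingly.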

Quartz | Level 8

## Re: Select codes that begin with three alphabetic characters

Thanks!