Solved: Select codes that begins with tree alphabetic characters

Chris_LK_87 · Posted 05-04-2021 10:41 AM

Hello,

I have dataset that looks like this. Each line contains an individual and codes attached to that individual. The codes are stored with spaces in one variable.

data have;
length id $10 dcode $48;
input id$ dcode$ &;
datalines;
1 MCB10 PC001 AAA30
2 AC003 PA000 TAC25
3 QC000 CAB50 FCE10
;
run;

I would like to select every code that begins with tree alphabetic characters and place each code in a new variable.

Data want;

set have;

id$ dcode1$ dcode2$ dcode3$

1 MCB10 AAA30

2 TAC25

3 CAB50 FCE10

Reeza · Posted 05-04-2021 11:10 AM

1. Use COUNTW() to count number of words

2. Extract each word

3. Check if first 3 characters are alphabetic (NOTALPHA())

4. Output if valid record

5. Transpose to a wide structure as desired

data have;
	length id $10 dcode $48;
	input id$ dcode$ &;
	datalines;
1 MCB10 PC001 AAA30
2  AC003 PA000 TAC25
3  QC000 CAB50 FCE10
;
run;

data long;
	set have;
	nwords=countw(dcode); /*1*/

	do i=1 to nwords;
		word=scan(dcode, i); /*2*/

		if not notalpha(substr(word, 1, 3)) then /*3*/
			output; /*4*/
	end;
run;

/*5*/
proc transpose data=long out=want prefix=WORD_;
	by ID;
	var word;
run;

@Chris_LK_87 wrote:

Hello,

I have dataset that looks like this. Each line contains an individual and codes attached to that individual. The codes are stored with spaces in one variable.

data have;
length id $10 dcode $48;
input id$ dcode$ &;
datalines;
1 MCB10 PC001 AAA30
2 AC003 PA000 TAC25
3 QC000 CAB50 FCE10
;
run;

I would like to select every code that begins with tree alphabetic characters and place each code in a new variable.

Data want;

set have;

id$ dcode1$ dcode2$ dcode3$

1 MCB10 AAA30

2 TAC25

3 CAB50 FCE10

View solution in original post

Reeza · Posted 05-04-2021 11:10 AM

1. Use COUNTW() to count number of words

2. Extract each word

3. Check if first 3 characters are alphabetic (NOTALPHA())

4. Output if valid record

5. Transpose to a wide structure as desired

data have;
	length id $10 dcode $48;
	input id$ dcode$ &;
	datalines;
1 MCB10 PC001 AAA30
2  AC003 PA000 TAC25
3  QC000 CAB50 FCE10
;
run;

data long;
	set have;
	nwords=countw(dcode); /*1*/

	do i=1 to nwords;
		word=scan(dcode, i); /*2*/

		if not notalpha(substr(word, 1, 3)) then /*3*/
			output; /*4*/
	end;
run;

/*5*/
proc transpose data=long out=want prefix=WORD_;
	by ID;
	var word;
run;

@Chris_LK_87 wrote:

Hello,

I have dataset that looks like this. Each line contains an individual and codes attached to that individual. The codes are stored with spaces in one variable.

data have;
length id $10 dcode $48;
input id$ dcode$ &;
datalines;
1 MCB10 PC001 AAA30
2 AC003 PA000 TAC25
3 QC000 CAB50 FCE10
;
run;

I would like to select every code that begins with tree alphabetic characters and place each code in a new variable.

Data want;

set have;

id$ dcode1$ dcode2$ dcode3$

1 MCB10 AAA30

2 TAC25

3 CAB50 FCE10

Chris_LK_87 · Posted 05-06-2021 03:26 AM

Thanks!

Select codes that begins with tree alphabetic characters

Re: Select codes that begins with tree alphabetic characters

Re: Select codes that begins with tree alphabetic characters

Re: Select codes that begins with tree alphabetic characters

Catch up on SAS Innovate 2026

Select codes that begins with tree alphabetic characters

Re: Select codes that begins with tree alphabetic characters

Re: Select codes that begins with tree alphabetic characters

Re: Select codes that begins with tree alphabetic characters

Catch up on SAS Innovate 2026

SAS Training: Just a Click Away