Chris_LK_87
Quartz | Level 8

Hello,

 

I have a dataset that looks like this. Each line contains an individual and the codes attached to that individual. The codes are stored in a single variable, separated by spaces.

 

data have;
length id $10 dcode $48;
input id$ dcode$ &;
datalines;
1 MCB10 PC001 AAA30
2  AC003 PA000 TAC25
3  QC000 CAB50 FCE10
;
run;

 

I would like to select every code that begins with three alphabetic characters and place each selected code in a new variable.

 

Desired output (data set want, built from have):

id  dcode1  dcode2  dcode3
1   MCB10           AAA30
2                   TAC25
3           CAB50   FCE10

 

Accepted Solution
Reeza
Super User

1. Use COUNTW() to count the number of words

2. Extract each word with SCAN()

3. Check if the first 3 characters are alphabetic (NOTALPHA() returns 0 when they all are)

4. Output the record if the word is valid

5. Transpose to a wide structure as desired

 

data have;
	length id $10 dcode $48;
	input id$ dcode$ &;
	datalines;
1 MCB10 PC001 AAA30
2  AC003 PA000 TAC25
3  QC000 CAB50 FCE10
;
run;

data long;
	set have;
	nwords=countw(dcode); /*1*/

	do i=1 to nwords;
		word=scan(dcode, i); /*2*/

		if not notalpha(substr(word, 1, 3)) then /*3*/
			output; /*4*/
	end;
run;

/*5*/
proc transpose data=long out=want prefix=WORD_;
	by ID;
	var word;
run;
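
The transpose above numbers the kept codes consecutively (WORD_1, WORD_2, ...), whereas the desired output in the question leaves each code in its original slot. If that positional layout matters, one possible alternative is a single DATA step with an array. This is only a sketch, not part of the accepted answer: the names want_positional, codes and word, the $8 lengths, and the limit of three codes per row are assumptions taken from the sample data.

```sas
/* Sketch: keep each qualifying code in its original position.      */
/* Assumes at most 3 codes per row and codes of at most 8 chars.    */
data want_positional;
	set have;
	length word $8;
	array codes[3] $8 dcode1-dcode3;

	do i=1 to min(countw(dcode, ' '), dim(codes));
		word=scan(dcode, i, ' ');

		/* NOTALPHA returns 0 when all three characters are letters */
		if notalpha(substr(word, 1, 3))=0 then
			codes[i]=word;
	end;

	drop i word;
run;
```

On the sample data this should leave dcode2 blank for id 1, fill only dcode3 for id 2, and fill dcode2 and dcode3 for id 3. Rows with more than three codes would need a larger array.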
