Chris_LK_87
Quartz | Level 8

Hello,

I have a dataset that looks like this. Each row contains an individual and the codes attached to that individual. The codes are stored in a single variable, separated by spaces.


data have;
length id $10 dcode $48;
input id$ dcode$ &;
datalines;
1 MCB10 PC001 AAA30
2  AC003 PA000 TAC25
3  QC000 CAB50 FCE10
;
run;

 

I would like to select every code that begins with three alphabetic characters and place each such code in a new variable. The result, a dataset named want built from have, should look like this:

id  dcode1  dcode2  dcode3
1   MCB10           AAA30
2                   TAC25
3           CAB50   FCE10


Accepted Solution
Reeza
Super User

1. Use COUNTW() to count number of words

2. Extract each word

3. Check if first 3 characters are alphabetic (NOTALPHA())

4. Output if valid record

5. Transpose to a wide structure as desired
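
As a quick check of step 3, here is a minimal sketch of how NOTALPHA() behaves on the first three characters of a code. NOTALPHA() returns the position of the first non-alphabetic character, or 0 if there is none. The test values are illustrative: two come from the sample data, and 'AB' is a made-up short code.

```sas
data _null_;
	length word $8;

	/* two codes from the sample data plus a hypothetical short code */
	do word='MCB10', 'PC001', 'AB';
		pos=notalpha(substr(word, 1, 3));
		put word= pos=;
	end;
run;
/* MCB10 gives pos=0 (all letters), PC001 gives pos=3 (the digit 0),    */
/* and AB gives pos=3 (the padding blank counts as non-alphabetic).     */
```

So a result of 0 means the code qualifies, and codes shorter than three characters are rejected automatically.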

 

data have;
	length id $10 dcode $48;
	input id$ dcode$ &;
	datalines;
1 MCB10 PC001 AAA30
2  AC003 PA000 TAC25
3  QC000 CAB50 FCE10
;
run;

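data long;
	set have;
	nwords=countw(dcode, ' '); /*1: number of space-separated codes*/

	do i=1 to nwords;
		word=scan(dcode, i, ' '); /*2: the i-th code*/

		/*3: NOTALPHA returns 0 when all three characters are letters*/
		if not notalpha(substr(word, 1, 3)) then
			output; /*4: one row per qualifying code*/
	end;
run;

/*5: one row per ID, qualifying codes in WORD_1, WORD_2, ...*/
/*BY processing assumes LONG is sorted by ID, as it is here*/
proc transpose data=long out=want(drop=_name_) prefix=WORD_;
	by ID;
	var word;
run;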
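
Note that the transpose packs the qualifying codes into WORD_1, WORD_2, and so on, whereas the layout in the question leaves a blank where a code was rejected. If the original position matters, one possible alternative is a single DATA step with an array. This is only a sketch: it assumes at most three codes per row (as in the sample data), and the names want_pos and dcodes are made up for illustration.

```sas
/* Sketch: keep each qualifying code in its original position. */
/* Assumes no row holds more than 3 space-separated codes.     */
data want_pos;
	set have;
	array dcodes[3] $8 dcode1-dcode3;

	do i=1 to dim(dcodes);
		word=scan(dcode, i, ' ');

		/* keep the code only if its first 3 characters are all letters */
		if notalpha(substr(word, 1, 3))=0 then
			dcodes[i]=word;
	end;

	drop i word;
run;
```

With the sample data this should leave dcode2 blank for id 1, dcode1 and dcode2 blank for id 2, and dcode1 blank for id 3. If rows can hold more than three codes, the array dimension would need to be raised accordingly.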


