BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Chris_LK_87
Quartz | Level 8

Hello,

 

I have dataset that looks like this. Each line contains an individual and codes attached to that individual. The codes are stored with spaces in one variable. 

 

data have;
length id $10 dcode $48;
input id$ dcode$ &;
datalines;
1 MCB10 PC001 AAA30
2  AC003 PA000 TAC25
3  QC000 CAB50 FCE10
;
run;

 

I would like to select every code that begins with tree alphabetic characters and place each code in a new variable. 

 

Data want; 

set have;

id$ dcode1$ dcode2$ dcode3$

1  MCB10             AAA30

2                          TAC25

3              CAB50 FCE10

 

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

1. Use COUNTW() to count number of words

2. Extract each word

3. Check if first 3 characters are alphabetic (NOTALPHA())

4. Output if valid record

5. Transpose to a wide structure as desired

 

data have;
	length id $10 dcode $48;
	input id$ dcode$ &;
	datalines;
1 MCB10 PC001 AAA30
2  AC003 PA000 TAC25
3  QC000 CAB50 FCE10
;
run;

data long;
	set have;
	nwords=countw(dcode); /*1*/

	do i=1 to nwords;
		word=scan(dcode, i); /*2*/

		if not notalpha(substr(word, 1, 3)) then /*3*/
			output; /*4*/
	end;
run;

/*5*/
proc transpose data=long out=want prefix=WORD_;
	by ID;
	var word;
run;

@Chris_LK_87 wrote:

Hello,

 

I have dataset that looks like this. Each line contains an individual and codes attached to that individual. The codes are stored with spaces in one variable. 

 

data have;
length id $10 dcode $48;
input id$ dcode$ &;
datalines;
1 MCB10 PC001 AAA30
2  AC003 PA000 TAC25
3  QC000 CAB50 FCE10
;
run;

 

I would like to select every code that begins with tree alphabetic characters and place each code in a new variable. 

 

Data want; 

set have;

id$ dcode1$ dcode2$ dcode3$

1  MCB10             AAA30

2                          TAC25

3              CAB50 FCE10

 


 

View solution in original post

2 REPLIES 2
Reeza
Super User

1. Use COUNTW() to count number of words

2. Extract each word

3. Check if first 3 characters are alphabetic (NOTALPHA())

4. Output if valid record

5. Transpose to a wide structure as desired

 

data have;
	length id $10 dcode $48;
	input id$ dcode$ &;
	datalines;
1 MCB10 PC001 AAA30
2  AC003 PA000 TAC25
3  QC000 CAB50 FCE10
;
run;

data long;
	set have;
	nwords=countw(dcode); /*1*/

	do i=1 to nwords;
		word=scan(dcode, i); /*2*/

		if not notalpha(substr(word, 1, 3)) then /*3*/
			output; /*4*/
	end;
run;

/*5*/
proc transpose data=long out=want prefix=WORD_;
	by ID;
	var word;
run;

@Chris_LK_87 wrote:

Hello,

 

I have dataset that looks like this. Each line contains an individual and codes attached to that individual. The codes are stored with spaces in one variable. 

 

data have;
length id $10 dcode $48;
input id$ dcode$ &;
datalines;
1 MCB10 PC001 AAA30
2  AC003 PA000 TAC25
3  QC000 CAB50 FCE10
;
run;

 

I would like to select every code that begins with tree alphabetic characters and place each code in a new variable. 

 

Data want; 

set have;

id$ dcode1$ dcode2$ dcode3$

1  MCB10             AAA30

2                          TAC25

3              CAB50 FCE10

 


 

sas-innovate-white.png

Register Today!

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9.

 

Early bird rate extended! Save $200 when you sign up by March 31.

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 662 views
  • 0 likes
  • 2 in conversation