BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Chris_LK_87
Quartz | Level 8

Hello,

 

I have dataset that looks like this. Each line contains an individual and codes attached to that individual. The codes are stored with spaces in one variable. 

 

data have;
length id $10 dcode $48;
input id$ dcode$ &;
datalines;
1 MCB10 PC001 AAA30
2  AC003 PA000 TAC25
3  QC000 CAB50 FCE10
;
run;

 

I would like to select every code that begins with tree alphabetic characters and place each code in a new variable. 

 

Data want; 

set have;

id$ dcode1$ dcode2$ dcode3$

1  MCB10             AAA30

2                          TAC25

3              CAB50 FCE10

 

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

1. Use COUNTW() to count number of words

2. Extract each word

3. Check if first 3 characters are alphabetic (NOTALPHA())

4. Output if valid record

5. Transpose to a wide structure as desired

 

data have;
	length id $10 dcode $48;
	input id$ dcode$ &;
	datalines;
1 MCB10 PC001 AAA30
2  AC003 PA000 TAC25
3  QC000 CAB50 FCE10
;
run;

data long;
	set have;
	nwords=countw(dcode); /*1*/

	do i=1 to nwords;
		word=scan(dcode, i); /*2*/

		if not notalpha(substr(word, 1, 3)) then /*3*/
			output; /*4*/
	end;
run;

/*5*/
proc transpose data=long out=want prefix=WORD_;
	by ID;
	var word;
run;

@Chris_LK_87 wrote:

Hello,

 

I have dataset that looks like this. Each line contains an individual and codes attached to that individual. The codes are stored with spaces in one variable. 

 

data have;
length id $10 dcode $48;
input id$ dcode$ &;
datalines;
1 MCB10 PC001 AAA30
2  AC003 PA000 TAC25
3  QC000 CAB50 FCE10
;
run;

 

I would like to select every code that begins with tree alphabetic characters and place each code in a new variable. 

 

Data want; 

set have;

id$ dcode1$ dcode2$ dcode3$

1  MCB10             AAA30

2                          TAC25

3              CAB50 FCE10

 


 

View solution in original post

2 REPLIES 2
Reeza
Super User

1. Use COUNTW() to count number of words

2. Extract each word

3. Check if first 3 characters are alphabetic (NOTALPHA())

4. Output if valid record

5. Transpose to a wide structure as desired

 

data have;
	length id $10 dcode $48;
	input id$ dcode$ &;
	datalines;
1 MCB10 PC001 AAA30
2  AC003 PA000 TAC25
3  QC000 CAB50 FCE10
;
run;

data long;
	set have;
	nwords=countw(dcode); /*1*/

	do i=1 to nwords;
		word=scan(dcode, i); /*2*/

		if not notalpha(substr(word, 1, 3)) then /*3*/
			output; /*4*/
	end;
run;

/*5*/
proc transpose data=long out=want prefix=WORD_;
	by ID;
	var word;
run;

@Chris_LK_87 wrote:

Hello,

 

I have dataset that looks like this. Each line contains an individual and codes attached to that individual. The codes are stored with spaces in one variable. 

 

data have;
length id $10 dcode $48;
input id$ dcode$ &;
datalines;
1 MCB10 PC001 AAA30
2  AC003 PA000 TAC25
3  QC000 CAB50 FCE10
;
run;

 

I would like to select every code that begins with tree alphabetic characters and place each code in a new variable. 

 

Data want; 

set have;

id$ dcode1$ dcode2$ dcode3$

1  MCB10             AAA30

2                          TAC25

3              CAB50 FCE10

 


 

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 533 views
  • 0 likes
  • 2 in conversation