Chris_LK_87
Quartz | Level 8

Hello,

 

My dataset looks like this:

 

data have;
length id $10 dcode $48;
input id$ dcode$ &;
datalines;
1 MCB10 PCF01 AAA30
2 AC003 PL000 TAC25
3 QC000 CAB50 FCE10
4 MA100 CA500 DE100
;
run;

 

I would like to flag every row (1 or 0) according to whether it has a code that starts with two letters.

 

data want;
length id $10 dcode $48;
/* & lets dcode contain single blanks; two consecutive blanks end the value */
input id $ dcode $ & let_flag;
datalines;
1 MCB10 PCF01 AAA30  0
2 AC003 PL000 TAC25  1
3 QC000 CAB50 FCE10  1
4 MA100 CA500 DE100  1
;
run;

 

Thanks!

 

Accepted Solutions
ChrisHemedinger
Community Manager

If I'm understanding you, you want flag=1 when any code on the record starts with exactly 2 letters followed by digits. This regular expression checks for that pattern either at the start of the value or after a space.

 

data have;
length id $10 dcode $48;
input id$ dcode$ &;
datalines;
1 MCB10 PCF01 AAA30
2 AC003 PL000 TAC25
3 QC000 CAB50 FCE10
4 MA100 CA500 DE100
;
run;
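data want;
 set have;
 /* (^|\s) matches at the start of dcode or after whitespace; */
 /* two letters followed by a digit means exactly two leading letters */
 flag = (prxmatch('/(^|\s)[A-Z][A-Z][0-9]/',dcode)>0);
run;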
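
A slightly stricter variant, as a sketch only. It assumes, going by the description later in the thread, that a two-letter code is always two uppercase letters followed by exactly three digits. The data set name want_strict is made up for illustration. \b is the Perl word boundary, so the pattern cannot start in the middle of a three-letter code.

```sas
data want_strict;
  set have;
  /* \b = word boundary; {2} and {3} pin the code to 2 letters + 3 digits (assumed layout) */
  flag = (prxmatch('/\b[A-Z]{2}\d{3}\b/', dcode) > 0);
run;
```

On the four sample rows this gives flag = 0, 1, 1, 1, the same as the pattern above.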

 


8 REPLIES
PeterClemmensen
Tourmaline | Level 20

So, you have multiple codes in your Dcode Variable, right?

 

Do all of them have to start with exactly 2 letters or at least 2 letters?

 

Why is let_flag = 0 in the first obs?

Chris_LK_87
Quartz | Level 8
They should start with exactly two letters. The first obs has let_flag=0 because none of the codes in the first row starts with exactly two letters.
ballardw
Super User

"I would like to flag (1 or 0) every row if they have a code that starts with two letters."

Define "a code". Looking at your data, you apparently have 3 values stuck into a single variable. In a very large number of cases that is poor data structuring, and I can't tell whether "starts with two letters" applies to the long value, spaces included, as a single code, or to each of the pieces separated by spaces.

 

Exactly 2? In ALL of the groups, or in just any one?

Chris_LK_87
Quartz | Level 8

Yes, the codes are stored in one variable. Each code is always five characters long and starts with either two or three letters. I would like to select the codes that start with two letters.

Chris_LK_87
Quartz | Level 8

If, for example, each code were stored in a separate variable:

dcode1 dcode2 dcode3 dcode4 dcode5 etc.

then I could use an array.
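
The array idea can also be applied without restructuring the data, by looping over the blank-separated codes inside dcode. A minimal sketch, assuming the have data set from the original post and that codes are always separated by blanks; want_loop, i and code are illustrative names, not from the thread:

```sas
data want_loop;
  set have;
  flag = 0;
  /* countw() counts the blank-separated codes; scan() extracts the i-th one */
  do i = 1 to countw(dcode, ' ');
    code = scan(dcode, i, ' ');
    /* ^ anchors at the start of the single code: two letters, then a digit */
    if prxmatch('/^[A-Z]{2}\d/', strip(code)) then flag = 1;
  end;
  drop i code;
run;
```

Testing one code at a time keeps the pattern simple, and the same loop could collect the matching codes instead of only setting a flag.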

Chris_LK_87
Quartz | Level 8

A correction to my post: the code should start with exactly two letters.

ChrisNZ
Tourmaline | Level 20

Or

FLAG= (prxmatch('/[A-Z][A-Z][^A-Z]/',DCODE)=1);
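
One point worth checking before choosing between the two patterns, sketched on a made-up record (MCB10 AC003 is not from the thread). Because the result is compared with 1, this version only flags a record whose first code starts with exactly two letters, whereas the accepted solution's pattern flags a two-letter code in any position. On the four sample rows both give the same flags.

```sas
data _null_;
  length dcode $48;
  dcode = 'MCB10 AC003';  /* hypothetical record: two-letter code in second position */
  /* accepted solution: match at the start or after whitespace, anywhere in dcode */
  flag_any   = (prxmatch('/(^|\s)[A-Z][A-Z][0-9]/', dcode) > 0);
  /* alternative: first match must begin at position 1; here it begins at 2 (CB1) */
  flag_first = (prxmatch('/[A-Z][A-Z][^A-Z]/', dcode) = 1);
  put flag_any= flag_first=;
run;
```

The log line should read flag_any=1 flag_first=0, so which pattern fits depends on whether only the first code or any code should count.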

 
