Chris_LK_87
Quartz | Level 8

Hello,

 

My dataset looks like this:

 

data have;
length id $10 dcode $48;
input id$ dcode$ &;
datalines;
1 MCB10 PCF01 AAA30
2 AC003 PL000 TAC25
3 QC000 CAB50 FCE10
4 MA100 CA500 DE100
;
run;

 

I would like to flag each row (1 or 0) depending on whether it has a code that starts with two letters.

 

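data want;
length id $10 dcode $48;
/* The & modifier reads dcode up to two consecutive blanks, */
/* so the flag is separated from the codes by two spaces.   */
input id $ dcode $ & let_flag;
datalines;
1 MCB10 PCF01 AAA30  0
2 AC003 PL000 TAC25  1
3 QC000 CAB50 FCE10  1
4 MA100 CA500 DE100  1
;
run;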

 

Thanks!

 

Accepted Solutions
ChrisHemedinger
Community Manager

If I'm understanding you, you want flag=1 if any of the codes on the record start with exactly 2 letters (followed by numeric digits). This regular expression checks for that match (and whether it occurs at the start of a record or after a space).

 

data have;
length id $10 dcode $48;
input id$ dcode$ &;
datalines;
1 MCB10 PCF01 AAA30
2 AC003 PL000 TAC25
3 QC000 CAB50 FCE10
4 MA100 CA500 DE100
;
run;
data want;
 set have;
 flag = (prxmatch('/(^|\s)[A-Z][A-Z][0-9]/',dcode)>0);
run;
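
A slightly stricter variant, offered as a sketch rather than a replacement: assuming every code is exactly five characters (two or three capital letters followed by digits, as the sample data suggests), word boundaries can pin the match to a whole code. The dataset name want_strict is hypothetical.

```sas
data want_strict;   /* hypothetical output name */
  set have;
  /* \b is a word boundary, so the match must cover a whole code:  */
  /* exactly two capital letters followed by exactly three digits. */
  /* The 2+3 layout is an assumption based on the sample data.     */
  flag = (prxmatch('/\b[A-Z]{2}\d{3}\b/', dcode) > 0);
run;
```

With the four sample rows this should give the same flags as the pattern above; the difference only shows up for codes that do not follow the five-character layout.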

 


8 REPLIES
PeterClemmensen
Tourmaline | Level 20

So, you have multiple codes in your Dcode Variable, right?

 

Do all of them have to start with exactly 2 letters or at least 2 letters?

 

Why is let_flag = 0 in the first obs?

Chris_LK_87
Quartz | Level 8
They should start with exactly two letters. The first obs has let_flag=0 because none of the codes in the first row starts with exactly two letters.
ballardw
Super User

"I would like to flag (1 or 0) every row if they have a code that starts with two letters."

Define "a code". As I look at your data you apparently have 3 values stuck into a single variable. In a very large number of cases this is very poor data structuring so I can't tell if you want the "starts with two letters" to mean the long value with multiple spaces as a single code or each of the pieces separated by spaces to be a code.

 

Exactly 2?  Of ALL the groups or just any one?

 

 

Chris_LK_87
Quartz | Level 8

Yes, the codes are stored in one variable. The codes always contain five characters, and they start with either two or three letters. I would like to select the codes that start with two letters.

Chris_LK_87
Quartz | Level 8

If, for example, each code were stored in a separate variable (dcode1, dcode2, dcode3, dcode4, dcode5, etc.), then I could use an array.
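
A similar per-code check seems possible without restructuring the data, by walking through the blank-delimited codes with COUNTW and SCAN instead of an array. This is only a sketch: the names want_loop, code, and i are hypothetical, and it assumes a qualifying code has letters in positions 1 and 2 and a digit in position 3.

```sas
data want_loop;            /* hypothetical output name */
  set have;
  length code $8;          /* hypothetical helper variable */
  flag = 0;
  /* Visit each blank-delimited code in dcode in turn. */
  do i = 1 to countw(dcode, ' ');
    code = scan(dcode, i, ' ');
    /* First non-letter and first digit both at position 3: */
    /* two leading letters followed immediately by a digit. */
    if notalpha(code) = 3 and anydigit(code) = 3 then flag = 1;
  end;
  drop i code;
run;
```

The loop makes the "any one code" rule explicit, which may be easier to adapt if the rule later changes to "all codes" or to counting matches.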

Chris_LK_87
Quartz | Level 8

A correction to my post: the codes should start with exactly two letters.

ChrisNZ
Tourmaline | Level 20

Or, as an alternative to the accepted solution:

FLAG= (prxmatch('/[A-Z][A-Z][^A-Z]/',DCODE)=1);
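
One caveat that may be worth checking: because the comparison is =1, this expression is true only when the match begins at the first character of DCODE, so it appears to test only the first code in the string. A minimal sketch comparing it with the accepted pattern, using a hypothetical extra row (id 5, not in the original data) whose two-letter code comes second; the names check, flag_any, and flag_first are also hypothetical:

```sas
data check;                /* hypothetical dataset for comparison */
  length id $10 dcode $48;
  input id $ dcode $ &;
  /* Accepted-solution pattern: match at the start or after a blank. */
  flag_any   = (prxmatch('/(^|\s)[A-Z][A-Z][0-9]/', dcode) > 0);
  /* Position-1 variant: only the first code can set the flag. */
  flag_first = (prxmatch('/[A-Z][A-Z][^A-Z]/', dcode) = 1);
  datalines;
1 MCB10 PCF01 AAA30
5 MCB10 AC003 AAA30
;
run;
```

If the reasoning above holds, row 5 gets flag_any=1 but flag_first=0, while both flags agree on the four rows in the original sample.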

 
