BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
zqkal
Obsidian | Level 7

Perl regular expression in SAS.

Does anyone knows how I could achieve this using perl regular expressions in SAS.

I have this SAS dataset.

DATA given;

input poolno $ name $;

  datalines;

  00581591 FN

  MSR0581591 GN

  0581591 FN

  A581591A58591 GN

  0581591A FN

  0581591BB GN

  05815910 FN

  '0581591 GN

   ;

run;

I want to apply below rules using SAS regular expressions. 

IF LENGTH(POOLNO) > 6  then do the following.

  1. Remove all leading zeros
  2. Remove 'MRS0' prefix if it exist
  3. If a space exists, only keep the portion of the string prior to the space and not including that space Example: ABC 123 -> ABC
  4. If the value exists twice, keep only one of them.  Example: ABC123ABC123 -> ABC123
  5. If the 7th nonzero character is alphabetical then only keep the 6 nonzero characters prior

# to that alphabetical character and not including it. Example: 123456A -> 123456

  1. Remove two trailing characters if they exist Example: 123456BB -> 123456
  2. Remove trailing zeroes after the sixth character Example: 12345600 -> 123456
  3. i. Remove leading '  Example: '123456 -> 123456

This is the desired sas dataset:

0581591 FN

585191  GN

0581591 FN

A581591 GN

0581591 FN

0581591 GN

0581591 FN

0581591 GN

Thanks in advance for you kind response.

1 ACCEPTED SOLUTION

Accepted Solutions
FriedEgg
SAS Employee

DATA given;

input ( poolno name ) (: $20.);

newpoolno=prxchange('s#([''\b]|MSR0)?(\w\d{4,5}[1-9])[0\w]{0,}\b#$2#',-1,prxchange('s#^00#0#',1,poolno));

  datalines;

  00581591 FN

  MSR0581591 GN

  0581591 FN

  A581591A58591 GN

  0581591A FN

  0581591BB GN

  05815910 FN

  '0581591 GN

   ;

run;

An additional note, taken in priority, your rules and results do not mesh perfectly.  I coded to match the results provided rather than then rules you listed.

poolnonamenewpoolno
00581591FN0581591
MSR0581591GN581591
0581591FN0581591
A581591A58591GNA581591
0581591AFN0581591
0581591BBGN0581591
05815910FN0581591
'0581591GN0581591

View solution in original post

2 REPLIES 2
FriedEgg
SAS Employee

DATA given;

input ( poolno name ) (: $20.);

newpoolno=prxchange('s#([''\b]|MSR0)?(\w\d{4,5}[1-9])[0\w]{0,}\b#$2#',-1,prxchange('s#^00#0#',1,poolno));

  datalines;

  00581591 FN

  MSR0581591 GN

  0581591 FN

  A581591A58591 GN

  0581591A FN

  0581591BB GN

  05815910 FN

  '0581591 GN

   ;

run;

An additional note, taken in priority, your rules and results do not mesh perfectly.  I coded to match the results provided rather than then rules you listed.

poolnonamenewpoolno
00581591FN0581591
MSR0581591GN581591
0581591FN0581591
A581591A58591GNA581591
0581591AFN0581591
0581591BBGN0581591
05815910FN0581591
'0581591GN0581591
zqkal
Obsidian | Level 7

sorry ,I made a mistake when writing the desired sas data set but i have what i needed.

Thanks for your help.


sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 868 views
  • 1 like
  • 2 in conversation