BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
zqkal
Obsidian | Level 7

Perl regular expression in SAS.

Does anyone knows how I could achieve this using perl regular expressions in SAS.

I have this SAS dataset.

DATA given;

input poolno $ name $;

  datalines;

  00581591 FN

  MSR0581591 GN

  0581591 FN

  A581591A58591 GN

  0581591A FN

  0581591BB GN

  05815910 FN

  '0581591 GN

   ;

run;

I want to apply below rules using SAS regular expressions. 

IF LENGTH(POOLNO) > 6  then do the following.

  1. Remove all leading zeros
  2. Remove 'MRS0' prefix if it exist
  3. If a space exists, only keep the portion of the string prior to the space and not including that space Example: ABC 123 -> ABC
  4. If the value exists twice, keep only one of them.  Example: ABC123ABC123 -> ABC123
  5. If the 7th nonzero character is alphabetical then only keep the 6 nonzero characters prior

# to that alphabetical character and not including it. Example: 123456A -> 123456

  1. Remove two trailing characters if they exist Example: 123456BB -> 123456
  2. Remove trailing zeroes after the sixth character Example: 12345600 -> 123456
  3. i. Remove leading '  Example: '123456 -> 123456

This is the desired sas dataset:

0581591 FN

585191  GN

0581591 FN

A581591 GN

0581591 FN

0581591 GN

0581591 FN

0581591 GN

Thanks in advance for you kind response.

1 ACCEPTED SOLUTION

Accepted Solutions
FriedEgg
SAS Employee

DATA given;

input ( poolno name ) (: $20.);

newpoolno=prxchange('s#([''\b]|MSR0)?(\w\d{4,5}[1-9])[0\w]{0,}\b#$2#',-1,prxchange('s#^00#0#',1,poolno));

  datalines;

  00581591 FN

  MSR0581591 GN

  0581591 FN

  A581591A58591 GN

  0581591A FN

  0581591BB GN

  05815910 FN

  '0581591 GN

   ;

run;

An additional note, taken in priority, your rules and results do not mesh perfectly.  I coded to match the results provided rather than then rules you listed.

poolnonamenewpoolno
00581591FN0581591
MSR0581591GN581591
0581591FN0581591
A581591A58591GNA581591
0581591AFN0581591
0581591BBGN0581591
05815910FN0581591
'0581591GN0581591

View solution in original post

2 REPLIES 2
FriedEgg
SAS Employee

DATA given;

input ( poolno name ) (: $20.);

newpoolno=prxchange('s#([''\b]|MSR0)?(\w\d{4,5}[1-9])[0\w]{0,}\b#$2#',-1,prxchange('s#^00#0#',1,poolno));

  datalines;

  00581591 FN

  MSR0581591 GN

  0581591 FN

  A581591A58591 GN

  0581591A FN

  0581591BB GN

  05815910 FN

  '0581591 GN

   ;

run;

An additional note, taken in priority, your rules and results do not mesh perfectly.  I coded to match the results provided rather than then rules you listed.

poolnonamenewpoolno
00581591FN0581591
MSR0581591GN581591
0581591FN0581591
A581591A58591GNA581591
0581591AFN0581591
0581591BBGN0581591
05815910FN0581591
'0581591GN0581591
zqkal
Obsidian | Level 7

sorry ,I made a mistake when writing the desired sas data set but i have what i needed.

Thanks for your help.


sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 897 views
  • 1 like
  • 2 in conversation