BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
PrudhviB
Obsidian | Level 7

Hi SAS Experts i have below string and i am trying to extract the number that is in the following highlighted format, the issue is the format is not consistent but i would like to extract anything after "reference # " just till the numbers as shown below.

 

string:

“Send email to  reference, Mailbox..*        Email Subject Line - reference Follow-Up....Body of email (complete template below).... ....student Name.... James GENEVIEVE...Student Phone #.... 12345712457 best, 5458712458....reference #.... 012000222.1.... .... ONFIDENTIALITY NOTICE..This communication may contain privileged or confidential information. If you are not the intended recipient or received this communication by error, please notify the sender and delete the message without copying or disclosing it. Thank you..... .... .."

 

desired output:

"012000221.1"

 

in advance, thank you for your effort.

1 ACCEPTED SOLUTION

Accepted Solutions
ChrisNZ
Tourmaline | Level 20

Depending on the exact requirement, another option:

data WANT;
  LINE="Send email to  reference, Mailbox..*        Email Subject Line - reference Follow-Up....Body of email (complete template below).... ....student Name.... James GENEVIEVE...Student Phone #.... 12345712457 best, 5458712458....reference #.... 012000222.1.... .... ONFIDENTIALITY NOTICE..This communication may contain privileged or confidential information. If you are not the intended recipient or received this communication by error, please notify the sender and delete the message without copying or disclosing it. Thank you..... .... ..";
  STR=prxchange('s/'           %* substitution requested;
              ||'.*'           %* match anything;
              ||'reference #'  %* then reference space hash;
              ||'[^\d]*'       %* then more optional text except digits ;
              ||'(\d+\.?\d*)'  %* then digits including an optional embedded dot <= capture this;
              ||'.*'           %* then the rest of the string;
              ||'/\1'          %* replace all with captured group ;
              ||'/',1,LINE);
  putlog STR=;
run;

STR=012000222.1

 

View solution in original post

8 REPLIES 8
Patrick
Opal | Level 21

A regular expression could help. What exactly will work depends on the details/variations in your actual data.

data have;
  infile datalines truncover;
  length string $2000;
  retain string;
  input;
  string=catx(' ',string,_infile_);
  if _n_=10 then output;
datalines;
Send email to  reference, Mailbox..*        
Email Subject Line - reference Follow-Up....
Body of email (complete template below).... ....
student Name.... James GENEVIEVE...Student Phone #.... 
12345712457 best, 5458712458....reference #.... 012000222.1.... .... 
ONFIDENTIALITY NOTICE..This communication may contain privileged or 
confidential information. If you are not the intended recipient or 
received this communication by error, please notify the sender 
and delete the message without copying or disclosing it. 
Thank you..... .... ..
;

data want;
  set have;
  length want_str $32;
  _prxid=prxparse('/reference #[^#\d]*(\d+\.?\d*)/oi');
  if prxmatch(_prxid,trim(string))>0 then 
    want_str=prxposn(_prxid,1,trim(string));
run;

proc print data=want;
  var want_str;
run;
ChrisNZ
Tourmaline | Level 20

Depending on the exact requirement, another option:

data WANT;
  LINE="Send email to  reference, Mailbox..*        Email Subject Line - reference Follow-Up....Body of email (complete template below).... ....student Name.... James GENEVIEVE...Student Phone #.... 12345712457 best, 5458712458....reference #.... 012000222.1.... .... ONFIDENTIALITY NOTICE..This communication may contain privileged or confidential information. If you are not the intended recipient or received this communication by error, please notify the sender and delete the message without copying or disclosing it. Thank you..... .... ..";
  STR=prxchange('s/'           %* substitution requested;
              ||'.*'           %* match anything;
              ||'reference #'  %* then reference space hash;
              ||'[^\d]*'       %* then more optional text except digits ;
              ||'(\d+\.?\d*)'  %* then digits including an optional embedded dot <= capture this;
              ||'.*'           %* then the rest of the string;
              ||'/\1'          %* replace all with captured group ;
              ||'/',1,LINE);
  putlog STR=;
run;

STR=012000222.1

 

Patrick
Opal | Level 21

@ChrisNZ 

Using prxchange() is what I've done first as well but then realized that you'll end up with the source string if there is no match.

ChrisNZ
Tourmaline | Level 20

@Patrick True

if STR=LINE then STR=' ';

Not saying this is better. Just another option. 🙂

RegEx are expensive, I prefer to use them just once if possible.

Patrick
Opal | Level 21
Doesn’t prxposn() just retrieve the capture buffer without parsing the source string again?
ChrisNZ
Tourmaline | Level 20

> Doesn’t prxposn() just retrieve the capture buffer without parsing the source string again? 

You are right. Your solution is actually more efficient. I never used to use this function but I'll keep in mind now.

Thanks for the heads up. 🙂

 

 

PrudhviB
Obsidian | Level 7
Just Amazing this worked like charm and not only that i learn something new with your detailed explanation @ChrisNZ
if there is a reference page where i can get to learn this please do share.
ChrisNZ
Tourmaline | Level 20

> a reference page

This syntax is called a regular expression and there a tons of didactic resources online. That's how I learnt. 🙂

 

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 683 views
  • 2 likes
  • 3 in conversation