SIgnificatif
Quartz | Level 8

Hi all, I would like your help with this exercise:

 

I would like to:
1) find the phrase 'left home' (in red) between two words (reporting: and information:, in blue) and assign it to a variable: property = left home

2) find a phrase such as 'left home8' (in purple) after the word Tasks: (in green) and assign it to a variable: variant = left home8 (the number after 'left home' can be anything from 0 to 10, so I need a single rule rather than one rule per number)
Here is the text:

some text here

reporting: by Mr. Jones

new 'left home' was selected

information:
the new home was found 

Tasks:
provide information about 'left home8'

Thank you for your support.

SIgnificatif
Quartz | Level 8

EDITED: sorry, the forum doesn't allow proper formatting of colors, so the color hints are not working.

Oligolas
Barite | Level 11

Hi,

 

try this:

data have;
length a $500;
a="some text here

reporting: by Mr. Jones

new 'left home' was selected

information:
the new home was found 
Tasks:
provide information about 'left home8'
";
run;

data want;
   length replace1 property replace2 variant $500;
   set have;
   pattern1="s/(\b\w+\b\s+\W)(left home)(\W?\s+\b\w+\b)/$1XXX$3/";
   replace1=prxchange(pattern1,-1,a);
   property=prxchange("s/(.*\b\w+\b\s+\W)(left home)(\W?\s+\b\w+\b.*)/$2/",-1,a);

   pattern2="s/(.*Tasks:.*)(left home(?:10|\d))(.*)/$1YYY$3/";
   replace2=prxchange(pattern2,-1,a);
   variant=prxchange("s/(.*Tasks:.*)(left home(10|\d))(\W?.*)/$2/",-1,a);
run;
________________________

- Cheers -

SIgnificatif
Quartz | Level 8

Thank you very much for your answer. Could you clarify how to use the exact words I mentioned? I need the search to run between these words (and, in the case of Tasks:, after the word):

reporting:

(and the ending word)

information:

and also the word after which the program should start scanning:

tasks:



Thank you for this.

Sincerely.

Oligolas
Barite | Level 11

Apologies, I had misunderstood: you are looking for terms between the words reporting: and information:.

Here is the updated code (please note that regular expressions are case-sensitive):

 

data have;
length a $500;
a="some text here

reporting: by Mr. Jones

new 'left home' was selected

information:
the new home was found 
Tasks:
provide information about 'left home8'
Tasks:
provide information about 'left home8'
";
run;

data want;
   length replace1 property replace2 variant $500;
   set have;

   pattern1=prxparse('s/(.*(?<=reporting:).*)(left home)(.*(?=information:).*)/$1XXX$3/');
   pattern2=prxparse('s/(.*(?<=reporting:).*)(left home)(.*(?=information:).*)/$2/');*# same as before only capturing the second group that is left home;
   /*
   (.*(?<=reporting:).*)  # Any text followed by reporting: (as positive lookbehind assertion) followed by any text
             (left home)  # The text to search: left home
   (.*(?=information:).*) # Any text followed by information: (as positive lookahead assertion) followed by any text 
   /$1XXX$3/              # the replacement string: first capturing group, followed by XXX instead of left home, followed by the rest of the text
   */
   replace1=prxchange(pattern1,-1,a);
   property=prxchange(pattern2,-1,a);

   pattern3=prxparse('s/(.*(?<=Tasks:).*)(left home(?:10|\d))(.*)/$1YYY$3/');
   pattern4=prxparse('s/(.*(?<=Tasks:).*)(left home(?:10|\d))(.*)/$2/');*# same as before, but returning only the second group: left home with the specified digits;
   /*
   (.*(?<=Tasks:).*)      # Any text followed by Tasks: (as positive lookbehind assertion) followed by any text
   (left home(?:10|\d))   # The text to search: left home, followed by 10 or any single digit. (?:...) is non-capturing, so the trailing text remains group 3
   (.*)                   # Any text
   /$1YYY$3/              # the replacement string: first capturing group, followed by YYY instead of left home with specified digits, followed by any text in the 3rd capturing group
   */
   replace2=prxchange(pattern3,-1,a);
   variant=prxchange(pattern4,-1,a);
run;

 

________________________

- Cheers -
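
One caveat with the PRXCHANGE approach above: when the pattern does not match, PRXCHANGE returns the source string unchanged, so property and variant would hold the whole text rather than a blank. Below is a minimal sketch of an alternative, assuming the same have data set and variable a. It uses PRXMATCH to test for a match and PRXPOSN to read the capture group. The i modifier is an addition that is not in the code above; it makes the match case-insensitive so that, for example, tasks: and Tasks: are both accepted. The names want2, rx_property and rx_variant are hypothetical.

```sas
data want2;
   length property variant $500;
   set have;
   /* compile each pattern once, on the first observation */
   if _n_=1 then do;
      rx_property=prxparse('/reporting:.*?(left home).*?information:/i');
      rx_variant =prxparse('/Tasks:.*?(left home(?:10|\d))(?!\d)/i');
   end;
   retain rx_property rx_variant;
   /* PRXMATCH returns the position of the match, or 0 when nothing matches */
   if prxmatch(rx_property,a) then property=prxposn(rx_property,1,a);
   if prxmatch(rx_variant,a)  then variant =prxposn(rx_variant,1,a);
   drop rx_property rx_variant;
run;
```

With this variant, observations without a match keep property and variant blank, which is usually easier to filter on than a copy of the full text.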

SIgnificatif
Quartz | Level 8

Thank you, dear Oligolas, for your great response! 🙂 It's a good solution.
Could you clarify how to capture the number after home? For example home8, home9, etc., with or without a space (home8 or home 8).
What is the best way to do it: with a regular expression (\s?) or with built-in SAS functions? What is your experience with this?
Sincerely.

Oligolas
Barite | Level 11

Hi,

 

you could define a new capturing group for the numbers:

data have;
length a $500;
a="some text here

reporting: by Mr. Jones

new 'left home' was selected

information:
the new home was found 
Tasks:
provide information about 'left home8'
";
run;

data want;
   length replace1 property replace2 variant $500;
   set have;

   pattern1=prxparse('s/(.*(?<=reporting:).*)(left home)(.*(?=information:).*)/$1XXX$3/');
   pattern2=prxparse('s/(.*(?<=reporting:).*)(left home)(.*(?=information:).*)/$2/');*# same as before only capturing the second group that is left home;
   /*
   (.*(?<=reporting:).*)  # Any text followed by reporting: (as positive lookbehind assertion) followed by any text
             (left home)  # The text to search: left home
   (.*(?=information:).*) # Any text followed by information: (as positive lookahead assertion) followed by any text 
   /$1XXX$3/              # the replacement string: first capturing group, followed by XXX instead of left home, followed by the rest of the text
   */
   replace1=prxchange(pattern1,-1,a);
   property=prxchange(pattern2,-1,a);

   pattern3=prxparse('s/(.*(?<=Tasks:).*)(left home)(10|\d)(?!\d)(.*)/$1YYY$4/');
   pattern4=prxparse('s/(.*(?<=Tasks:).*)(left home)(10|\d)(?!\d)(.*)/$2$3/');*# same as before, but returning groups 2 and 3: left home with the specified digits;
   pattern5=prxparse('s/(.*(?<=Tasks:).*)(left home)(10|\d)(?!\d)(.*)/$3/');*# capture the specified digits;
   /*
   (.*(?<=Tasks:).*)        # Any text followed by Tasks: (as positive lookbehind assertion) followed by any text
   (left home)              # The text to search: left home (2nd capturing group)
   (10|\d)(?!\d)            # 10 or any single digit (3rd capturing group), not followed by another digit
   (.*)                     # Any text (4th capturing group)
   /$1YYY$4/                # the replacement string: first capturing group, followed by YYY instead of left home with specified digits, followed by any text in the 4th capturing group
   */
   replace2=prxchange(pattern3,-1,a);
   variant=prxchange(pattern4,-1,a);
   number=prxchange(pattern5,-1,a);
run;

 

 

________________________

- Cheers -
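
The follow-up question also asks about an optional space between home and the number (home8 or home 8), which the patterns above do not allow. Below is a minimal sketch, assuming the same have data set and variable a. Here \s? accepts zero or one whitespace character, PRXPOSN reads the captured text, and INPUT converts the digits to a numeric value. The names want3, rx and number_n are hypothetical.

```sas
data want3;
   length variant $20;
   set have;
   /* compile the pattern once, on the first observation */
   if _n_=1 then rx=prxparse('/Tasks:.*?(left home\s?(10|\d))(?!\d)/');
   retain rx;
   if prxmatch(rx,a) then do;
      variant =prxposn(rx,1,a);             /* whole phrase, e.g. left home8 or left home 8 */
      number_n=input(prxposn(rx,2,a),8.);   /* the captured digits, converted to a number */
   end;
   drop rx;
run;
```

Keeping the optional space inside the regular expression means a single rule covers both spellings, and no separate COMPRESS or SCAN step is needed before the digits are read.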

SIgnificatif
Quartz | Level 8

Thank you for this, appreciate your great support !
