SIgnificatif
Quartz | Level 8

Hi all, I would like your help with this exercise:

 

I would like to:
1) find the phrase 'left home' (in red) between two words (reporting: and information:, in blue) and assign the value left home to the variable property;

2) find the phrase 'left home8' (in purple) after the word Tasks: (in green) and assign it to the variable variant. The number after left home can be anything from 0 to 10, so I need one rule that covers all of them rather than a separate rule per number.
Here is the text:

some text here

reporting: by Mr. Jones

new 'left home' was selected

information:
the new home was found 

Tasks:
provide information about 'left home8'

Thank you for your support.

SIgnificatif
Quartz | Level 8

EDITED: Sorry, the forum doesn't allow proper formatting of colors, so the color hints are not working.

Oligolas
Barite | Level 11

Hi,

 

try this:

data have;
length a $500;
a="some text here

reporting: by Mr. Jones

new 'left home' was selected

information:
the new home was found 
Tasks:
provide information about 'left home8'
";
run;

data want;
   length replace1 property replace2 variant $500;
   set have;
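   /* Replace 'left home' with XXX when it sits between two words */
   pattern1="s/(\b\w+\b\s+\W)(left home)(\W?\s+\b\w+\b)/$1XXX$3/";
   replace1=prxchange(pattern1,-1,a);
   /* Same idea, but keep only capture group 2 (the matched phrase) */
   property=prxchange("s/(.*\b\w+\b\s+\W)(left home)(\W?\s+\b\w+\b.*)/$2/",-1,a);

   /* After Tasks:, replace left home plus a number from 0 to 10 with YYY */
   pattern2="s/(.*Tasks:.*)(left home(?:10|\d))(.*)/$1YYY$3/";
   replace2=prxchange(pattern2,-1,a);
   variant=prxchange("s/(.*Tasks:.*)(left home(10|\d))(\W?.*)/$2/",-1,a);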
run;
________________________

- Cheers -

SIgnificatif
Quartz | Level 8

Thank you very much for your answer. Could you explain how to use the exact words I mentioned? I need the search to run between these words (and, in the case of Tasks:, after the word):

reporting:

(and the ending word)

information:

and also the word after which the program should start scanning:

tasks:



Thank you for this.

Sincerely.

Oligolas
Barite | Level 11

Apologies, I had not understood that you were looking for terms between the words reporting: and information:.

Here is the updated code (please note that regular expressions are case-sensitive):

 

data have;
length a $500;
a="some text here

reporting: by Mr. Jones

new 'left home' was selected

information:
the new home was found 
Tasks:
provide information about 'left home8'
Tasks:
provide information about 'left home8'
";
run;

data want;
   length replace1 property replace2 variant $500;
   set have;

   pattern1=prxparse('s/(.*(?<=reporting:).*)(left home)(.*(?=information:).*)/$1XXX$3/');
   pattern2=prxparse('s/(.*(?<=reporting:).*)(left home)(.*(?=information:).*)/$2/');*# same as before only capturing the second group that is left home;
   /*
   (.*(?<=reporting:).*)  # Any text followed by reporting: (as positive lookbehind assertion) followed by any text
             (left home)  # The text to search: left home
   (.*(?=information:).*) # Any text followed by information: (as positive lookahead assertion) followed by any text 
   /$1XXX$3/              # the replacement string: first capturing group, followed by XXX instead of left home, followed by the rest of the text
   */
   replace1=prxchange(pattern1,-1,a);
   property=prxchange(pattern2,-1,a);

   pattern3=prxparse('s/(.*(?<=Tasks:).*)(left home(?:10|\d))(.*)/$1YYY$3/');
   pattern4=prxparse('s/(.*(?<=Tasks:).*)(left home(?:10|\d))(.*)/$2/');*# same as before only capturing the second group that is left home with specified digits;
   /*
   (.*(?<=Tasks:).*)      # Any text followed by Tasks: (as positive lookbehind assertion) followed by any text 
   (left home(?:10|\d))   # The text to search: left home, followed by 10 or any digit; (?:...) is non-capturing, so the trailing text stays in group 3
   (.*)                   # Any text
   /$1YYY$3/              # the replacement string: first capturing group, followed by YYY instead of left home with specified digits, followed by any text in the 3rd capturing group
   */
   replace2=prxchange(pattern3,-1,a);
   variant=prxchange(pattern4,-1,a);
run;
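
One caveat with the `/$2/` approach above: PRXCHANGE returns the input unchanged when the pattern does not match, so a row without a match ends up with the whole text in property or variant. Below is a minimal sketch of an alternative that leaves the variable blank instead, using PRXMATCH and PRXPOSN. The data set names have_demo and want_posn, the $20 lengths, the lazy `.*?` quantifiers, and the `i` modifier (so that tasks: is accepted as well as Tasks:) are assumptions for illustration, not part of the original answer.

```sas
/* Hypothetical one-line test rows; the second row contains no match */
data have_demo;
   length a $500;
   a = "reporting: by Mr. Jones new 'left home' was selected information: found Tasks: about 'left home8'"; output;
   a = "reporting: nothing relevant information: none Tasks: none"; output;
run;

data want_posn;
   set have_demo;
   length property variant $20;
   retain rx_property rx_variant;

   /* Compile each pattern once, on the first iteration only */
   if _n_ = 1 then do;
      rx_property = prxparse('/reporting:.*?(left home).*?information:/i');
      rx_variant  = prxparse('/Tasks:.*?(left home(?:10|\d))(?!\d)/i');
   end;

   /* PRXMATCH returns the match position, or 0 when nothing matches */
   if prxmatch(rx_property, a) then property = prxposn(rx_property, 1, a);
   if prxmatch(rx_variant,  a) then variant  = prxposn(rx_variant,  1, a);

   drop rx_:;
run;
```

For the first demo row this gives property=left home and variant=left home8; for the second row both variables stay blank.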

 

________________________

- Cheers -

SIgnificatif
Quartz | Level 8

Thank you, dear Oligolas, for your great response! 🙂 It's a good solution.
May I ask how to capture the number after home? For example home8, home9, etc., regardless of the space (home8 or home 8).
What is the best way to do it: with a regular expression (\s?) or with built-in SAS functions? What is your experience with this?
Sincerely.

Oligolas
Barite | Level 11

Hi,

 

You could define a new capturing group for the number:

data have;
length a $500;
a="some text here

reporting: by Mr. Jones

new 'left home' was selected

information:
the new home was found 
Tasks:
provide information about 'left home8'
";
run;

data want;
   length replace1 property replace2 variant number $500;
   set have;

   pattern1=prxparse('s/(.*(?<=reporting:).*)(left home)(.*(?=information:).*)/$1XXX$3/');
   pattern2=prxparse('s/(.*(?<=reporting:).*)(left home)(.*(?=information:).*)/$2/');*# same as before only capturing the second group that is left home;
   /*
   (.*(?<=reporting:).*)  # Any text followed by reporting: (as positive lookbehind assertion) followed by any text
             (left home)  # The text to search: left home
   (.*(?=information:).*) # Any text followed by information: (as positive lookahead assertion) followed by any text 
   /$1XXX$3/              # the replacement string: first capturing group, followed by XXX instead of left home, followed by the rest of the text
   */
   replace1=prxchange(pattern1,-1,a);
   property=prxchange(pattern2,-1,a);

   pattern3=prxparse('s/(.*(?<=Tasks:).*)(left home)(10|\d)(?!\d)(.*)/$1YYY$4/');
   pattern4=prxparse('s/(.*(?<=Tasks:).*)(left home)(10|\d)(?!\d)(.*)/$2$3/');*# same as before only keeping the second and third groups, that is left home with specified digits;
   pattern5=prxparse('s/(.*(?<=Tasks:).*)(left home)(10|\d)(?!\d)(.*)/$3/');*# capture the specified digits;
   /*
   (.*(?<=Tasks:).*)        # Any text followed by Tasks: (as positive lookbehind assertion) followed by any text 
   (left home)              # The text to search: left home
   (10|\d)(?!\d)            # 10 or any (single) digit, not followed by another digit
   (.*)                     # Any text
   /$1YYY$4/                # the replacement string: first capturing group, followed by YYY instead of left home with specified digits, followed by any text in the 4th capturing group
   */
   replace2=prxchange(pattern3,-1,a);
   variant=prxchange(pattern4,-1,a);
   number=prxchange(pattern5,-1,a);
run;
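
The patterns above expect the number to follow home directly. To also accept an optional space (home8 or home 8), one option is a `\s?` between the phrase and the number group. The following is only a sketch under assumptions: the data set names have_demo2 and want_number, the helper variable number_txt, and the choice to normalise variant to the form without a space are all hypothetical, not part of the original answer.

```sas
/* Hypothetical one-line test rows: no space, one space, and a number out of range */
data have_demo2;
   length a $200;
   a = "Tasks: provide information about 'left home8'";    output;
   a = "Tasks: provide information about 'left home 10'";  output;
   a = "Tasks: provide information about 'left home 100'"; output;
run;

data want_number;
   set have_demo2;
   length variant $20 number_txt $2;
   retain rx;

   /* \s? allows zero or one whitespace character before the number;
      (?!\d) rejects numbers that continue with another digit */
   if _n_ = 1 then
      rx = prxparse('/Tasks:.*?(left home)\s?(10|\d)(?!\d)/');

   if prxmatch(rx, a) then do;
      number_txt = prxposn(rx, 2, a);                    /* digits as text */
      variant    = cats(prxposn(rx, 1, a), number_txt);  /* phrase plus digits, no space */
      number     = input(number_txt, 2.);                /* digits as a numeric value */
   end;

   drop rx;
run;
```

With these demo rows, the first two give variant=left home8 (number 8) and variant=left home10 (number 10), while the third row matches nothing and leaves both variables missing.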

 

 

________________________

- Cheers -

SIgnificatif
Quartz | Level 8

Thank you for this, I appreciate your great support!
