Hi all, I would like your help with this exercise.
I would like to:
1) find the phrase 'left home' (in red) between two words (reporting: and information:, in blue) and assign it to a variable: property = left home
2) find the phrase 'left home8' (in purple) after the word Tasks: (in green) and assign it to a variable: variant = left home8 (the number after left home can be anything from 0 to 10, so I need to do that in one rule and not one rule per number)
Here is the text:
some text here
reporting: by Mr. Jones
new 'left home' was selected
information:
the new home was found
Tasks:
provide information about 'left home8'
Thank you for your support.
EDITED: sorry, the forum doesn't allow proper formatting of colors, so the color hints are not working.
Hi,
try this:
data have;
length a $500;
a="some text here
reporting: by Mr. Jones
new 'left home' was selected
information:
the new home was found
Tasks:
provide information about 'left home8'
";
run;
data want;
length replace1 property replace2 variant $500;
set have;
pattern1="s/(\b\w+\b\s+\W)(left home)(\W?\s+\b\w+\b)/$1XXX$3/";
replace1=prxchange(pattern1,-1,a);
property=prxchange("s/(.*\b\w+\b\s+\W)(left home)(\W?\s+\b\w+\b.*)/$2/",-1,a);
pattern2="s/(.*Tasks:.*)(left home(?:10|\d))(.*)/$1YYY$3/";
replace2=prxchange(pattern2,-1,a);
variant=prxchange("s/(.*Tasks:.*)(left home(10|\d))(\W?.*)/$2/",-1,a);
run;
- Cheers -
Thank you very much for your answer. Could you clarify how to use the exact words I mentioned? I need to search between these words (and after the word, in the case of Tasks:):
reporting:
(and the ending word)
information:
and also the word after which the program will scan for information?
tasks:
Thank you for this.
Sincerely.
Apologies, I misunderstood: you were looking for terms between the words reporting: and information:.
Here is the updated code (please note that regular expressions are case-sensitive):
data have;
length a $500;
a="some text here
reporting: by Mr. Jones
new 'left home' was selected
information:
the new home was found
Tasks:
provide information about 'left home8'
Tasks:
provide information about 'left home8'
";
run;
data want;
length replace1 property replace2 variant $500;
set have;
pattern1=prxparse('s/(.*(?<=reporting:).*)(left home)(.*(?=information:).*)/$1XXX$3/');
pattern2=prxparse('s/(.*(?<=reporting:).*)(left home)(.*(?=information:).*)/$2/');*# same as before only capturing the second group that is left home;
/*
(.*(?<=reporting:).*) # Any text followed by reporting: (as positive lookbehind assertion) followed by any text
(left home) # The text to search: left home
(.*(?=information:).*) # Any text followed by information: (as positive lookahead assertion) followed by any text
/$1XXX$3/ # the replacement string: first capturing group, followed by XXX instead of left home, followed by the rest of the text
*/
replace1=prxchange(pattern1,-1,a);
property=prxchange(pattern2,-1,a);
pattern3=prxparse('s/(.*(?<=Tasks:).*)(left home(?:10|\d))(.*)/$1YYY$3/');
pattern4=prxparse('s/(.*(?<=Tasks:).*)(left home(?:10|\d))(.*)/$2/');*# same as before only capturing the second group that is left home with specified digits;
/*
(.*(?<=Tasks:).*) # Any text followed by Tasks: (as positive lookbehind assertion) followed by any text
(left home(?:10|\d)) # The text to search: left home, followed by 10 or any digit (the digits are in a non-capturing group, so the trailing text stays in group 3)
(.*) # Any text
/$1YYY$3/ # the replacement string: first capturing group, followed by YYY instead of left home with specified digits, followed by any text in the 3rd capturing group
*/
*/
replace2=prxchange(pattern3,-1,a);
variant=prxchange(pattern4,-1,a);
run;
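A note on the case sensitivity mentioned above: if the keywords can show up as reporting:, Reporting: or TASKS:, the /i modifier makes the pattern case-insensitive. The sketch below is only an illustration under that assumption; the data set name demo and the variable rx are made up for the example. It uses PRXMATCH with PRXPOSN rather than PRXCHANGE, so property stays blank when nothing matches (with the s/.../$2/ approach, a string that does not match is returned unchanged).

```sas
data demo;
   length a $500 property $50;
   /* Sample text on purpose with mixed-case keywords */
   a = catx(' ', "some text here", "Reporting: by Mr. Jones",
            "new 'left home' was selected", "INFORMATION:",
            "the new home was found");
   /* /i = ignore case; group 1 is the phrase between the two keywords */
   rx = prxparse('/reporting:.*(left home).*information:/i');
   if prxmatch(rx, a) then property = prxposn(rx, 1, a);
   else property = ' ';   /* no match: leave the variable blank */
   put property=;
run;
```

The same /i modifier can be appended to the substitution patterns above, for example 's/.../$2/i'.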
- Cheers -
Thank you, dear Oligolas, for your great response! 🙂 It's a good solution.
May I ask how to capture the number after home? For example home8, home9, etc., regardless of the space (home8 or home 8).
What is the best way to do it: with a regular expression (\s?) or with built-in SAS functions? What's your experience with this?
Sincerely.
Hi,
you could define a new capturing group for the numbers:
data have;
length a $500;
a="some text here
reporting: by Mr. Jones
new 'left home' was selected
information:
the new home was found
Tasks:
provide information about 'left home8'
";
run;
data want;
length replace1 property replace2 variant $500;
set have;
pattern1=prxparse('s/(.*(?<=reporting:).*)(left home)(.*(?=information:).*)/$1XXX$3/');
pattern2=prxparse('s/(.*(?<=reporting:).*)(left home)(.*(?=information:).*)/$2/');*# same as before only capturing the second group that is left home;
/*
(.*(?<=reporting:).*) # Any text followed by reporting: (as positive lookbehind assertion) followed by any text
(left home) # The text to search: left home
(.*(?=information:).*) # Any text followed by information: (as positive lookahead assertion) followed by any text
/$1XXX$3/ # the replacement string: first capturing group, followed by XXX instead of left home, followed by the rest of the text
*/
replace1=prxchange(pattern1,-1,a);
property=prxchange(pattern2,-1,a);
pattern3=prxparse('s/(.*(?<=Tasks:).*)(left home)(10|\d)(?!\d)(.*)/$1YYY$4/');
pattern4=prxparse('s/(.*(?<=Tasks:).*)(left home)(10|\d)(?!\d)(.*)/$2$3/');*# same as before only keeping the second and third groups that is left home with specified digits;
pattern5=prxparse('s/(.*(?<=Tasks:).*)(left home)(10|\d)(?!\d)(.*)/$3/');*# capture the specified digits;
/*
(.*(?<=Tasks:).*) # Any text followed by Tasks: (as positive lookbehind assertion) followed by any text
(left home) # The text to search: left home
(10|\d)(?!\d) # followed by 10 or any (single) digit, and not followed by another digit (negative lookahead assertion)
(.*) # Any text
/$1YYY$4/ # the replacement string: first capturing group, followed by YYY instead of left home with specified digits, followed by any text in the 4th capturing group
*/
replace2=prxchange(pattern3,-1,a);
variant=prxchange(pattern4,-1,a);
number=prxchange(pattern5,-1,a);
run;
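The patterns above expect the digits directly after home. To also accept the variant with a space (home 8), which was part of the question, one option is to put \s? in front of the digit group. This is a minimal sketch, not a tested production rule; the names demo2 and rx are invented for the example, and it assumes at most one whitespace character between home and the number.

```sas
data demo2;
   length a $500 variant $50 number $2;
   a = catx(' ', "Tasks:", "provide information about 'left home 8'");
   /* \s?     : zero or one whitespace character before the digits
      (10|\d) : 10 or a single digit
      (?!\d)  : not followed by another digit, so 11, 12, 100 ... are rejected */
   rx = prxparse('/Tasks:.*(left home\s?(10|\d)(?!\d))/');
   if prxmatch(rx, a) then do;
      variant = prxposn(rx, 1, a);   /* whole phrase including the number */
      number  = prxposn(rx, 2, a);   /* the digits only */
   end;
   put variant= number=;
run;
```

If there is no match, variant and number simply stay blank, which is easier to test for than a PRXCHANGE result that returns the whole input string.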
- Cheers -
Thank you for this, I appreciate your great support!