Hi all, I would like your help with this exercise.
I would like to:
1) Find the phrase 'left home' (in red) between the two words reporting: and information: (in blue), and assign it to a variable: property = left home.
2) Find the phrase 'left home8' (in purple) after the word Tasks: (in green), and assign it to a variable: variant = left0-10 (the number after 'left home' can be anything from 0 to 10, so I need to do this in one rule rather than ten).
Here is the text:
some text here
reporting: by Mr. Jones
new 'left home' was selected
information:
the new home was found
Tasks:
provide information about 'left home8'
Thank you for your support.
EDITED: Sorry, the forum doesn't allow proper formatting of colors, so the color hints are not working.
Hi,
try this:
data have;
length a $500;
a="some text here
reporting: by Mr. Jones
new 'left home' was selected
information:
the new home was found
Tasks:
provide information about 'left home8'
";
run;
data want;
length replace1 property replace2 variant $500;
set have;
pattern1="s/(\b\w+\b\s+\W)(left home)(\W?\s+\b\w+\b)/$1XXX$3/";
replace1=prxchange(pattern1,-1,a);
property=prxchange("s/(.*\b\w+\b\s+\W)(left home)(\W?\s+\b\w+\b.*)/$2/",-1,a);
pattern2="s/(.*Tasks:.*)(left home\d{1,2})(.*)/$1YYY$3/";
replace2=prxchange(pattern2,-1,a);
variant=prxchange("s/(.*Tasks:.*)(left home(10|\d))(\W?.*)/$2/",-1,a);
run;
- Cheers -
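One caveat with the substitution approach: PRXCHANGE returns the source string unchanged when the pattern does not match, so variant would then hold the whole text instead of a blank. A minimal sketch of a guarded alternative, assuming PRXMATCH and PRXPOSN are acceptable here; the variable name rx and the second test string are made up for illustration:

```sas
data _null_;
  length a $200 variant $50;
  /* Two test strings: the first contains the phrase, the second does not */
  do a = "Tasks: provide information about 'left home8'",
         "Tasks: provide information about 'right home8'";
    /* A constant pattern is compiled only once, even inside a loop */
    rx = prxparse('/Tasks:.*(left home(?:10|\d))/');
    /* Read capture group 1 only when the pattern actually matched */
    if prxmatch(rx, a) then variant = prxposn(rx, 1, a);
    else variant = ' ';
    put variant=;
  end;
run;
```

With this shape, a row without the phrase gets a blank variant rather than a copy of the input.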
Thank you very much for your answer. Could you clarify how to use the exact words I mentioned? I need to search between these words (and, in the case of Tasks:, after the word):
reporting: (the starting word)
information: (the ending word)
tasks: (the word after which the program will scan for information)
Thank you for this.
Sincerely.
Apologies, I had not understood that you were looking for terms between the words reporting: and information:.
Here is the updated code (please note that regular expressions are case sensitive):
data have;
length a $500;
a="some text here
reporting: by Mr. Jones
new 'left home' was selected
information:
the new home was found
Tasks:
provide information about 'left home8'
Tasks:
provide information about 'left home8'
";
run;
data want;
length replace1 property replace2 variant $500;
set have;
pattern1=prxparse('s/(.*(?<=reporting:).*)(left home)(.*(?=information:).*)/$1XXX$3/');
pattern2=prxparse('s/(.*(?<=reporting:).*)(left home)(.*(?=information:).*)/$2/');*# same as before only capturing the second group that is left home;
/*
(.*(?<=reporting:).*) # Any text followed by reporting: (as positive lookbehind assertion) followed by any text
(left home) # The text to search: left home
(.*(?=information:).*) # Any text followed by information: (as positive lookahead assertion) followed by any text
/$1XXX$3/ # the replacement string: first capturing group, followed by XXX instead of left home, followed by the rest of the text
*/
replace1=prxchange(pattern1,-1,a);
property=prxchange(pattern2,-1,a);
pattern3=prxparse('s/(.*(?<=Tasks:).*)(left home(?:10|\d))(.*)/$1YYY$3/');
pattern4=prxparse('s/(.*(?<=Tasks:).*)(left home(?:10|\d))(.*)/$2/');*# same pattern, but returns only the second group: left home with specified digits;
/*
(.*(?<=Tasks:).*) # Any text followed by Tasks: (as positive lookbehind assertion) followed by any text
(left home(?:10|\d)) # The text to search: left home, followed by 10 or any digit. (?:...) is non-capturing, so the trailing text stays in group 3
(.*) # Any text
/$1YYY$3/ # the replacement string: first capturing group, followed by YYY instead of left home with specified digits, followed by any text in the 3rd capturing group
*/
replace2=prxchange(pattern3,-1,a);
variant=prxchange(pattern4,-1,a);
run;
- Cheers -
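On the case-sensitivity note: if the marker may be typed as tasks: or Tasks:, the i modifier after the closing slash makes the match case-insensitive. A minimal sketch; the variable names rx_cs, rx_ci, found_cs and found_ci are made up for illustration:

```sas
data _null_;
  length a $200;
  /* The marker is written in lower case here */
  a = "tasks: provide information about 'left home8'";
  rx_cs = prxparse('/Tasks:.*left home(?:10|\d)/');   /* case sensitive   */
  rx_ci = prxparse('/Tasks:.*left home(?:10|\d)/i');  /* case insensitive */
  /* PRXMATCH returns the start position of the match, or 0 if none */
  found_cs = prxmatch(rx_cs, a);
  found_ci = prxmatch(rx_ci, a);
  put found_cs= found_ci=;
run;
```

Here found_cs is 0 because Tasks: with a capital T is absent, while found_ci is greater than 0.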
Thank you, dear Oligolas, for your great response! 🙂 It's a good solution.
May I ask how to capture the number after home? For example home8, home9, etc., regardless of the space (home8 or home 8).
What is the best way to do it: with a regular expression (\s?) or with built-in SAS functions? What is your experience with this?
Sincerely.
Hi,
You could define a new capturing group for the numbers:
data have;
length a $500;
a="some text here
reporting: by Mr. Jones
new 'left home' was selected
information:
the new home was found
Tasks:
provide information about 'left home8'
";
run;
data want;
length replace1 property replace2 variant $500;
set have;
pattern1=prxparse('s/(.*(?<=reporting:).*)(left home)(.*(?=information:).*)/$1XXX$3/');
pattern2=prxparse('s/(.*(?<=reporting:).*)(left home)(.*(?=information:).*)/$2/');*# same as before only capturing the second group that is left home;
/*
(.*(?<=reporting:).*) # Any text followed by reporting: (as positive lookbehind assertion) followed by any text
(left home) # The text to search: left home
(.*(?=information:).*) # Any text followed by information: (as positive lookahead assertion) followed by any text
/$1XXX$3/ # the replacement string: first capturing group, followed by XXX instead of left home, followed by the rest of the text
*/
replace1=prxchange(pattern1,-1,a);
property=prxchange(pattern2,-1,a);
pattern3=prxparse('s/(.*(?<=Tasks:).*)(left home)(10|\d)(?!\d)(.*)/$1YYY$4/');
pattern4=prxparse('s/(.*(?<=Tasks:).*)(left home)(10|\d)(?!\d)(.*)/$2$3/');*# same pattern, but returns groups 2 and 3: left home with specified digits;
pattern5=prxparse('s/(.*(?<=Tasks:).*)(left home)(10|\d)(?!\d)(.*)/$3/');*# returns only group 3: the specified digits;
/*
(.*(?<=Tasks:).*) # Any text followed by Tasks: (as positive lookbehind assertion) followed by any text
(left home) # The text to search: left home
(10|\d)(?!\d) # 10 or any (single) digit, not followed by another digit (negative lookahead assertion)
(.*) # Any text
/$1YYY$4/ # the replacement string: first capturing group, followed by YYY instead of left home with specified digits, followed by any text in the 4th capturing group
*/
replace2=prxchange(pattern3,-1,a);
variant=prxchange(pattern4,-1,a);
number=prxchange(pattern5,-1,a);
run;
- Cheers -
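The patterns above do not yet allow for the optional space that was asked about (home8 or home 8). A minimal sketch of one way to do it, assuming at most one whitespace character before the number: \s? is added before the digit group, and PRXPOSN reads the digits directly. The variable name rx and the extra test strings are made up for illustration:

```sas
data _null_;
  length a $200 variant $50 number $2;
  do a = "Tasks: provide information about 'left home8'",
         "Tasks: provide information about 'left home 8'",
         "Tasks: provide information about 'left home10'";
    /* \s? allows zero or one whitespace character before the digits */
    rx = prxparse('/Tasks:.*left home\s?(10|\d)(?!\d)/');
    if prxmatch(rx, a) then do;
      number  = prxposn(rx, 1, a);          /* capture group 1: the digits */
      variant = cats('left home', number);  /* normalized, without the space */
    end;
    else call missing(number, variant);
    put variant= number=;
  end;
run;
```

Whether a regular expression or functions such as FIND and SUBSTR is better is partly a matter of taste. The regular expression keeps the rule in one place, which fits the single-rule requirement from the original question.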
Thank you for this, appreciate your great support !