data test;
input name $25.;
cards;
x., test
.x test
test x.
;
run;
data want;
set test;
a=prxchange('s/\b(x\.,|x\.|\x)\b//i', -1, name);
run;
Hello there!
I'm trying to remove from a string every part that looks like x, x. or x.,
so I've written the code above.
The resulting dataset is shown in the first table below.
It seems that SAS deleted only the x character, not x. or x.,
even though x\. and x\., come before x in the regular expression.
Does anyone know how this could be fixed?
Resulting dataset:
Obs | name | a
1 | x., test | ., test
2 | .x test | . test
3 | test x. | test .
Desired dataset:
Obs | name | a
1 | x., test | test
2 | .x test | . test
3 | test x. | test
Accepted Solutions
Hello @Ga1ath and welcome to the SAS Support Communities!
I see two issues:
- A word boundary (PRX metacharacter \b) cannot occur between a comma or period and the blank that follows it, because by definition it is "the position between a word and a space": it needs a word character (cf. PRX metacharacter \w) on one side, and neither the comma, the period nor the blank is a word character.
- The "\x" in your regular expression should read "|x".
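To see the first point in isolation, here is a minimal sketch using PRXMATCH, which returns the position of the first match or 0 if there is none. The test strings are made up for illustration and are not taken from the original data step.

```sas
data _null_;
  /* Period followed by a blank: both are non-word characters, */
  /* so \b cannot match there and PRXMATCH returns 0.          */
  p1 = prxmatch('/x\.\b/', 'test x. ');

  /* x followed by a blank: word character next to a non-word  */
  /* character, so \b matches and PRXMATCH returns 2.          */
  p2 = prxmatch('/x\b/', '.x test');

  put p1= p2=;
run;
```

The log should show p1=0 and p2=2, which is why the trailing \b in the original pattern only ever lets the bare x through.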
Try this regex instead:
a=prxchange('s/\b(x\.,|x\.|x\b)//i', -1, name);
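For reference, a complete version of the original program with this regex swapped in might look like the sketch below. It reuses the sample data from the question; the PROC PRINT step is only added here to display the result.

```sas
/* Sample data from the question */
data test;
  input name $25.;
  cards;
x., test
.x test
test x.
;
run;

data want;
  set test;
  /* Alternatives are tried left to right, so the longest form (x.,) */
  /* is listed first. The trailing \b now applies only to the bare x, */
  /* where the next character must be a non-word character.          */
  a = prxchange('s/\b(x\.,|x\.|x\b)//i', -1, name);
run;

proc print data=want;
run;
```

Tracing the pattern by hand, the three rows become " test", ". test" and "test ". The leading blank in the first row stays because the pattern does not consume blanks; wrapping the call in STRIP() is one option if that matters, although that goes beyond what was asked in the thread.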
Thank you for the reply!
All is clear except the second issue. You wrote: The "\x" in your regular expression should read "|x".
But I don't see \x in my original regular expression. Could you please explain what you meant?
@Ga1ath wrote:
All is clear except the second issue. You wrote: The "\x" in your regular expression should read "|x".
But I don't see \x in my original regular expression. Could you please explain what you meant?
I meant this:
@Ga1ath wrote:
data want; set test; a=prxchange('s/\b(x\.,|x\.\x)\b//i', -1, name); run;
whereas
@FreelanceReinh wrote:
a=prxchange('s/\b(x\.,|x\.|x\b)//i', -1, name);
Of course, yeah. I'm sorry for my inattention.
Is that last table supposed to be what you want to produce? Or is it the wrong output your current code is producing? If the latter then please provide the desired results for your example input.
It is the wrong output. I'll edit the post.