I have a variable and would like to censor any addresses it contains.
For example:
data _null_;
x = prxchange("s/(\w+) \b(STREET)\b/*LOCATION REMOVED*/",-1, 'I WAS WALKING ON QUEEN STREET IN THE MORNING');
put x=;
run;
But I also want to create exceptions, so that prxchange ignores strings like 'THE STREET', as in 'I WAS DRIVING ON THE STREET'.
Is it possible to do this?
Thanks in advance
You need to use a negative look-behind assertion. Placed right after (\w+), (?<!THE) rejects the match whenever the word before STREET ends in THE.
data _null_;
x = prxchange("s/(\w+)(?<!THE) \b(STREET)\b/*LOCATION REMOVED*/",-1, 'I WAS WALKING IN THE STREET IN THE MORNING');
put x=;
x = prxchange("s/(\w+)(?<!THE) \b(STREET)\b/*LOCATION REMOVED*/",-1, 'I WAS WALKING ON QUEEN STREET IN THE MORNING');
put x=;
run;
x=I WAS WALKING IN THE STREET IN THE MORNING
x=I WAS WALKING ON *LOCATION REMOVED* IN THE MORNING
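One caveat with the pattern above: because the look-behind only checks the three characters before the space, it also skips street names that merely end in THE, such as BLYTHE STREET. A minimal sketch of an alternative, assuming you only want to skip the whole word THE, is a negative look-ahead anchored at a word boundary. The sample sentences and the variable names text and rx are made up for illustration.

```sas
data _null_;
  length text x $200;
  /* \b(?!THE\b) rejects only the whole word THE, not words that end in THE */
  rx = prxparse("s/\b(?!THE\b)\w+ STREET\b/*LOCATION REMOVED*/");
  do text = 'I WAS WALKING IN THE STREET',
            'I LIVE ON BLYTHE STREET';
    x = prxchange(rx, -1, text);  /* -1 replaces every match */
    put x=;
  end;
run;
```

With this version the first sentence should come back unchanged and BLYTHE STREET in the second should be censored.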
@eyp500 wrote:
Is it possible to do this?
Better question - how accurate do you need it to be? If you miss a few will it matter?
Ideally, we can't afford to miss any actual street names. We can over-censor without risking showing actual addresses, but the end user would prefer unnecessary censoring to be minimised. We do have a list of phrases we know are safe to leave alone, such as 'street lamp', 'street light', 'residential street'.
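A safe list like that could be folded into a single pattern with look-ahead assertions. The following is only a sketch under assumptions: the word lists (THE and RESIDENTIAL before STREET, LAMP and LIGHT after it) and the sample sentences are illustrative, not a complete list.

```sas
data _null_;
  length text x $200;
  /* Illustrative safe list: skip THE or RESIDENTIAL before STREET,  */
  /* and skip any STREET that is followed by LAMP or LIGHT.          */
  /* The trailing i makes the match case-insensitive.                */
  rx = prxparse("s/\b(?!(?:THE|RESIDENTIAL)\b)\w+ STREET\b(?! (?:LAMP|LIGHT)\b)/*LOCATION REMOVED*/i");
  do text = 'THE STREET LAMP ON QUEEN STREET WAS BROKEN',
            'IT IS A QUIET RESIDENTIAL STREET';
    x = prxchange(rx, -1, text);  /* -1 replaces every match */
    put x=;
  end;
run;
```

Here only QUEEN STREET should be replaced. Note the trade-off: a real address followed by a safe word (for example 'QUEEN STREET LIGHT') would be left uncensored, so each exception needs checking against how much missed censoring you can tolerate.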
You would be more likely to achieve this if the street names were part of a finite set (like street names within a city). When you found one of those, you could confirm that it is used as a street name by its context.
Thank you so much! This works perfectly