I'm converting some code from Python to SAS and I'm stuck at one point: regular expressions. Can someone please confirm that my assumption is correct?
I think this line:
df['VALUE'] = df['VALUE'].replace(r'([a-z,A-Z])\w+',0)
basically removes all character values and replaces them with 0?
Is this interpretation correct?
I can't run the program on the file; otherwise I'd test it by running each program and comparing the results.
My Python is basic, but as far as I know the syntax is:
NEW_STR = re.sub(REGEX, REPLACEMENT, OLD_STR)
1. You need the re. call (what's the value of df['VALUE'] ?) for the RegEx parser to be used
2. You need 3 parameters
3. I don't see how numbers are allowed
Regardless, the regular expression here matches:
- one lower case letter or comma or upper case letter
- followed by one or more word characters (letters, digits and underscore)
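To check this reading, here is a minimal sketch using only Python's standard `re` module. It illustrates the pattern itself, not the pandas call, and assumes every match should be replaced. The sample strings and the helper name `replace_matches` are made up for illustration.

```python
import re

# The pattern from the question: one letter or comma, then one or more word characters.
PATTERN = r'([a-z,A-Z])\w+'

def replace_matches(text):
    """Replace every match of PATTERN in text with '0'."""
    return re.sub(PATTERN, '0', text)

# Hypothetical sample values, chosen to exercise each part of the pattern.
samples = ['abc', 'a', ',ab', '123', 'a9 b']
for s in samples:
    print(repr(s), '->', repr(replace_matches(s)))
```

A single letter such as 'a' is left alone because `\w+` needs at least one more character after the first one, and a purely numeric value such as '123' never matches.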
Are these pandas dataframes? If so I would do it like this:
import re
...
df['VAL'] = df['VAL'].apply(lambda x: re.sub('[A-Z]+', '0', x, flags=re.I))
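Note that `[A-Z]+` with `re.I` is not the same pattern as the one in the question, so the two can give different results. The sketch below compares them on a few made-up strings using plain `re.sub` rather than a DataFrame; the function names are hypothetical.

```python
import re

def original_pattern(text):
    """Pattern from the question: letter or comma, then one or more word characters."""
    return re.sub(r'([a-z,A-Z])\w+', '0', text)

def suggested_pattern(text):
    """Pattern from this reply: any run of one or more letters, case-insensitive."""
    return re.sub('[A-Z]+', '0', text, flags=re.I)

# Hypothetical sample values.
for s in ['a', 'a9', 'abc_1', '12']:
    print(repr(s), 'original:', repr(original_pattern(s)),
          'suggested:', repr(suggested_pattern(s)))
```

The suggested pattern replaces a lone letter and leaves trailing digits and underscores in place, while the original pattern skips a lone letter and swallows the digits and underscores that follow a letter.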
It is pandas, but I'm going in the opposite direction, from pandas/Python to SAS code. The data is too big to be handled well in Python.
If I read this RegEx correctly then it needs at least two characters to match; the first character needs to be a letter, the 2nd to n can be alphanumeric or an underscore.
data sample;
infile datalines truncover;
input source $char10.;
format target $char10.;
target=prxchange('s/[a-z,A-Z]\w+/0/i',1,source);
datalines;
Abc9xy
a
a9
_a
a_
_a_
123_a
123_ab
123a_
a123a_
X_123a_
X_123a_ bb
123_abc de
;
run;
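For comparison, here is a rough Python sketch of the same test using the standard `re` module. It assumes that the second argument of `prxchange` is the number of replacements (1 replaces only the first match, -1 replaces all of them) and maps that to the `count` argument of `re.sub`. The function names are hypothetical, and the trailing blanks SAS pads onto `$char10.` values are not modelled.

```python
import re

# Same pattern as in the data step, with the case-insensitive flag.
PATTERN = re.compile(r'[a-z,A-Z]\w+', re.IGNORECASE)

SAMPLES = [
    'Abc9xy', 'a', 'a9', '_a', 'a_', '_a_', '123_a', '123_ab',
    '123a_', 'a123a_', 'X_123a_', 'X_123a_ bb', '123_abc de',
]

def replace_first(text):
    """Replace only the first match, like a replacement count of 1."""
    return PATTERN.sub('0', text, count=1)

def replace_all(text):
    """Replace every match, like a replacement count of -1."""
    return PATTERN.sub('0', text)

for s in SAMPLES:
    print(f'{s!r:14} first={replace_first(s)!r:12} all={replace_all(s)!r}')
```

The two variants only differ on the last two samples, where a second match follows the blank. If the goal is to mirror a Python replacement that changes every match, the count of 1 in the data step may be worth revisiting.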
> If I read this RegEx correctly then it needs at least two characters to match; the first character needs to be a letter, the 2nd to n can be alphanumeric or an underscore.
You mean:
If I read this RegEx correctly then it needs at least two characters to match; the first character needs to be a letter or a comma, the 2nd one can be alphanumeric or an underscore.
True, missed the comma.