I'm converting some code from Python to SAS and I'm stuck at one point: regular expressions. Can someone please confirm that my assumption is correct?
I think this line:
df['VALUE'] = df['VALUE'].replace(r'([a-z,A-Z])\w+',0)
basically removes all character values and replaces them with 0?
Is this interpretation correct?
I can't run the program on the file; otherwise I'd test it by running each program and comparing the results.
My Python is basic, but as far as I know the syntax is:
NEW_STR = re.sub(REGEX, REPLACEMENT, OLD_STR)
1. You need the re. call (what's the value of df['VALUE'] ?) for the RegEx parser to be used
2. You need 3 parameters
3. I don't see how numbers are allowed
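For reference, here is a minimal sketch of regex substitution with Python's standard-library re module. It exposes re.sub(pattern, repl, string) and has no function named replace. The sample string and pattern are made up for illustration and are not from the original program.

```python
import re

# re.sub takes the pattern first, then the replacement, then the input string.
old_str = "a1b22"  # made-up sample input
new_str = re.sub(r"\d+", "#", old_str)
print(new_str)  # a#b#

# The re module has no function called replace.
print(hasattr(re, "replace"))  # False
```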
Regardless, the regular expression here matches:
- one lower case letter or comma or upper case letter
- followed by word characters (this includes digits and underscore)
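The two bullets above can be checked with a short sketch that applies the pattern through re.sub to a few made-up strings. The replacement is the string '0' here, which is an assumption, since the original line passes the integer 0 to pandas.

```python
import re

# Pattern copied from the original line.
pattern = r"([a-z,A-Z])\w+"

# Made-up sample values, not taken from the real data.
samples = ["Abc9xy", "a", "a9", ",ab", "123_ab"]
results = {s: re.sub(pattern, "0", s) for s in samples}

for s, r in results.items():
    print(f"{s!r} -> {r!r}")
# 'Abc9xy' -> '0'
# 'a' -> 'a'        (a single letter has nothing for \w+ to match)
# 'a9' -> '0'
# ',ab' -> '0'      (the comma is part of the character class)
# '123_ab' -> '123_0'
```

This only shows what the pattern matches on plain strings; it says nothing about how pandas' Series.replace treats the pattern.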
Are these pandas dataframes? If so I would do it like this:
import re
...
df['VAL'] = df['VAL'].apply(lambda x: re.sub('[A-Z]+', '0', x, flags=re.I))
It is pandas, but I'm going in the opposite direction, from pandas/Python to SAS code. The data is too big to be handled well in Python.
If I read this RegEx correctly then it needs at least two characters to match; the first character needs to be a letter, the 2nd to n can be alphanumeric or an underscore.
data sample;
infile datalines truncover;
input source $char10.;
format target $char10.;
target=prxchange('s/[a-z,A-Z]\w+/0/i',1,source);
datalines;
Abc9xy
a
a9
_a
a_
_a_
123_a
123_ab
123a_
a123a_
X_123a_
X_123a_ bb
123_abc de
;
run;
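For comparison, here is a rough Python mirror of the SAS step above, as a sketch only. It assumes that count=1 in re.sub corresponds to the times argument of 1 in prxchange, and that flags=re.I corresponds to the /i modifier. The helper name first_match_to_zero is hypothetical, and the results have not been checked against actual SAS output.

```python
import re

def first_match_to_zero(source: str) -> str:
    """Replace only the first match with '0', ignoring case."""
    # count=1 replaces the first match only; re.I mirrors the /i modifier.
    return re.sub(r"[a-z,A-Z]\w+", "0", source, count=1, flags=re.I)

# Same test values as the datalines block.
datalines = [
    "Abc9xy", "a", "a9", "_a", "a_", "_a_", "123_a", "123_ab",
    "123a_", "a123a_", "X_123a_", "X_123a_ bb", "123_abc de",
]

for source in datalines:
    print(f"{source!r:14} {first_match_to_zero(source)!r}")
```

In Python, 'a', '_a' and '123_a' come back unchanged because no letter in them is followed by a word character, while 'X_123a_ bb' becomes '0 bb' since only the first match is replaced. SAS pads character variables with trailing blanks, so the SAS values may differ in length.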
> If I read this RegEx correctly then it needs at least two characters to match; the first character needs to be a letter, the 2nd to n can be alphanumeric or an underscore.
You mean:
If I read this RegEx correctly then it needs at least two characters to match; the first character needs to be a letter or a comma, the 2nd one can be alphanumeric or an underscore.
True, missed the comma.
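To illustrate the effect of the comma, here is a small sketch comparing the character class as written with a hypothetical letters-only variant. Whether the comma was intended in the original program is not known; the sample value is made up.

```python
import re

with_comma = r"[a-z,A-Z]\w+"    # class as written in the thread
letters_only = r"[a-zA-Z]\w+"   # hypothetical variant without the comma

sample = ",5"  # made-up value: a comma followed by a digit
out_with_comma = re.sub(with_comma, "0", sample)
out_letters_only = re.sub(letters_only, "0", sample)

print(out_with_comma)    # 0
print(out_letters_only)  # ,5
```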