Hi,
I have address filed like
12345 BROADWAY APT. B
or
12345 BROADWAY SUITE. B
or
12345 BROADWAY #B
I would like separte this filed into two variables, which are the address variable (12345 BROADWAY), and apt/suit number variable(APT. B, etc). I understand that I can use scan function to separte by '#' but how I can separte by word 'APT' or 'SUIT'? Any idea would be helpful!
var1=scan(addr, 1, '#');
var2=scan(addr, 2, '#');
Use a regular expression. Find a valid pattern here:
http://www.regexlib.com/Search.aspx?k=street+address&c=-1&m=-1&ps=20
That is the simplest and most powerful way of handling it.Split the address into groups and then just assign the groups to the correct variable.
I agree with @AlanC.
Regular expressions are a must for the kind of work you are doing. The learning curve is very steep though.
Here is stg to get you started:
data HAVE;
ADD=' 2345 BROADWAY APT. B '; output;
ADD='12345 BROADWAY SUITE. B '; output;
ADD='12345 BROADWAY #B '; output;
run;
data WANT;
set HAVE;
ADD1=prxchange('s/(.*?)(APT|SUITE|#)\.?\s*(\w+)\s+/$1/i',1,ADD);
ADD2=prxchange('s/(.*?)(APT|SUITE|#)\.?\s*(\w+)\s+/$3/i',1,ADD);
run;
ADD | ADD1 | ADD2 |
---|---|---|
2345 BROADWAY APT. B | 2345 BROADWAY | B |
12345 BROADWAY SUITE. B | 12345 BROADWAY | B |
12345 BROADWAY #B | 12345 BROADWAY | B |
Explanation of the expression:
s/ we are going to substitute (rather than just match)
(.*?) first group. we match any characters (? means lazily: we limit the extend of the match as we don't want to prevent other matches)
(APT|SUITE|#) second group: match any of these strings
\.? optionally match a dot
\s* optionally match any number of spaces
(\w+) third group: match at least one word character (that's the unit number)
\s+ match remaining spaces
/$1 replace all we've matched with the first group (i.e. the address proper)
/i make the logic case insensitive
I know looks daunting, but once you get there, you will be rewarded!
The best way, IMO, to learn regex is to use RegexBuddy.
Worth every penny and then some.
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.