Hi,
I have address filed like
12345 BROADWAY APT. B
or
12345 BROADWAY SUITE. B
or
12345 BROADWAY #B
I would like separte this filed into two variables, which are the address variable (12345 BROADWAY), and apt/suit number variable(APT. B, etc). I understand that I can use scan function to separte by '#' but how I can separte by word 'APT' or 'SUIT'? Any idea would be helpful!
var1=scan(addr, 1, '#');
var2=scan(addr, 2, '#');
Use a regular expression. Find a valid pattern here:
http://www.regexlib.com/Search.aspx?k=street+address&c=-1&m=-1&ps=20
That is the simplest and most powerful way of handling it.Split the address into groups and then just assign the groups to the correct variable.
I agree with @AlanC.
Regular expressions are a must for the kind of work you are doing. The learning curve is very steep though.
Here is stg to get you started:
data HAVE;
ADD=' 2345 BROADWAY APT. B '; output;
ADD='12345 BROADWAY SUITE. B '; output;
ADD='12345 BROADWAY #B '; output;
run;
data WANT;
set HAVE;
ADD1=prxchange('s/(.*?)(APT|SUITE|#)\.?\s*(\w+)\s+/$1/i',1,ADD);
ADD2=prxchange('s/(.*?)(APT|SUITE|#)\.?\s*(\w+)\s+/$3/i',1,ADD);
run;
ADD | ADD1 | ADD2 |
---|---|---|
2345 BROADWAY APT. B | 2345 BROADWAY | B |
12345 BROADWAY SUITE. B | 12345 BROADWAY | B |
12345 BROADWAY #B | 12345 BROADWAY | B |
Explanation of the expression:
s/ we are going to substitute (rather than just match)
(.*?) first group. we match any characters (? means lazily: we limit the extend of the match as we don't want to prevent other matches)
(APT|SUITE|#) second group: match any of these strings
\.? optionally match a dot
\s* optionally match any number of spaces
(\w+) third group: match at least one word character (that's the unit number)
\s+ match remaining spaces
/$1 replace all we've matched with the first group (i.e. the address proper)
/i make the logic case insensitive
I know looks daunting, but once you get there, you will be rewarded!
The best way, IMO, to learn regex is to use RegexBuddy.
Worth every penny and then some.
Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.
Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.