some people recommend the address cleaning services. ~ Perhaps that is, or should be, part of the Data Integration/Data Flux, or the "Q in ETL to the power of Q".
Although this is more of a language thing, it might be nice to have a proc that would help with address cleaning!
The immediate parsing of the numbers in the address is less of a problem than deciding for what that number is provided, e.g.floor/apartment/building/business-park-block number!
OK here is the first number [pre] number1 = input( scan( full_address,1, 'a' ), 32. );[/pre]
Just alter that 1 to N to get the N-th number out of the address.
I shall have to wait for SAS9.2 to see if the addition of 'a' modifier to SCAN() works the way that suits this challenge.
Here is something I've tested in SAS9.1.3, with the examples Cynthia provided[pre]data trial;
infile cards;
input full_address $char80. ;
number1 = input( substr( full_address, 1
, findc( trim(full_address) !!'0-'
, ' 1234567890', 'v' ) -1
)
, 32. );
list;cards;
0001050 Elm
00001050 Elm??? or
...[/pre]
It is not as attractive as the SAS9.2 solution, but for now, it delivers. ~~ as below
[pre]+FSVIEW: WORK.TRIAL (B)--------------------------------------+
| number1 full_address |
| |
| 1050 0001050 Elm |
| 1050 00001050 Elm??? or |
| 123 000123 NE 1st St??? Is the field justified|
| 123 123 Elm |
| 123 0123 Elm |
| 123 00123 Elm |
| 123 000123 Elm |
| 123 0000123 Elm |
| 123 00000123 Elm |
| 1050 1050 Elm |
| 1050 001050 Elm |
| 1050 01050 Elm |
| 1050 00001050 Elm |
| 1111 1111 Twisted Turny Long Avenue, NE |
| . |
| 150 0000150TH and |
| 1050 0001050 Elm if you don't stop it at the rig|
| 0 0000NE 1st ST (so you'd stop at the alpha c|
| 0 000SE 10th Ave |
| 0 000RR #3 |
| 0 0000SR 23 |
| 0 0000PO Box 61 |
| 123 123 Elm might have these variations: |
| . |
| 123 000000000000000000000000000123 Elm |
| 123 000000000000000000000000123 Elm St |
| 123 00000000000000000000000123 Elm St. |
| 123 00000000000000000000123 Elm Street |
| 1050 000000000000000000000000001050 Elm |
| 1234 0000001234 NE 1st Street, Extended |
| 1111 1111 Twisted Turny Long Avenue, NE |
| |
+-------------------------------------------------------------+[/pre]
missing values for "number1" were created for blank datalines.
PeterC
... View more