Help using Base SAS procedures

Parting numbers and letters in one cell from eachother

Reply
Contributor
Posts: 43

Parting numbers and letters in one cell from eachother

I've googled it, and looked through the forums, but with no luck.

I think that i did this once, but i simply can't make anything work right now!

I have a adress cell, but with streetname and number in one cell. I would like SAS to recognize the number in the cell, and remove it to a new cell. Or simply create two new colums, one with characters, and one with numbes.

Thanks, and merry Christmas!
Contributor
Posts: 43

Re: Parting numbers and letters in one cell from eachother

Posted in reply to TMorville
I solved it in VBA.

But still looking for a SAS code! Message was edited by: TMorville
SAS Super FREQ
Posts: 8,861

Re: Parting numbers and letters in one cell from eachother

Posted in reply to TMorville
Hi:
I would imagine that some combination of SCAN, SUBSTR, ANYDIGIT, ANYALPHA, etc, etc functions would be useful. But, it seems to me that quite a bit about your data needs to be explained. For example, are all the addresses VERY regular:

1234 Sesame St.
98765 Chocolate Way


where the number ALWAYS comes first??? Or, could you have more irregular addresses:

North 1234 Grouch Lane
98765-A 34th St, South


For the "regular" data, getting the first chunk, if it's always the number is going to be a simple SCAN. But, for the "irregular" data, do you want to see the 1234 and then "North Grouch Lane" as one string or do you want to see 1234, then one string for the prefix "North" and a second variable for street name of "Grouch Lane".

What about 98765-A?? Seems like an apartment or a multi-unit building -- would you want the "-A" to be part of the "98765" (in which case, you have a field that cannot be just a number). Or, what about 34th -- a function would find the number 34, but the 34 is part of the street designation.

You originally said you wanted "SAS to recognize the number in the cell, and remove it to a new cell. Or simply create two new colums, one with characters, and one with numbers." It seems to me that without more information about the data and what to do in irregular situations, this is a hard one to help with beyond pointing you in the general direction of the character functions for parsing, splitting and extracting.

cynthia
Super User
Posts: 10,018

Re: Parting numbers and letters in one cell from eachother

Posted in reply to Cynthia_sas
Hi.
Cynthia ,I think Function COMPRESS() is more powerful and useful than what you refer to.
[pre]
data temp;
infile datalines truncover;
input address $100.;
datalines;
1234 Sesame St.
98765 Chocolate Way
North 1234 Grouch Lane
98765-A 34th St, South
;
run;
data address;
set temp;
only_digit=compress(address,,'ap');
only_char_punctuation=compress(address,,'d');
run;
proc print;run;

[/pre]


Ksharp Message was edited by: Ksharp
Contributor
Posts: 43

Re: Parting numbers and letters in one cell from eachother

I tryed the compress func, and it works with a bit of fiddleing around.

The adresses are all messed up. Some with . others with , - and so forth.

It could look like

Milky Way 52, 5 t.h.
MilkyWay 52-5 th
Milky Way 5 th. 2 v

And so forth. But i solved it in an OK manner.
SAS Super FREQ
Posts: 8,861

Re: Parting numbers and letters in one cell from eachother

Posted in reply to TMorville
Yes, COMPRESS would work for simple addresses (such as in my first example), but would be especially problematic for this address:

98765-A 34th St, South


If you look at observation 4 in your output from PROC PRINT, you will see that the "converted" result for the above address is:
[pre]
only_ only_char_
Obs address digit punctuation

4 98765-A 34th St, South 98765 34 -A th St, South
[/pre]

And my guess is that "-A th St, South" instead of "34th St, South" would be an unacceptable result that requires a bit more thought, conditional logic and coding.

cynthia
Ask a Question
Discussion stats
  • 5 replies
  • 115 views
  • 0 likes
  • 3 in conversation