How to extract particular text from character string?

Reply
Frequent Contributor
Posts: 131

How to extract particular text from character string?

Hello!

I have a character variable for address which I'd like to use to create new variables for just the full address's

- street address

-city

-state

-zip

I am doing this using the substring function (i.e. state=substr(addr,length(addr)-1,2); )

however, the data is always entered so there is only space between the street address and city instead of, say, a comma.

Would anyone have suggestions for how I may go about creating the street address and city variables then?

The help is appreciated.

PROC Star
Posts: 8,007

Re: How to extract particular text from character string?

Posted in reply to Maisha_Huq

If your data are really that consistent then you might be able to use something like:

data have;

  informat street $100.;

  length city $30.;

  input street &;

  city=scan(substr(_infile_,length(street)+1,),1,',');

  state=scan(_infile_,-2,' ');

  zip=scan(_infile_,-1,' ');

  cards;

4552 east 32nd street, apt 4  new york, ny  12209-1234

23 nearby blvd.  pittsburgh, pa  15217

;

PROC Star
Posts: 8,007

Re: How to extract particular text from character string?

Posted in reply to Maisha_Huq

: Here is another thought. Depending upon the accuracy of your city names, you might be able to get around the various problems as follows:

data have;

  length address $150;

  input;

  address=_infile_;

  cards;

4552 east 32nd street, apt 4 albany, ny 12209-1234

23 nearby blvd. pittsburgh, pa 15217

12345 west   north street, apt 12 (3rd floor) Arlington Heights, IL 60004-6667

;

filename dummy temp;

data _null_;

   file dummy;

   set have;

   put address;

run;

data want (drop=citystate);

  informat street $100.;

  informat citystate $50.;

  length city $30;

  length state $3;

  informat zip $10.;

  infile dummy;

  input;

  _infile_=compbl(translate(_infile_,' ','09'x));

  zip=scan(_infile_,-1,' ');

  citystate=zipcity(substr(zip,1,5));

  city=scan(citystate,1,',');

  state=scan(citystate,2,',');

  street=substr(_infile_,1,find(upcase(_infile_),strip(upcase(city)))-1);

run;

Frequent Contributor
Posts: 131

Re: How to extract particular text from character string?

Hi Arthur,

I will try this!  Something I left out, though:  is there a way to extract a string (which we know is all uppercase) between two specific delimiters in SAS (i.e. my two delimiters are a double space and a comma)?

Super User
Super User
Posts: 9,026

Re: How to extract particular text from character string?

Posted in reply to Maisha_Huq

For fixed format strings you may take some time to read up on perl regular expressions.  It could save you some time writing all the necessary if thens.  http://www.cs.tut.fi/~jkorpela/perl/regexp.html

I fyou are still going to use string functions, then look at index().  So if I had:

001  ABC, 12

And I do substr(string,index(string,"  ")+2,index(string,",") - (index(string,"  ")+2));

This means start taking characters from where I find two spaces (+2 so I don't get the spaces) up to where I find the comma (less two for the original spaces).

Ask a Question
Discussion stats
  • 4 replies
  • 3815 views
  • 6 likes
  • 3 in conversation