Solved
Contributor
Posts: 54

# 5 digit zip code

Hi all

One variable in my data set is zip code. It should be a 5 digit number, but of course, for some records it's text, for example, "000NA","BMT ","BROOK","DR000","ENUE","FLR","L2J1P","P1470","ST000","UNKNO". Also some records just have “9” or some other single digit number. So in the original program, I read it as character.

Is there a way to look through all the records and any time zip codes is not a 5 digit number, change the value to missing? Then I can treat it as a number.

Thanks

Gene

Accepted Solutions
Solution
‎10-08-2015 09:23 AM
SAS Employee
Posts: 170

## Re: 5 digit zip code

If you have 5 character alphabetic characters, can't you use the following?

if (anyalpha(ZIP)=0 and length(ZIP) = 5) then ZipNum=input(Zip,5.);

All Replies
Super User
Posts: 11,343

## Re: 5 digit zip code

Since you have already read the data then create a new variable

Data want;

set have;

if length(ZIP) = 5 then ZipNum=input(Zip,5.);

run;

May do what you want.

One issue I would consider would be if the data source treated zips such as 00009 as numeric and exported them as fewer than 5 digits.

Contributor
Posts: 54

## Re: 5 digit zip code

Well, two questions. I tried that and got

"NOTE: Invalid numeric data, ZIP='L2J1P' , at line 361 column 4."  So it seems to be having problems in that Zip is not numeric to begin with.

But you also wrote "since you already read the data". I can go back to the original data, reading it in with the input code,

input

@133 zip \$char5.

Are you suggesting a way to modify that?

Thanks

Solution
‎10-08-2015 09:23 AM
SAS Employee
Posts: 170

## Re: 5 digit zip code

If you have 5 character alphabetic characters, can't you use the following?

if (anyalpha(ZIP)=0 and length(ZIP) = 5) then ZipNum=input(Zip,5.);

Super User
Posts: 5,424

## Re: 5 digit zip code

Yes, of course, it has characters.

But you wish to do this operation anyway right?

So you can suppress the NOTE by using the colon modifier in the input statement.

Are you absolutely sure that chars are invalid data? If yes, you should report the erroneous records to the data source to have them fix them.

Data never sleeps
Posts: 3,156

## Re: 5 digit zip code

Why you need to treat it like a number? It is a zip code, you are not doing any arithmetic operation on it, so keep it Char. It will make your life a lot easier when dealing zip codes start with 0s, such as those from NJ.

```data test;
infile cards truncover;
input zip \$10.;
new_zip=ifc(prxmatch('/^\d{5}\$/',strip(zip))>0, zip, '');
cards;
asdhg
78659
0987689
89ui
26543
u
.
;```
Posts: 4,173

## Re: 5 digit zip code

Just variations to what @Haikuo posted.

``````data test;
infile cards truncover;
input zip \$10.;
length new_zip1 new_zip2 \$5;
new_zip=ifc(prxmatch('/^\d{5}\$/',strip(zip))>0, zip, '');
if lengthn(zip)=5 and notdigit(strip(zip))=0 then new_zip1=zip;
if prxmatch('/^\d{5} *\$/',zip)>0 then new_zip2=zip;
cards;
asdhg
78659
0987689
89ui
26543
u
.
;``````
Contributor
Posts: 54

## 5 digit zip code

Well, first, I want to delete any zip code that isn't 5 characters long. All zip codes must be 5 characters long (which can include leading 0). If it's character, that's fine, for this part, and this part looks easy to do.

Second, I want to delete any values that are not all numbers. All zip codes must be numerical, and cannot include any non numerical characters. This is the part that seems the most difficult. I don't want to just supress notes or messages about errors.

Thanks.

Gene

Posts: 4,173