BookmarkSubscribeRSS Feed
marleeakerson
Calcite | Level 5

Hello, 

 

I am using this piece of code to convert a character variable to numeric: 

 

data want;
set have;
Zip_Code=input(Zip_Code, 10.);
run;

 

but the variable still comes out character. 

marleeakerson_0-1600878391770.png

 

 

Any ideas? 

12 REPLIES 12
PaigeMiller
Diamond | Level 26

You can't change a variable from character to numeric. Zip_Code is initially character, no matter what you do to it, it will be character. (I suspect there are errors or warnings in the LOG as well)

 

You need to create a NEW numeric variable (with a different name) as follows

 

Zip_Code1=input(Zip_Code, 10.);

 

--
Paige Miller
ballardw
Super User

Once a variable is defined as character you cannot change that. You would have to create a new variable such as:

data want;
   set have;
   Zip_num=input(Zip_Code, 10.);
run;

 

Now, why do you need a numeric value? Arithmetic with Zip codes is pretty meaningless.

Plus your use of a 10. informat makes me suspect that you may have some +4 Zips like 12345-6789 in your data. Those will not convert to numeric at all without some serious manipulation.

jimbarbour
Meteorite | Level 14

@ballardw has a good point.  If you convert 99999-9999 to numeric, you'll get a missing value and in essence lose your data.

 

If you really, truly need numeric, you could do this:

 

data	have;
	LENGTH	Zip_Code	$10.;
	INFILE	Datalines;
	INPUT	Zip_Code	$	&;
Datalines;
91024-4444
97805
91024-
99999-9999
91111
92222 2222
;
run;

data want;
   set have;
   Zip_num=input(COMPRESS(Zip_Code,,'kd'), 10.);
run;

 

 

The COMPRESS will take care of dashes or blanks.

 

Results:

jimbarbour_1-1600880785296.png

 

Jim

CurtisMackWSIPP
Lapis Lazuli | Level 10

SAS variables are of fixed type.  They cannot be changed.  You must create a new variable.

 

Try this approach:

 

data want(drop = Zip_Code_str);
set have(rename = (Zip_Code = Zip_Code_str);
Zip_Code=input(Zip_Code_str, 10.);
run;
CurtisMackWSIPP
Lapis Lazuli | Level 10

@Kurt_Bremser wrote:

A ZIP code is a code, not a number used for calculations. Keep it as character.


Unless you want to save 37 bytes per row.

Kurt_Bremser
Super User

We don't know the defined length of the variable, but if it was 10, the saving is only 2 per observation.

Empty space in character variables can be better dealt with with the COMPRESS option.

And how do you deal with codes that have leading zeroes in a consistent, failsafe way?

CurtisMackWSIPP
Lapis Lazuli | Level 10

My mistake on the math.  But ZIP codes only need numeric length 3, and they also only need 5 characters. So you do only save 2 bytes, but for different reasons.

But why would you need to worry about leading zeros except when outputing results.  They are just integers so they are just as unique without the leading zero.

ballardw
Super User

@CurtisMackWSIPP wrote:

My mistake on the math.  But ZIP codes only need numeric length 3, and they also only need 5 characters. So you do only save 2 bytes, but for different reasons.

But why would you need to worry about leading zeros except when outputing results.  They are just integers so they are just as unique without the leading zero.


Zip+4 are not "integers". the position of the required dash character means that you cannot just remove it and treat the remainder as "integer". if you have a zip of 00004 and another of 00004-1234 and attempt to treat them as integers then 4 and 41234 are not going to go to the same state for postal services.

Kurt_Bremser
Super User

And what about those extended zip codes @ballardw mentioned?

 

No, several hundred years of SAS experience here on the communities will tell you that codes are best kept as character.

CurtisMackWSIPP
Lapis Lazuli | Level 10

The poster didn't mention ZIP+4 Codes, and in my experience they are typically stored in a separate column. My user account does not reflect my 30 years of SAS experience.  I specializing in GIS have a lot of experience with ZIP Codes.  I unusually store them as characters strings, but know that they can work just fine as integers.  For some situations you need them as integers.  Particularly when you are working across software platforms. 

 

And then there is the case where you don't control the data source or are matching an output requirement.

ballardw
Super User

@CurtisMackWSIPP wrote:

The poster didn't mention ZIP+4 Codes, and in my experience they are typically stored in a separate column. My user account does not reflect my 30 years of SAS experience.  I specializing in GIS have a lot of experience with ZIP Codes.  I unusually store them as characters strings, but know that they can work just fine as integers.  For some situations you need them as integers.  Particularly when you are working across software platforms. 

 

And then there is the case where you don't control the data source or are matching an output requirement.


The reason I consider Zip+4 is the specific use of an F10. informat. If the values are only basic 5 digits then why pick an F10 informat? Might be habit or looking at the values and seeing the 10 columns a Zip+4 takes. 

 

If your data providers a clean enough to have the +4 in a separate column. I have to deal with "data" that will be in Excel and have a column that should be Zip codes in currency or date formats. At which point I am not terribly confident that going back to a general format actual has the original value.

 

I did ask why. If we get a response to match some other requirement then see what the actual match use has to be. I can join data of different types sometimes if all the details are present.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 12 replies
  • 1639 views
  • 3 likes
  • 6 in conversation