I've got numeric ZIP code values of various lengths. I used the Z format in the example below. It worked for the value that was 2 digits, but I got weird results for the ones longer than 5 digits. Is there a better way?
00023 |
778E4 |
001E7 |
The first question is: what do you want as the result? Should the longer values be rounded:
77778
10000
Or should they be truncated:
77777
99999
Assuming they should be truncated, you would need to change the values before applying a format (such as z5.):
data want;
  set sample;
  if num_zip > 99999 then do until (num_zip <= 99999);
    *Yes, there are more compact ways to do this;
    num_zip = int(num_zip / 10);
  end;
  zipcode = put(num_zip, z5.);
run;
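As one possible compact alternative to the loop above, here is a minimal sketch that takes the first five characters of the left-aligned digit string. It assumes num_zip holds non-negative integers of at most 12 digits; the width 12 and the explicit length statement are choices made for this illustration, not part of the original answer.

```sas
data want;
  set sample;
  length zipcode $5;
  * Assumption: num_zip is a non-negative integer with at most 12 digits;
  if num_zip > 99999 then
    zipcode = substr(put(num_zip, 12. -L), 1, 5);  * keep the leading five digits;
  else
    zipcode = put(num_zip, z5.);                   * pad short values with zeros;
run;
```

Unlike the loop, this leaves num_zip itself unchanged and only derives the character variable.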
Several of the SAS formats will attempt to yield "something" that comes close to representing a value when the width doesn't allow the full value to be displayed. An E in a displayed numeric value means scientific notation has been applied. So when you attempt to fit a value with more than 5 digits, such as 7777777, into 5 characters, you get things like 778E4, because that is about as close as you can get to the 7 digits using 5 characters. You would get something similar with small values like 0.0000000001234, only the exponent (the part after the E) would be negative.
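A small sketch to see this behavior in the log, writing the same value with two different widths. The variable name x is just for illustration; the 778E4 result is the one reported in this thread.

```sas
data _null_;
  x = 7777777;
  put x z5.;  * too narrow for 7 digits: the thread reports 778E4 here;
  put x z7.;  * wide enough: all seven digits are written;
run;
```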
Question: if you only want 5-character ZIPs, why did you bother to read in more than 5 digits? You won't have the problem with:
data sample;
  input num_zip 5.;
  format num_zip z5.;
datalines;
23
7777777
999999999
;
run;
Personally, I never read ZIP codes as numeric, because modern ZIP codes may be 10 characters long, ddddd-dddd, with a dash in the middle (and I have them). Those would be read incorrectly with a numeric informat.
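A minimal sketch of the character approach, assuming the source data already carries full ZIP codes with their leading zeros. The data set name sample_char and the variables zip_raw and zip5 are made up for this example.

```sas
data sample_char;
  length zip_raw $10 zip5 $5;
  * Character input keeps leading zeros and the dash;
  input zip_raw $;
  zip5 = substr(zip_raw, 1, 5);  * first five characters only;
datalines;
00023
77777-7777
99999
;
run;
```

If the source stores short values without leading zeros (23 rather than 00023), this sketch alone would not pad them.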
Even if they had been imported as character, there would still be a problem, since we need different methods to convert them to five-digit ZIPs depending on the length of the original value.
Hi @Batman, when you use the z. format to display a number, it works like this: 1) the number is padded on the left with zeros up to the width of the format; 2) the width is the number between the z and the dot, so z5. means five characters: a 2-digit number gets three leading zeros and a 4-digit number gets one; and 3) if you use z5. for a number longer than 5 digits, it cannot be displayed in full. So to solve your problem, you need a width larger than 5 in the z. format, for example z10., which gives more room to display your number. The code is as follows:
data sample;
  /* width 9 so that the 9-digit value is read in full (8. would cut it off) */
  input num_zip 9.;
  format num_zip 9.;
datalines;
23
7777777
999999999
;
run;

data convert;
  set sample;
  zipcode = put(num_zip, z10.);
run;

proc print data=convert noobs;
  var zipcode;
run;
US Zip codes? Those are either 5 digits or 5+4 digits. So you can probably just test if the value is larger than 99,999 to tell which type you have.
data sample;
  input num_zip;
  if num_zip > 99999 then zip = put(int(num_zip/10000), Z5.);
  else zip = put(num_zip, Z5.);
  put (_all_) (=);
datalines;
23
7777777
999999999
;
The log shows:
num_zip=23 zip=00023
num_zip=7777777 zip=00777
num_zip=999999999 zip=99999