I've got numeric zip code values of various lengths. I used the z format in the example below. It worked for the value that was 2 digits long, but I got weird results for the ones longer than 5 digits. Is there a better way?
00023
778E4
001E7
The first question is: what do you want as the result? Should the longer values be rounded:
77778
10000
Or should they be truncated:
77777
99999
Assuming they should be truncated, you would need to change the values before applying a format (such as z5.):
data want;
set sample;
if num_zip > 99999 then do until (num_zip <= 99999);
*Yes, there are more compact ways to do this;
num_zip = int(num_zip / 10);
end;
zipcode = put(num_zip, z5.);
run;
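One of the "more compact ways" hinted at in the comment could look like the sketch below. It is not the poster's code: it converts the number to text, keeps the first five digits, and pads with zeros. It assumes num_zip is a non-missing, non-negative integer of at most 12 digits; first5 is a helper variable introduced only for this illustration.

```sas
data want;
  set sample;
  /* Write the number as text, left-align it, and keep the first five characters. */
  /* Assumption: num_zip is a non-missing, non-negative integer of at most 12 digits. */
  first5  = substr(left(put(num_zip, 12.)), 1, 5);
  /* Read the kept digits back as a number and zero-pad to five characters. */
  zipcode = put(input(first5, 5.), z5.);
  drop first5;
run;
```

Like the loop above, this truncates rather than rounds, so 7777777 becomes 77777.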
Several SAS formats will attempt to yield "something" that comes close to a representation of a value when the width doesn't allow the full value to be displayed. The presence of an E in a numeric value means scientific notation has been applied. So when you attempt to fit a value with more than 5 digits, such as 7777777, into 5 characters, you get things like 778E4, because that is about as close as you can get to the 7 digits using 5 characters. You would get something similar with small values like 0.0000000001234, only the exponent (the part after the E) would be negative.
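To see the fallback side by side with a width that fits, a small sketch like the following could be run. The variable names narrow and wide are made up for this illustration, and the three values are the ones used elsewhere in this thread.

```sas
data _null_;
  /* Loop over the sample values used in this thread. */
  do num_zip = 23, 7777777, 999999999;
    narrow = put(num_zip, z5.);   /* width 5: too narrow for the two longer values */
    wide   = put(num_zip, z9.);   /* width 9: enough room for all three values */
    put num_zip= narrow= wide=;   /* write the three variables to the log */
  end;
run;
```

The log should show zero-padded digits in the wide column for every value, while the narrow column switches to E notation once the value no longer fits in five characters.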
A question: if you only want 5-character zips, why did you bother to read in more than 5 digits? You won't have the problem with:
data sample;
input num_zip 5.;
format num_zip z5.;
datalines;
23
7777777
999999999
;
run;
Personally, I never read ZIP codes as numeric, because modern ZIP codes may be 10 characters long, ddddd-dddd, with a dash in the middle (and I have them). Those would be read incorrectly with a numeric informat.
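A minimal sketch of reading ZIP codes as character might look like this. The names sample_char, zip_raw and zip5 are hypothetical, and the data lines are invented for illustration; it assumes the raw values already carry their leading zeros.

```sas
data sample_char;
  /* Hypothetical names and made-up values, for illustration only. */
  length zip_raw $10 zip5 $5;
  input zip_raw $;                /* list input; reads up to 10 characters */
  zip5 = substr(zip_raw, 1, 5);   /* keep the first five characters */
  datalines;
00023
77777-7777
99999
;
run;
```

Because nothing is converted to a number, the dash and any leading zeros survive as typed.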
Even if they had been imported as character, there would still be a problem, since we need different methods to convert them to five digit zips based on the length of original variable.
Hi @Batman, when you use the z. format to display a number, it works like this: 1) the number is padded on the left with zeros up to the width of the format; 2) the width is the number between z and the dot, so z5. means five characters in total: a 2-digit number gets three leading zeros and a 4-digit number gets one; and 3) if you use z5. for a number with more than 5 digits, the value does not fit and cannot be displayed in full. So to solve your problem, you need a width larger than 5 in the z. format, for example z10., which leaves enough room to display your number. The code is as follows:
data sample;
input num_zip 9.; /* width 9 so that 999999999 is read in full */
format num_zip 9.;
datalines;
23
7777777
999999999
;
run;
data convert;
set sample;
zipcode=put(num_zip,z10.);
run;
proc print data=convert noobs;
var zipcode;
run;
US Zip codes? Those are either 5 digits or 5+4 digits. So you can probably just test if the value is larger than 99,999 to tell which type you have.
data sample;
input num_zip ;
if num_zip > 99999 then zip=put(int(num_zip/10000),Z5.);
else zip=put(num_zip,Z5.);
put (_all_) (=);
datalines;
23
7777777
999999999
;
Log output:
num_zip=23 zip=00023
num_zip=7777777 zip=00777
num_zip=999999999 zip=99999
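If the 5+4 form is also wanted, the same test could be extended along these lines. This is only a sketch: it assumes every value above 99,999 really is a 9-digit ZIP+4 whose leading zeros were lost, and the names zip_parts, zip5, zip9 and zip_full are invented for the example.

```sas
data zip_parts;
  set sample;
  length zip5 $5 zip_full $10;
  if num_zip > 99999 then do;
    /* Assumption: the value is a ZIP+4 stored as a number, so restore 9 digits. */
    zip9     = put(num_zip, z9.);
    zip5     = substr(zip9, 1, 5);
    zip_full = catx('-', zip5, substr(zip9, 6, 4));   /* ddddd-dddd */
  end;
  else do;
    zip5     = put(num_zip, z5.);
    zip_full = zip5;
  end;
  drop zip9;
run;
```

Taking the first five characters of the z9. string gives the same 5-digit part as dividing by 10000 in the code above.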