Statistical programming, matrix languages, and more

Syntax for converting character data to numbers while keeping leading zeros

Reply
Contributor
Posts: 24

Syntax for converting character data to numbers while keeping leading zeros

I am importing some data from an Excel spreadsheet to a SAS dataset, whence it will be read to a matrix in IML. One of the fields is an ID number or code similar to a ZIP code; it will not be used in any calculations in IML but will be needed later on as a key field for joining the results of the calculation to a table in other software. Therefore, I need it to remain unmodified.

 

When data from Excel are imported to the SAS dataset, the field still appears. However, SAS apparently treats the "zip code" field as text (because of the leading zeros?), and all other fields in the data are numeric. (The "zip code" is left-aligned and all other fields are right-aligned when I view the dataset). Therefore, when the data are read to the IML matrix, the "zip code" is left out.

 

Am I right that the text nature of the "zip code" field is the reason why it fails to show up in IML? If so, what is the easiest way to convert it to a number without losing its leading zeros? I know that this has already been discussed here, but what does that code do? Am I right that it writes the contents of one SAS dataset to another in which the field in question is numeric?

Esteemed Advisor
Posts: 6,646

Re: Syntax for converting character data to numbers while keeping leading zeros


dvtarasov wrote: I know that this has already been discussed here, but what does that code do? Am I right that it writes the contents of one SAS dataset to another in which the field in question is numeric?

You are completely right. But since this is most basic SAS code, I sincerely suggest you acquaint yourself with the most basic SAS techniques before delving head-on into something as advanced as IML.

 

What transformation will you be doing in IML that requires the presence of what is basically arbitrary text?

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Respected Advisor
Posts: 4,973

Re: Syntax for converting character data to numbers while keeping leading zeros

It is possible that some zip code values are not convertible, and that's why importing the data treats them as character.  For example, you may have some nine-digit zip codes with a dash after the first five digits.

 

At any rate, you can convert to numeric (as a separate variable):

 

newvar = input(zipcode, 5.);

 

You will lose any information after the 5th character by doing that (hence any 9-digit zip codes will become 5-digit zip codes).

 

There is no way to store a leading zero in a numeric, in SAS.  SAS is not storing a set of digits, but is storing the value.  You can print a numeric with leading zeros:

 

format newvar z5.;

 

You can also convert back to character at a later stage:

 

zipcode = put(newvar, z5.);

 

 

Contributor
Posts: 24

Re: Syntax for converting character data to numbers while keeping leading zeros

Are the newvar = input(...) and the zipcode= put (newvar, z5.); statements part of the PROC DATA step?

Also, will the codefield = put (newvar, z12.); statement add a leading zero to an 11-digit number?
Respected Advisor
Posts: 4,973

Re: Syntax for converting character data to numbers while keeping leading zeros

All the statements that I suggested would be part of a DATA step.

 

And yes, you can create a 12-digit character field (complete with leading zeros as needed) by applying:

 

codefield = put(newvar, z12.);

 

 

Grand Advisor
Posts: 9,576

Re: Syntax for converting character data to numbers while keeping leading zeros

In IML, if you want convert it to a number without losing its leading zeros, 

You can use NUM() to convert it to a numeric value and assign a format Z8. to it to keep leading zeros.

 

data have;
input zip_code $ a b;
cards;
000123 3 4
001234 8 9
;
run;
proc iml;
use have;
read all var _num_ into x;
close;
use have;
read all var{zip_code};
close;
new_zip_code=num(zip_code);
mattrib new_zip_code format=z8.;
print new_zip_code x;
quit;
Ask a Question
Discussion stats
  • 5 replies
  • 528 views
  • 0 likes
  • 4 in conversation