10-10-2017 03:17 PM
I imported parcel ids for different counties from xlsx files to SAS files. One county shows the parcel ids as numeric data and uses normal notation. One county shows the parcel ids as character data and uses standard notation, and the last county shows the parcel ids as numeric data and has them in scientific notation.
Question 1: Is there a quick way I can change the scientific notation to standard notation without having to reimport the files?
Question 2: Is there a quick way I can change the parcel ids from numeric to character data without losing digits or having it reformat?
10-10-2017 03:49 PM
The parcel code is 384344160600502-- but the last 2 digits are excluded.
My assumption is that the lengths differ because of zeros and whether or not the notation considers digits to be significant.
10-10-2017 05:01 PM
Looks like your IDs are up to 16 digits long. In that case you MUST import them as character data as SAS numeric variables can only only hold 15 digits accurately.
10-10-2017 03:31 PM
Look at the variable TYPE and FORMAT.
If it's just the format you can change that easily, use a FORMAT statement.
If its the type you need to convert it, using either PUT or INPUT.
10-10-2017 03:39 PM - edited 10-10-2017 03:47 PM
If the number of spreadsheets is small I would convert all of the parcel ID columns to text in Excel then re-import them into SAS.
A more reliable way would be to save the Excel sheets as CSVs, then import those into SAS. Here you can use a DATA step to enforce the IDs to be character.
I idea is to get the parcel ID columns all importing as character to begin with so you don't need to manipulate them afterwards. To me this is far preferable to dealing with unpredictable data types in SAS.
10-10-2017 03:42 PM
I initially imported them as CSVs but had a problem with a different code that was being imported as numeric in some and character in others. I converted them all to xlsx and reimported them. The problem, I think, is I am downloading data from different counties and they have variances in how they store the data. I may have to go back to the beginning and do it all over again though.
10-10-2017 08:57 PM
Just did a test with Excel 2013 and I haven't been able to impute a number with 22 digits with full precision into an Excel cell.
I have been able to impute all the digits when defining the cell as TEXT.
When using the TEXT format on the Excel cell then the following worked like a charm:
libname pid xlsx 'c:\temp\parcel_id.xlsx';
10-10-2017 03:45 PM
If you are using Proc Import to read Excel files you basically don't have much in the way of control over the guessing process Proc Import uses.
For consistent data you will likely need to use a process with more control. One way is to save the Excel data as CSV files and use a common data step program to read the data changing the source file and result data set names. That way you can control the types of variables and ensure that the length of character variables stay the same.
Changing numeric to character would involve something like:
set old (rename=(parcel_id = old_id));
parceled = put(old_id, best10. -L);
assuming 10 characters is enough to hold the character version, if not increase the 10 to the desired length. ONLY use this if parcel_id, or whatever your actual variable name may be is numeric without decimals.
10-11-2017 02:08 PM
So I went ahead and reimported them all as CSVs and that solved the problem with the parcel_id. However, I remember now why I switched to XLSX in the first place. When importing as CSVs other data lost digits in other places.
Here is a screen shot of my SAS file from the CSV import:
Here is a screen shot from the SAS file of my XLSX import:
The CSV import has them all as Character and only keeps the first digit. The XLSX file has them as numeric and has the full value. All of them should be character except for sale_prc1. I am thinking all I need to do is some sort of format change, but am not quite sure how. I want to change the sale_prc1 to a numeric variable and accommodate for up to 10 digits. I want to keep those other variables at character but get all the digits to show.