I feel silly asking this question but for some reason, I am having issue with this particular dataset when converting character to numeric. See attached for the data I am running this on. This problem seems to only be happening with PROC IMPORT.
If you don't want to download the file, the basic problem is that when importing an XLSX for columns with numeric values but the numeric values have a space after it, I am unable to convert the values to numeric.
PROC IMPORT DATAFILE="have.xlsx" DBMS=xlsx REPLACE OUT=have;
RUN;
DATA want;
SET have;
FORMAT col1 col2;
INFORMAT col1 col2;
col1_nothing=INPUT(col1,8.);
col1_compress=INPUT(COMPRESS(col1),8.);
col1_strip=INPUT(STRIP(col1),8.);
col1_trim=INPUT(TRIM(col1),8.);
RUN;
The output is below:
All of the cells that are corresponding with a cell that is a numeric value followed by a space have an end result of numeric missing. Performing COMPRESS, TRIM, STRIP does not remove this space. This error does not occur when creating a dataset within SAS and only when importing with PROC IMPORT
DATA test;
test="0.14 ";
test2=INPUT(test,8.);
RUN;I know the solution is just editing the dataset but ideally, I'd like to fix this without manually editing data I am given for my position.
This error is occurring when run on SAS EG 8.6, SAS 9.4
Also just a side question, I left in a COL2 that when PROC IMPORT'ed, 0.07 gets converted to "7.0000000000000007E-2". I assume this is just floating point but not sure why this happens if it's being read as character.
I did not download your spreadsheet. You can use these instructions to convert the output from PROC IMPORT into data step code we can use.
Most likely, there is a non-printable (invisible) character(s) in F1 and thus the contents of F1 cannot be converted to a number. You can use the COMPRESS function to remove characters you don't want, or keep characters you do want. In this case, we want to keep digits, the dot, and possibly commas and a hyphen to indicate a negative number. So this will do the job:
data want;
set have;
f1_numeric=input(compress(f1,'.,-','dk'),best32.);
run;
I did not download your spreadsheet. You can use these instructions to convert the output from PROC IMPORT into data step code we can use.
Most likely, there is a non-printable (invisible) character(s) in F1 and thus the contents of F1 cannot be converted to a number. You can use the COMPRESS function to remove characters you don't want, or keep characters you do want. In this case, we want to keep digits, the dot, and possibly commas and a hyphen to indicate a negative number. So this will do the job:
data want;
set have;
f1_numeric=input(compress(f1,'.,-','dk'),best32.);
run;
Bah, running that code on my work PC gives me an error for insufficient authorization. I'll figure it out more next time I need help, this is helpful. I was surprised that this isn't built in on the SAS forums and did a print screen of the output (and subsequently input).
Just ran that snippet you gave me on the compress and your hunch was correct. Thank you! I should've been ticked off on that possibility when I went to created a test dataset with "0.14 " but excel automatically formatted it as numeric without the leading blank so there was definitely something other than just a space in the original dataset.
@hubernomiddle wrote:
Also just a side question, I left in a COL2 that when PROC IMPORT'ed, 0.07 gets converted to "7.0000000000000007E-2". I assume this is just floating point but not sure why this happens if it's being read as character.
The reason is that the numeric value is given as
<v>7.0000000000000007E-2</v>
in the XML code which (in zipped form) is contained in your .xlsx file. I see also
<v>2.0099999999999998</v>
and
<v>1.1000000000000001</v>
in there.
In all three cases the small deviations from 0.07, 2.01 and 1.1, respectively, just reflect the unavoidable rounding errors of the binary floating-point representations of 0.07, 2.01 and 1.1 with 52 mantissa bits.
If those values are read as numeric values (using the INPUT statement, the INPUT function or assignment statements with those literals), SAS will regard them exactly equal to 0.07, 2.01 and 1.1, respectively, without further rounding -- which you could apply otherwise:
data test;
input x;
put x hex16.;
cards;
7.0000000000000007E-2
0.07
2.0099999999999998
2.01
1.1000000000000001
1.1
;
Log:
3FB1EB851EB851EC 3FB1EB851EB851EC 4000147AE147AE14 4000147AE147AE14 3FF199999999999A 3FF199999999999A
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.