Solved: reading data from Excel (xlsx) to SAS-character/numeric exchange with ...

Emma_at_SAS · Posted 03-01-2021 03:55 PM

I read a .xlsx data in SAS using proc import:

options validvarname=v7;

proc import datafile="&proj_loc\my_data.xlsx"

dbms=xlsx

replace

out=SAS_data;

run;

One of my numeric columns with 2 decimal places in the Excel file is read as a character variable in SAS.

I switched that to numeric using varnum=input(varchar, best14.);

The problem is some values are read with exponential format during the proc import and the best14. does not fix them.

For example, a 0.07 is imported as 7.0000000000000007E-2 and in the var_num it is read as only 7.

I appreciate it if you have any suggestions to fix this.

Thanks

Tom · Posted 03-01-2021 08:05 PM

If the column in the Excel file has mixed character and numeric cells then the variable will be character.

If you want to convert a string to a number just use the normal informat. The maximum width that the numeric informat can handle is 32 bytes. The INPUT() function does not care if the width used in the informat is wider than the length of the string being read. But leading spaces might have an inpact. The width used for the informat in your code (14) is shorter than the length of the example text (21) in your question so you did not read the E-02 part.

varnum=input(left(varchar), 32.);

Note that BEST as an INformat is just treated as an alias for the normal numeric informat.

View solution in original post

Reeza · Posted 03-01-2021 04:08 PM

If a variable is in exponential format, check the variable type/format. If it's a numeric type you just need to change the format. If it's a character variable, you need to provide the correct type to the INPUT function.

format variableName 12.2;

That should show the value with 2 decimal places, if possible.

ballardw · Posted 03-01-2021 05:41 PM

How do you know the starting value is 0.07 and not 0.070000000000002 or similar in Excel? Most default numeric cells in Excel will use 2 decimals for display unless you force otherwise.

If the value is "read as character" then you would not see any exponential informat or format or exponential appearing value. You would see something like $14. as the informat/format in the variable characteristics.

Having a hard time seeing how "I switched that to numeric using varnum=input(varchar, best14.);" has anything to do with Proc import and an exponential value. If the value was "read with an exponential format (sic)" in the first place the variable is numeric and using Input(<anything here referencing that imported value>,best14) would have an IMPLICIT conversion from the numeric value to character to use the Input function. And that implicit conversion is likely the wrong format. Did your code show any messages like:

NOTE: Numeric values have been converted to character values at the places given by:
      (Line):(Column).

then the existing format would have been used to convert the numeric value to character prior to use of input. Which is very likely to have an incorrect result since most import steps will end up with a BEST12 format. So the result would be truncated to 12 characters before the input.

Emma_at_SAS · Posted 03-02-2021 10:36 AM

Thank you! I checked and my number is really 0.07

Tom · Posted 03-01-2021 08:05 PM

If the column in the Excel file has mixed character and numeric cells then the variable will be character.

If you want to convert a string to a number just use the normal informat. The maximum width that the numeric informat can handle is 32 bytes. The INPUT() function does not care if the width used in the informat is wider than the length of the string being read. But leading spaces might have an inpact. The width used for the informat in your code (14) is shorter than the length of the example text (21) in your question so you did not read the E-02 part.

varnum=input(left(varchar), 32.);

Note that BEST as an INformat is just treated as an alias for the normal numeric informat.

Emma_at_SAS · Posted 03-02-2021 10:35 AM

Thank you--Your suggested code now reads my data as 0.07. I checked my excel file and the 20 decimals were all 0 after 7.
I was receiving a note about "invalid arguments" and I added "??" before 32. and it solved the issue.
varnum=input(left(varchar), ?? 32.);

Thanks

Reeza · Posted 03-02-2021 12:00 PM

@Emma_at_SAS wrote:
Thank you--Your suggested code now reads my data as 0.07. I checked my excel file and the 20 decimals were all 0 after 7.
I was receiving a note about "invalid arguments" and I added "??" before 32. and it solved the issue.
varnum=input(left(varchar), ?? 32.);

Thanks

This will suppress any warnings. So if you have values that are wrong for any reason, for example someone typed in "eight" instead of 8, you won't notice that they're missing. Is that what you want to happen?

Emma_at_SAS · Posted 03-02-2021 12:19 PM

Thank you, Reeza, for mentioning that. No, that was not what I wanted to do! I will go back and check the warnings!

Emma_at_SAS · Posted 03-02-2021 01:38 PM

My SAS code is

data want;
set have;
var1_num=input(left(var1),32.);
var2_num=input(left(var2), 32.);
run;

The NOTE in the SAS log says:

NOTE: Invalid argument to function INPUT at line 1007 column 9.
NOTE: Invalid argument to function INPUT at line 1008 column 10

When I run the input lines separately the line and column in the NOTE changes. I cannot match these lines and columns to my data. Is this something I have to check? Other thoughts?

Tom · Posted 03-02-2021 02:40 PM

@Emma_at_SAS wrote:

My SAS code is

data want;
set have;
var1_num=input(left(var1),32.);
var2_num=input(left(var2), 32.);
run;

The NOTE in the SAS log says:

NOTE: Invalid argument to function INPUT at line 1007 column 9.
NOTE: Invalid argument to function INPUT at line 1008 column 10

When I run the input lines separately the line and column in the NOTE changes. I cannot match these lines and columns to my data. Is this something I have to check? Other thoughts?

Please look at your SAS log more carefully. SAS will show you in detail what the offending data. For example try this program.

data want;
  input var1 $ var2 $ ;
  var1_num=input(left(var1),32.);
  var2_num=input(left(var2), 32.);
cards;
12.3 4.5 
. .
abc def 
;

Resulting LOG

1735  data want;
1736    input var1 $ var2 $ ;
1737    var1_num=input(left(var1),32.);
1738    var2_num=input(left(var2), 32.);
1739  cards;

NOTE: Invalid argument to function INPUT at line 1737 column 12.
NOTE: Invalid argument to function INPUT at line 1738 column 12.
RULE:       ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+
1742        abc def
var1=abc var2=def var1_num=. var2_num=. _ERROR_=1 _N_=3
NOTE: Mathematical operations could not be performed at the following places. The results of the
      operations have been set to missing values.
      Each place is given by: (Number of times) at (Line):(Column).
      1 at 1737:12   1 at 1738:12
NOTE: The data set WORK.WANT has 3 observations and 4 variables.
NOTE: DATA statement used (Total process time):
      real time           0.06 seconds
      cpu time            0.01 seconds

Notice how it actually lists the value of VAR1 and VAR2 that caused the error. And since my example is reading the data from text instead of using a SET statement it also lists the actual line of text read for that observation that generated the errors.

Emma_at_SAS · Posted 03-02-2021 04:48 PM

Thank you very much, Tom. Your post was very helpful!
In my Excel data, I have NA for the missing observations. Because of these character observations, SAS reads my column as a character variable. Then, when I convert the variable from character to numeric, SAS creates a NOTE for invalid arguments, which are the NA's. Because that is exactly what I need (to convert NA's to missing) I do not need to do anything about it.

Thanks

Reeza · Posted 03-02-2021 05:11 PM

What you should do is conditionally handle those so you don't get a message.

if variableValue = "NA" then newValue=.;
else newValue = input(variableValue, best.);

Or use the IFN/IFC() function which is slightly more compact. But it's much better way to handle this issue.

Emma_at_SAS · Posted 03-02-2021 05:00 PM

May I ask that what is the role of LEFT in var1_num=input(left(var1),32.)
and how this code is different from var1_num=input(var1,best32.)?

Thanks

Reeza · Posted 03-02-2021 05:06 PM

Left() aligns a character value to the left, if it has extra spaces this can be necessary or the format may truncate the data if it's got 32 leading spaces.

Documentation link for LEFT() function:
https://documentation.sas.com/?cdcId=pgmsascdc&cdcVersion=9.4_3.5&docsetId=lefunctionsref&docsetTarg...

FYI - I recommend bookmarking the Functions by Category and formats by category pages in SAS documentation. They get used quite often and are a great resource.

https://documentation.sas.com/?cdcId=pgmsascdc&cdcVersion=9.4_3.5&docsetId=lefunctionsref&docsetTarg...

reading data from Excel (xlsx) to SAS-character/numeric exchange with correct decimals

Re: reading data from Excel (xlsx) to SAS-character/numeric exchange with correct decimals

Re: reading data from Excel (xlsx) to SAS-character/numeric exchange with correct decimals

Re: reading data from Excel (xlsx) to SAS-character/numeric exchange with correct decimals

Re: reading data from Excel (xlsx) to SAS-character/numeric exchange with correct decimals

Re: reading data from Excel (xlsx) to SAS-character/numeric exchange with correct decimals

Re: reading data from Excel (xlsx) to SAS-character/numeric exchange with correct decimals

Re: reading data from Excel (xlsx) to SAS-character/numeric exchange with correct decimals

Re: reading data from Excel (xlsx) to SAS-character/numeric exchange with correct decimals

Re: reading data from Excel (xlsx) to SAS-character/numeric exchange with correct decimals

Re: reading data from Excel (xlsx) to SAS-character/numeric exchange with correct decimals

Re: reading data from Excel (xlsx) to SAS-character/numeric exchange with correct decimals

Re: reading data from Excel (xlsx) to SAS-character/numeric exchange with correct decimals

Re: reading data from Excel (xlsx) to SAS-character/numeric exchange with correct decimals

Re: reading data from Excel (xlsx) to SAS-character/numeric exchange with correct decimals

Ready to join fellow brilliant minds for the SAS Hackathon?

Classroom Training Available!