BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Emma_at_SAS
Lapis Lazuli | Level 10
I read a .xlsx data in SAS using proc import:
 
options validvarname=v7; 
proc import datafile="&proj_loc\my_data.xlsx"
dbms=xlsx
replace
out=SAS_data;
run;
 
One of my numeric columns with 2 decimal places in the Excel file is read as a character variable in SAS.
I switched that to numeric using varnum=input(var​char, best14.);
 
The problem is some values are read with exponential format during the proc import and the best14. does not fix them.
For example, a 0.07 is imported as 7.0000000000000007E-2 and in the var_num it is read as only 7.
 
I appreciate it if you have any suggestions to fix this.
 
Thanks
1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

If the column in the Excel file has mixed character and numeric cells then the variable will be character.

If you want to convert a string to a number just use the normal informat.  The maximum width that the numeric informat can handle is 32 bytes.  The INPUT() function does not care if the width used in the informat is wider than the length of the string being read.  But leading spaces might have an inpact.  The width used for the informat in your code (14) is shorter than the length of the example text (21) in your question so you did not read the E-02 part.

 

varnum=input(left(var​char), 32.);

 

Note that BEST as an INformat is just treated as an alias for the normal numeric informat.

View solution in original post

13 REPLIES 13
Reeza
Super User
If a variable is in exponential format, check the variable type/format. If it's a numeric type you just need to change the format. If it's a character variable, you need to provide the correct type to the INPUT function.

format variableName 12.2;

That should show the value with 2 decimal places, if possible.
ballardw
Super User

How do you know the starting value is 0.07 and not 0.070000000000002 or similar in Excel? Most default numeric cells in Excel will use 2 decimals for display unless you force otherwise.

If the value is "read as character" then you would not see any exponential informat or format or exponential appearing value. You would see something like $14. as the informat/format in the variable characteristics.

 

Having a hard time seeing how "I switched that to numeric using varnum=input(var​char, best14.);" has anything to do with Proc import and an exponential value. If the value was "read with an exponential format (sic)" in the first place the variable is numeric and using Input(<anything here referencing that imported value>,best14) would have an IMPLICIT conversion from the numeric value to character to use the Input function. And that implicit conversion is likely the wrong format. Did your code show any messages like:

NOTE: Numeric values have been converted to character values at the places given by:
      (Line):(Column).

then the existing format would have been used to convert the numeric value to character prior to use of input. Which is very likely to have an incorrect result since most import steps will end up with a BEST12 format. So the result would be truncated to 12 characters before the input.

 

 

Emma_at_SAS
Lapis Lazuli | Level 10
Thank you! I checked and my number is really 0.07
Tom
Super User Tom
Super User

If the column in the Excel file has mixed character and numeric cells then the variable will be character.

If you want to convert a string to a number just use the normal informat.  The maximum width that the numeric informat can handle is 32 bytes.  The INPUT() function does not care if the width used in the informat is wider than the length of the string being read.  But leading spaces might have an inpact.  The width used for the informat in your code (14) is shorter than the length of the example text (21) in your question so you did not read the E-02 part.

 

varnum=input(left(var​char), 32.);

 

Note that BEST as an INformat is just treated as an alias for the normal numeric informat.

Emma_at_SAS
Lapis Lazuli | Level 10
Thank you--Your suggested code now reads my data as 0.07. I checked my excel file and the 20 decimals were all 0 after 7.
I was receiving a note about "invalid arguments" and I added "??" before 32. and it solved the issue.
varnum=input(left(var​char), ?? 32.);

Thanks
Reeza
Super User

@Emma_at_SAS wrote:
Thank you--Your suggested code now reads my data as 0.07. I checked my excel file and the 20 decimals were all 0 after 7.
I was receiving a note about "invalid arguments" and I added "??" before 32. and it solved the issue.
varnum=input(left(var​char), ?? 32.);

Thanks

This will suppress any warnings. So if you have values that are wrong for any reason, for example someone typed in "eight" instead of 8, you won't notice that they're missing. Is that what you want to happen?

Emma_at_SAS
Lapis Lazuli | Level 10

Thank you, Reeza, for mentioning that. No, that was not what I wanted to do! I will go back and check the warnings!

Emma_at_SAS
Lapis Lazuli | Level 10

My SAS code is

data want;
set have;
var1_num=input(left(var1),32.);
var2_num=input(left(var2), 32.);
run;

 

The NOTE in the SAS log says: 

NOTE: Invalid argument to function INPUT at line 1007 column 9.
NOTE: Invalid argument to function INPUT at line 1008 column 10

 

When I run the input lines separately the line and column in the NOTE changes. I cannot match these lines and columns to my data. Is this something I have to check? Other thoughts?

Tom
Super User Tom
Super User

@Emma_at_SAS wrote:

My SAS code is

data want;
set have;
var1_num=input(left(var1),32.);
var2_num=input(left(var2), 32.);
run;

 

The NOTE in the SAS log says: 

NOTE: Invalid argument to function INPUT at line 1007 column 9.
NOTE: Invalid argument to function INPUT at line 1008 column 10

 

When I run the input lines separately the line and column in the NOTE changes. I cannot match these lines and columns to my data. Is this something I have to check? Other thoughts?


Please look at your SAS log more carefully.  SAS will show you in detail what the offending data.  For example try this program.

data want;
  input var1 $ var2 $ ;
  var1_num=input(left(var1),32.);
  var2_num=input(left(var2), 32.);
cards;
12.3 4.5 
. .
abc def 
;

Resulting LOG

1735  data want;
1736    input var1 $ var2 $ ;
1737    var1_num=input(left(var1),32.);
1738    var2_num=input(left(var2), 32.);
1739  cards;

NOTE: Invalid argument to function INPUT at line 1737 column 12.
NOTE: Invalid argument to function INPUT at line 1738 column 12.
RULE:       ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+
1742        abc def
var1=abc var2=def var1_num=. var2_num=. _ERROR_=1 _N_=3
NOTE: Mathematical operations could not be performed at the following places. The results of the
      operations have been set to missing values.
      Each place is given by: (Number of times) at (Line):(Column).
      1 at 1737:12   1 at 1738:12
NOTE: The data set WORK.WANT has 3 observations and 4 variables.
NOTE: DATA statement used (Total process time):
      real time           0.06 seconds
      cpu time            0.01 seconds

Notice how it actually lists the value of VAR1 and VAR2 that caused the error.  And since my example is reading the data from text instead of using a SET statement it also lists the actual line of text read for that observation that generated the errors.

Emma_at_SAS
Lapis Lazuli | Level 10
Thank you very much, Tom. Your post was very helpful!
In my Excel data, I have NA for the missing observations. Because of these character observations, SAS reads my column as a character variable. Then, when I convert the variable from character to numeric, SAS creates a NOTE for invalid arguments, which are the NA's. Because that is exactly what I need (to convert NA's to missing) I do not need to do anything about it.

Thanks

Reeza
Super User
What you should do is conditionally handle those so you don't get a message.

if variableValue = "NA" then newValue=.;
else newValue = input(variableValue, best.);

Or use the IFN/IFC() function which is slightly more compact. But it's much better way to handle this issue.
Emma_at_SAS
Lapis Lazuli | Level 10
May I ask that what is the role of LEFT in var1_num=input(left(var1),32.)
and how this code is different from var1_num=input(var1,best32.)?

Thanks
Reeza
Super User
Left() aligns a character value to the left, if it has extra spaces this can be necessary or the format may truncate the data if it's got 32 leading spaces.

Documentation link for LEFT() function:
https://documentation.sas.com/?cdcId=pgmsascdc&cdcVersion=9.4_3.5&docsetId=lefunctionsref&docsetTarg...



FYI - I recommend bookmarking the Functions by Category and formats by category pages in SAS documentation. They get used quite often and are a great resource.

https://documentation.sas.com/?cdcId=pgmsascdc&cdcVersion=9.4_3.5&docsetId=lefunctionsref&docsetTarg...

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 13 replies
  • 3973 views
  • 5 likes
  • 4 in conversation