BookmarkSubscribeRSS Feed
SASdevAnneMarie
Barite | Level 11

Hello,

Maybe someone could explain me the proc import comportement in encoding the values:

Exemple: I have the value 6.66% in my excel file,

proc import translate this value in character value 6.66666 

When I'm forcing by adding the 0 to this values I have the correct values in table: 6.66+0 =0.066.

 

I can explain the transformation: 6.66666 (character) +0 =0.06666 (numeric)

 

Thank you

 

11 REPLIES 11
Tom
Super User Tom
Super User

Please post a small example that demonstrates the issue.  Do you have an XLSX file or the older XLS format?

Note that Excel is a spreadsheet where every cell can have a different type of value, but SAS datasets require that each variable has only one type for all observations.  Make sure that there are no character strings in other cells in the column that has the 6.6% value.  When the column has mixed data then SAS must create a character variable.  It is possible to represent numbers using character strings, but trying to represent strings as numbers is not really possible.

SASdevAnneMarie
Barite | Level 11

proc import
datafile= "xxx\Exemple.xlsx"
out=Retours
replace dbms=xlsx;
getnames=yes;

run;

data Retours1;
set Retours;
Exemple2=(Exemple+0);
Exemple3=compress(put(round(Exemple2*100,0.01),10.2)||'%');
if Exemple2^=. then Exemple=Exemple3;

run;

SASdevAnneMarie
Barite | Level 11

Thank you!

 

proc import

     datafile= "xxxx\Exemple.xlsx"

     out=Retours

     replace dbms=xlsx;

     getnames=yes;

 

run;

 

data Retours1;

     set Retours;

     Exemple2=(Exemple+0);

     Exemple3=compress(put(round(Exemple2*100,0.01),10.2)||'%');

     if Exemple2^=. then Exemple=Exemple3;

 

run;

Kurt_Bremser
Super User

I used the usual "get rid of Excel woes" method:

I saved the file from Excel to a csv and copied it unto our server, then ran this code:

data test;
infile '$HOME/sascommunity/Exemple.csv' truncover firstobs=2;
input stringval $20.;
stringval = translate(stringval,'.',',');
if stringval not in ('','NC*') then numval = input(stringval,percent20.);
format numval nlpct7.2;
run;

proc print data=test noobs;
run;

Partial result:

stringval     numval

 6.66%         6,66%
 0.67%         0,67%
 -6.00%       -6,00%
 7.2311%       7,23%
 8.80%         8,80%
               .    
 2.60%         2,60%
 1.20%         1,20%
 NC*           .    
 0.80%         0,80%
 NC*           .    
 NC*           .    
 0.34%         0,34%
 0.97%         0,97%
 0.80%         0,80%
 0.80%         0,80%
 NC*           .    
 0.65%         0,65%
 NC*           .    
 NC*           .    
 0.64%         0,64%
 NC*           .    
 NC*           .    
 NC*           .    
 NC*           .    
 1.00%         1,00%
 NC*           .    
 0.72%         0,72%
 1.00%         1,00%
 0.52%         0,52%
 0.83%         0,83%
 0.52%         0,52%
 0.98%         0,98%
 NC*           .    
 NC*           .    
 NC*           .    
 NC*           .    
 0.65%         0,65%
 0.04%         0,04%
 NC*           .    
 NC*           .    
 0.50%         0,50%
 NC*           .    
 0.56%         0,56%
 NC*           .    
 0.69%         0,69%
 0.60%         0,60%
 0.60%         0,60%
 0.60%         0,60%
 0.75%         0,75%
 0.75%         0,75%
 NC*           .    
 1.15%         1,15%
 NC*           .    
 NC*           .    

Bottom line: Excel files are useless for data transfer, textuality rules, as it allows you to easily make corrections.

ballardw
Super User

Go back to your Excel file. Examine VERY closely each value in the first 20 rows for the column that is giving you problems. By default SAS uses the values of the first 20 rows to set the properties of variables. Values like NA, NULL or anything not numeric in those rows means SAS usually assign a character property and then use the values to set how long.

 

Just because it looks numeric does not mean the cell has an actual numeric value, especially if the data was manually entered. The underlying issue is that Excel, and most spreadsheets, to not have an actual concept of datatype for a column. The same column can contain text, numeric values, dates, times. I won't even go into people that use cell colors to indicate actual values. So processes have to make some choice what to do with the column. With only 20 rows examined then the guessing algorithm may be working with a very incomplete set of data.

 

The better way to convert character to numeric is to use the Input function:  newvar = input(charvar,informat.); That will prevent getting messages in the Log like "character values have been converted to numeric at" . Which is sometimes an indication of not quite right code, especially if you did not expect a variable to be character.

SASdevAnneMarie
Barite | Level 11
Hello,
Thank you for your answer. Yes, I have a different types of values in the Excel column because different people fill the column.
I would like just to hold the character value and CORRECT value (that are numeric in Excel) that appear like 6.6666 after the proc import, therefor I did 6.666+0=0.066, after I'll convert it on 0.066 character value and replace 6.666
I can explain how sas transforme: 6.66666 (character) +0 =0.06666 (numeric)
Tom
Super User Tom
Super User
To convert a string to a number use the INPUT() function with a numeric INFORMAT.
num = input(string,32.);
To convert from 6 to 0.06 divide by 100.
Patrick
Opal | Level 21

@ballardw 

"By default SAS uses the values of the first 20 rows to set the properties of variables."

I believe with xlsX all rows get analyzed. I've just run a test using an Excel sheet where in column A the first 10000 rows had a value of 1 (numeric) and row 10001 a value of 'a'. The resulting SAS variable became character with a length of 1.

 

Patrick
Opal | Level 21

Using Proc Import with SAS9.4M5 under Rhel I can replicate what you describe. Here what I'm getting as a result (in a character column)

Capture.JPG

This looks like SAS reads at least for some of the cells the "raw" values from the sheet1.xml which doesn't appear to be the correct behaviour (even though your Excel data is especially ugly).

    <row r="2" spans="1:1" x14ac:dyDescent="0.25">
      <c r="A2" s="3">
        <v>6.6600000000000006E-2</v>
      </c>
    </row>
    <row r="3" spans="1:1" x14ac:dyDescent="0.25">
      <c r="A3" s="3">
        <v>6.6600000000000001E-3</v>
      </c>
    </row>
    <row r="4" spans="1:1" x14ac:dyDescent="0.25">
      <c r="A4" s="3">
        <v>-0.06</v>
      </c>
    </row>
    <row r="5" spans="1:1" x14ac:dyDescent="0.25">
      <c r="A5" s="4">
        <v>7.2311110999999997E-2</v>
      </c>
    </row>
    <row r="6" spans="1:1" x14ac:dyDescent="0.25">
      <c r="A6" s="3" t="s">
        <v>27</v>
      </c>
    </row>

If I use the SAS EG8.2 import wizard then the result looks better. Still strings of course but at least wysiwig and something you can post-process without having to know how Excel stores values internally.

Capture.JPG

What Proc Import does to your Excel data is may-be something you should raise with SAS Technical Support.

 

 

 

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 11 replies
  • 5925 views
  • 0 likes
  • 5 in conversation