DATA Step, Macro, Functions and more

converting all character vars (which should be numeric) to numeric

Reply
Super Contributor
Posts: 441

converting all character vars (which should be numeric) to numeric

Hi,

 

I have an Excel file with many variables and when I import it to SAS they are all of a character format. But among these variables there are many variables whose observartions are all numbers, but still their format is character.

 

SAS gives an example on how to convert all character variables to numeric

http://support.sas.com/kb/40/700.html

proc contents data=test out=vars(keep=name type) noprint; 
 run;

data vars;                                                
set vars;                                                 
if type=2 ;                               
newname=trim(left(name))||"_n";  
run;

proc sql ;                                         
select trim(left(name)), trim(left(newname)),             
       trim(left(newname))||'='||trim(left(name))         
into :c_list separated by ' ', :n_list separated by ' ',  
     :renam_list separated by ' '                         
from vars;                                                
quit; 

 data test2;                                               
set test;                                                
array ch(*) $ &c_list;                                    
array nu(*) &n_list;                                      
do i = 1 to dim(ch);                                      
  nu(i)=input(ch(i),8.);                                  
end;                                                      
drop i &c_list;                                           
rename &renam_list;                                                                                      
run;   

But the above code converts to numeric even variables that should be character.

 

So is it possible for SAS to know automatically which characters should be converted to numeric and which should not?

 

Thank you!

Super User
Posts: 3,250

Re: converting all character vars (which should be numeric) to numeric

[ Edited ]

The simple answer is no as SAS does not know the significance or purpose of the variable. For example an all numeric variable might contain a credit card number. These are typically 16 digits and if you convert that into a SAS numeric variable the last couple of digit cannot be stored accurately so your credit card numbers get changed. This is quite apart from there being no need to do calculations on such a variable - the main reason you would want to convert.

 

Only you know the purpose of a variable and so can tell SAS which columns should be converted or not.

Super User
Posts: 19,770

Re: converting all character vars (which should be numeric) to numeric

No. SAS can make good guesses. This is what PROC import does, but as you've seen it's not always accurate. 

 

 

 

Super User
Posts: 7,758

Re: converting all character vars (which should be numeric) to numeric

First of all:

DO NOT IMPORT DATA FROM EXCEL FILES!

Save to a reasonable format (.csv) and read that with a data step. That way you have full control over what is read and how.

With Excel files, you force SAS to make guesses; and since those guesses are dynamic, an unexpected change in the data can cause havoc to your analysis, instead of ending up with a failed data step that immediately points you to the problem.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Super User
Super User
Posts: 7,942

Re: converting all character vars (which should be numeric) to numeric

So you ran a guessing procedure (proc import) on an unstructured/uncontrolled file format, and now you have garbage.   This should come as no surprise.  The term is GIGO (Garbage in Garbage Out).  Fix your import source, starting with the Data Transfer agreement that you will have built in conjunction with the data vendor, most vendors are able to supply data in a usable format - if not stop paying them and they will soon disappear.  You import agreement will define data structure, list of values, frequencies and such like which will make the programming of the import a simple copy paste and add some syntax.

Super User
Posts: 11,343

Re: converting all character vars (which should be numeric) to numeric

1) Save the Excel to CSV

2) Run proc import on that with a large value for guessingrows (the data import wizard will let you set that if you don't want to write code).

3) Take a look at the results. If you don't like it you can recall the data step code generated for proc import and modify INFORMAT statements  to force reading variables as numbers or dates and assign suitable formats.

You could also clean up variable names as columns in Excel with headers like: The value is supposed to be used for abc will have long and difficult to use varaible names on import. Assign meaningful labels for those variables.

With a data step you can also do some initial processing such as unexpected value messages, age calculations from dates, decomposition of multivalued variables, creation of new varaibles.

Super User
Posts: 19,770

Re: converting all character vars (which should be numeric) to numeric

This macro is helpful when saving to CSV, especially if your fields have embedded comma's.

https://support.microsoft.com/en-us/kb/291296

Super Contributor
Posts: 441

Re: converting all character vars (which should be numeric) to numeric

Given that after all SAS can't make the distinction, is it possibe to have a quasi solution by verifying the first say 100 observations of each character variable and if no characters are found then transform this variable into numerical?

 

 

Super User
Posts: 19,770

Re: converting all character vars (which should be numeric) to numeric

That's what GUESSINGROWS does. 

Ask a Question
Discussion stats
  • 8 replies
  • 362 views
  • 9 likes
  • 6 in conversation