DATA Step, Macro, Functions and more

Convert large number character variables to numeric

Reply
Regular Contributor
Posts: 150

Convert large number character variables to numeric

I received a dataset that has 1000 variables and all of them are character. Most of the variables should be numeric but the program that creates the dataset can't change the type. Is there a way to have SAS automatically convert the variables that only contains integers to numeric and flag everything else to manually go through to make a decision on conversion? Some variables that should only be numeric are open text fields and often people put in character values when they should not so those would have to be flagged so remove the character values and move forward. There are also several varaiables that should be kept as character because they are open text. Any help would be appreciated. 

Super User
Posts: 5,516

Re: Convert large number character variables to numeric

Posted in reply to rfarmenta

This is actually a few questions rolled up into one.  I will try to answer one of them, to give you a place to start.

 

data investigate;

set have;

array chars {*} _character_;

do _n_=1 to dim(chars);

   if chars{_n_} > ' ' and input(chars{_n_}, ?? 16.) = . then do;

      varname = vname(chars{_n_});

      varvalue = chars{_n_};

      output;

   end;

end;

keep varname varvalue;

run;

 

This gives you a list of all values that need to be investigated.  The variable contains text, but the text cannot be converted to a number. 

 

You may want to add to the KEEP statement to output a few more variables.

 

Good luck.

Regular Contributor
Posts: 150

Re: Convert large number character variables to numeric

Posted in reply to Astounding

Thank you. I ran that code and it gives me 435,300 values that need to be investigated. We have over 91,000 observations in the dataset. I am going to play around with this a little more to see what I can come up with. 

Super User
Posts: 11,343

Re: Convert large number character variables to numeric

Posted in reply to rfarmenta

What file type was the original data in and how was it brough into SAS?

 

When you expect that many variables to be numeric but were brought in as character one suspects use of Proc Import and that there may have been something odd, like blank rows or other characters at the top of the file.

 

If the source file was any type of text file then writing an accurate data step to read the file may be a better way to go if this project will have to read multiple files.

Regular Contributor
Posts: 150

Re: Convert large number character variables to numeric

The data is being read in from an SQL server using a odbc link to commect to the SQL server. Unfortunately the data is actually purposely all stored as character so they are actually being read in correctly. We talked with the programmer to get this fixed but he said it was a long project and it would be ideal if we could convert the variables in SAS. The data comes from a few sources and it is easier to store them as character initially (according to the programmer) so data is not lost. 

 

 

 

 

Super User
Super User
Posts: 7,989

Re: Convert large number character variables to numeric

Posted in reply to rfarmenta

Well, to be honest, if it was I would be pushing back for them to fix it their end.  At the end of the day whatever you do will be guessing.  As for your data, can they not export the data as CSV.  You could then write a datastep import program and setup the formats on the import data -> this requires specifications.  I presume the data vendor has these, database specifications at least, which detila what data can be entered.  Once you have them then you can program your import accordingly.

Super User
Super User
Posts: 7,989

Re: Convert large number character variables to numeric

Posted in reply to rfarmenta

Well, I think EG or one of the other SAS products has a summary which gives a count for each WARNING, ERROR and NOTE.  I have developed a couple of these in both SAS and C#, it is a simple load text file and scan for elements you want to scan.  I will not provide you code as that is proprietary, but a couple of pointers.

data ...; 

  infile "<your_log>"...;

  input buffer $;

run;

 

data ...;

  set <dataset above>;

  if index(buffer,"ERROR") > 0 then...

run;

Ask a Question
Discussion stats
  • 6 replies
  • 234 views
  • 0 likes
  • 4 in conversation