BookmarkSubscribeRSS Feed
rfarmenta
Obsidian | Level 7

I received a dataset that has 1000 variables and all of them are character. Most of the variables should be numeric but the program that creates the dataset can't change the type. Is there a way to have SAS automatically convert the variables that only contains integers to numeric and flag everything else to manually go through to make a decision on conversion? Some variables that should only be numeric are open text fields and often people put in character values when they should not so those would have to be flagged so remove the character values and move forward. There are also several varaiables that should be kept as character because they are open text. Any help would be appreciated. 

6 REPLIES 6
Astounding
PROC Star

This is actually a few questions rolled up into one.  I will try to answer one of them, to give you a place to start.

 

data investigate;

set have;

array chars {*} _character_;

do _n_=1 to dim(chars);

   if chars{_n_} > ' ' and input(chars{_n_}, ?? 16.) = . then do;

      varname = vname(chars{_n_});

      varvalue = chars{_n_};

      output;

   end;

end;

keep varname varvalue;

run;

 

This gives you a list of all values that need to be investigated.  The variable contains text, but the text cannot be converted to a number. 

 

You may want to add to the KEEP statement to output a few more variables.

 

Good luck.

rfarmenta
Obsidian | Level 7

Thank you. I ran that code and it gives me 435,300 values that need to be investigated. We have over 91,000 observations in the dataset. I am going to play around with this a little more to see what I can come up with. 

ballardw
Super User

What file type was the original data in and how was it brough into SAS?

 

When you expect that many variables to be numeric but were brought in as character one suspects use of Proc Import and that there may have been something odd, like blank rows or other characters at the top of the file.

 

If the source file was any type of text file then writing an accurate data step to read the file may be a better way to go if this project will have to read multiple files.

rfarmenta
Obsidian | Level 7

The data is being read in from an SQL server using a odbc link to commect to the SQL server. Unfortunately the data is actually purposely all stored as character so they are actually being read in correctly. We talked with the programmer to get this fixed but he said it was a long project and it would be ideal if we could convert the variables in SAS. The data comes from a few sources and it is easier to store them as character initially (according to the programmer) so data is not lost. 

 

 

 

 

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Well, to be honest, if it was I would be pushing back for them to fix it their end.  At the end of the day whatever you do will be guessing.  As for your data, can they not export the data as CSV.  You could then write a datastep import program and setup the formats on the import data -> this requires specifications.  I presume the data vendor has these, database specifications at least, which detila what data can be entered.  Once you have them then you can program your import accordingly.

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Well, I think EG or one of the other SAS products has a summary which gives a count for each WARNING, ERROR and NOTE.  I have developed a couple of these in both SAS and C#, it is a simple load text file and scan for elements you want to scan.  I will not provide you code as that is proprietary, but a couple of pointers.

data ...; 

  infile "<your_log>"...;

  input buffer $;

run;

 

data ...;

  set <dataset above>;

  if index(buffer,"ERROR") > 0 then...

run;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 878 views
  • 0 likes
  • 4 in conversation