I know how to use the code below to convert one variable at a time to numeric from character but can someone please tell me how to convert multiple variables at once from character to numeric or vice versa. Thank you in advance!
data BIN;
set Cleaned_Data (rename=(ID=old));
ID = input (old, 6.);
drop old;
run;
@Yughaber wrote:
so my problem currently is that I have a number of excel files with dates and times. When it comes time to match and merge these separate files by ID numbers, I get an error that in some files these dates are defined as character and other files as numeric. This is what I was trying to change because going back to the excel file the cell format is all the same for everything as a date and time.
So your actual underlying issue is Excel file format. Color me not surprised, that is probably the single most frequent problem asked about on this forum.
If you rely on Proc Import or any wizard to bring the data into SAS then every file is processed differently and you have little control.
If the files are supposed to have the same content you can make things more consistent by 1) saving the files to CSV from Excel and write a data step to read the CSV files consistently, just changing the input file name and output SAS data set name.
Proc Import of a CSV can help you there but it is a good idea to use a set many records and the option GUESSINGROWS=MAX if your write code or specify a large value in any wizard for rows to examine.
Proc import will create data step code used to read the CSV file. Copy from the LOG into the code editor, clean up such as removing line numbers. Examine the properties which are set in the INFORMATS for variables. If you don't like the variable names you can do search and replace -carefully- to create different variable names. If values such as character values seem likely to be longer than you see you can make the character informats longer. Typically I set them 20 to 25% longer unless there is a document that indications just how many characters a value should have.
Test the code.
Data steps have several advantages. One is you can specify the FIRSTOBS if the data actually starts on a row other than the second. Another is specifying the exact informat to read data, which with some practice is a very powerful tool for validating values; making consistent values, or just plain writing custom messages to the log such as variables with missing values that are not supposed ever be missing.
Then you can read other files by changing the INFILE statement to point to a different file and change the SAS data set name to match as desired.
I do this to literally dozens of files every month because Excel files cannot be trusted. I receive Excel files from a contractor that gets paid thousands of dollars but will have account identifiers formatted as dates or currency amounts, has dates formatted as Currency, and guess what- currency formatted as dates. So my code throws errors when certain of these occur.
And depending on which file the first line of actual data could be row 6, or 9, or 7 or 8 or ... So I can specify where the data starts regardless of the stupid changing number of header rows.
I might suggest going back to the step where the data is created or read into SAS and ensuring the data is created as needed initially. Prevention is usually less work then curing bad data.
Once the variable is created there really isn't any other approach than either an input or put to a new variable.
without knowing the specifics of the needed data that's about it. You should know the specific format needed with a put function to create a character variable we can't. Similar to create a numeric from character you would know the current values and have a chance of specifying the informat to use with the input function.
@Yughaber wrote:
so my problem currently is that I have a number of excel files with dates and times. When it comes time to match and merge these separate files by ID numbers, I get an error that in some files these dates are defined as character and other files as numeric. This is what I was trying to change because going back to the excel file the cell format is all the same for everything as a date and time.
So your actual underlying issue is Excel file format. Color me not surprised, that is probably the single most frequent problem asked about on this forum.
If you rely on Proc Import or any wizard to bring the data into SAS then every file is processed differently and you have little control.
If the files are supposed to have the same content you can make things more consistent by 1) saving the files to CSV from Excel and write a data step to read the CSV files consistently, just changing the input file name and output SAS data set name.
Proc Import of a CSV can help you there but it is a good idea to use a set many records and the option GUESSINGROWS=MAX if your write code or specify a large value in any wizard for rows to examine.
Proc import will create data step code used to read the CSV file. Copy from the LOG into the code editor, clean up such as removing line numbers. Examine the properties which are set in the INFORMATS for variables. If you don't like the variable names you can do search and replace -carefully- to create different variable names. If values such as character values seem likely to be longer than you see you can make the character informats longer. Typically I set them 20 to 25% longer unless there is a document that indications just how many characters a value should have.
Test the code.
Data steps have several advantages. One is you can specify the FIRSTOBS if the data actually starts on a row other than the second. Another is specifying the exact informat to read data, which with some practice is a very powerful tool for validating values; making consistent values, or just plain writing custom messages to the log such as variables with missing values that are not supposed ever be missing.
Then you can read other files by changing the INFILE statement to point to a different file and change the SAS data set name to match as desired.
I do this to literally dozens of files every month because Excel files cannot be trusted. I receive Excel files from a contractor that gets paid thousands of dollars but will have account identifiers formatted as dates or currency amounts, has dates formatted as Currency, and guess what- currency formatted as dates. So my code throws errors when certain of these occur.
And depending on which file the first line of actual data could be row 6, or 9, or 7 or 8 or ... So I can specify where the data starts regardless of the stupid changing number of header rows.
@Yughaber wrote:
Thank you so much for your detailed explanation and guidance. I really appreciate your time you spent to help me.
As an experienced SAS user, how do you think I can gain more knowledge to be able to practice with this specific step. I feel like there is a lot that I am unaware of when working with CSV files that sometimes I get confused with what I find in my google searches. Is there like any specific place I can learn from that you would recommend? Thank you.
When dealing with SAS the best source is often the SAS documentation. Bookmark https://documentation.sas.com or visit and spend some time searching through the pieces you want, or if you have online links available in your SAS session. The DATA step programming is the core bit for data manipulation and the Base SAS Procedures will have most of what you need to get started. Most of the procedures have one or more complete examples of what the code should be for some common situations and often provides data or an example file to read as data. Often the examples are provided as code and then broken down by the key steps. Not every option is used in every example but you should find likely hints in the lists of options for each statement.
I learned SAS with paper manuals and the indexes in the manuals and the advent of online searchable help files makes things much nicer.
You can also search this forum with key words and likely find an example or ten of similar problems. There are literally 1000's of acknowledged answered questions on the forum and many that the original poster did not mark as solved.
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.