Hi, I have an excel dataset with character variables in the format below:
2001 | 2001-01 | 25 | 2 |
None of them represent dates, but VAR2 is being automatically turned into a date when I import it into SAS. I am just using a proc import statement and trying to convert VAR2 back into its character format with the input statement, but it is not working. The SAS Output looks like this:
SAS OUTPUT:
VAR1 VAR2 VAR3 VAR4
2001 01FEB01:00:00:00 25 2
filename PR "&path.&filein";
PROC IMPORT OUT=work_&i
DATAFILE= "PR"
DBMS=csv REPLACE;
GETNAMES=no;
RUN;
data want_&i;
set work_&i;
_VAR2=input(VAR2,best.);
run;
Any help would be much appreciated! Thank you!
How values are exposed from Excel to SAS depends on rules in various places. Because Excel does not actually have any constraint on variable types SAS Proc Import has to make a lot of guesses depending on actual content. In this case it fails.
Save the data to a CSV file and write a data step to read as needed.
You may also want to open that Excel file and check the display properties for some of those cells. You might find that even though they look like characters to you they are actually dates. Right click on the cell, go to Format Cells. If you see that Custom is showing something like YYYY-MM then it is a date in Excel.
You mention Excel in your subject line, but your posted code is just reading a text file.
To control how SAS reads a text file just write the data step yourself instead of asking PROC IMPORT to make a guess about how you want the file read.
Note that it is more likely that Excel is the one that is converting the string like '2005-01' into a date than SAS. You should never let Excel open a delimited text file without controlling how it interprets the columns.
Hi, thank you both for your help!
What would be the better way of reading in the csv file, instead of the proc import statement?
@kmardinian wrote:
Hi, thank you both for your help!
What would be the better way of reading in the csv file, instead of the proc import statement?
If you read the log after running Proc Import on a csv file you will see generated data step code used to read the file.
Copy from the log and paste into the editor. Clean up the code to remove line numbers and such. Set the informat for the "not a date" variable to a character by using something like $10. Change the Format statement for that variable to match, or delete the format. $ type formats are mostly redundant for character variables.
Run the code.
If you are going to have multiple files to read then save the code. Then you only need to change the INFILE to match the new file and change the data set name to get sets with the same structure.
If the CSV file just has 4 variables then reading it extremely easy.
If you just want to create each variable as character string of maximum length of 10 bytes then use code like this:
data work_&i;
infile "&path.&filein" dsd truncover ;
input (var1-var4) (:$10.);
run;
For a more complicated file I would recommend first defining the variables using a LENGTH statement and then using a simple INPUT statement that just list the variables. If you define the variables in the order they appear in the text file then you can even use a position based variable list. For example if you want to call the four variables YEAR, MONTH, X and Y and you want to treat 3 of them as numbers then your code might look like this.
data work_&i;
infile "&path.&filein" dsd truncover ;
length year 8 month $7 x y 8;
input year -- y;
run;
2001-01 looks like a date value to me. Is it January 2001, or the first quarter of 2001?
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.