Is it possible, using PROC IMPORT, to read Excel files and force certain columns to be numeric in the resulting SAS data set; and/or force other columns to be character in the resulting SAS data set? If so, how? If not, are there other mechanisms in SAS that will allow me to achieve this?
Same questions regarding PROC EXPORT, can I force the resulting Excel file to contain text in specified columns, and/or contain character strings in other specified columns?
Thanks,
Paige
Editor's note: this is a popular topic, so we've consolidated several of the most helpful answers in a single reply.
This was posted on SAS-L from @snoopy369. Does it help address your problem?
With the DBMS=EXCEL or EXCELCS options, you can force SAS to read them as text using the DBDSOPTS option. It's something like this:
proc import file="whatever.xls" data=whatever dbms=excel replace;
DBDSOPTS= "DBTYPE=(var1='NUM(8)' var2='CHAR(3)')";
run;
I don't tend to remember the exact combination of ( and = and " for this, so the above is probably slightly off, but you can google it and get the right combination, or when I get into work i'll look at some code I have that does it properly.
Another approach from@Vimal_Kurup:
The one way i am handling this is by setting GETNAMES=NO and STARTROW=1. Since the column names are obviously characters, SAS would import all the data as character attributes. You can later remove the first obs and use the remainder of dataset.
From @Haikuo, an approach that doesn't use PROC IMPORT:
In EG though, you can manually change attributes of each variable when doing import or export.On a side note, here is good SUGI paper on dealing with the Excel files using SAS:
http://www2.sas.com/proceedings/sugi31/020-31.pdf
I doubt that. By using libname engine or pass-through, I think you can only force them all into character by using mixed=yes option, but I am not 100% sure. In EG though, you can manually change attributes of each variable when doing import or export.
Haikuo
On a side note, here is good SUGI paper on dealing with the Excel files using SAS:
http://www2.sas.com/proceedings/sugi31/020-31.pdf
Haikuo
One way to get total control over how an Excel spreadsheet is imported is to save it as a comma or tab-delimited file (CSV), Then you can use PROC IMPORT with an import type of CSV. This generates DATA step code that you can see in the SAS LOG. Copy this code back into the editor and then you can modifiy the INPUT statement to get all columns imported exactly as you want: character or numeric, size, formats labels etc.
Thanks. I am not generating these Excel files, so I do not have the option of creating them or saving them as CSV.
So how will you know which columns need to be text, but actually contain all numbers? Somehow the users have to provide that information, so that you can tell SAS.
I suggest you push back the issue to the users, having them agree to add a row of "data" that contains "Char" for each column that should be character. When SAS sees "Char", it will automatically make that column character.
Good luck.
PartNumber should always be read as character. Units should always be read as numeric, but sometimes it contains alpha. Etc.
The system has been in place for years, and isn't going to change because the statistician has trouble reading it with his software. Other people have developed Excel macros and VB programs and other programs to work with the data as it exists. They're not going to change for me. It is what it is.
I think I should push back on people who think I should push back on the creation of the data files. I think pushing back would be a waste of time.
OK. If "Units" should be numeric, but actually contains alpha, the SAS data set would hold a missing value if you force it to be numeric. Would that form part of an acceptable solution?
OK so you don't produce these Excel files, but why does that stop you creating your own CSV copy of the original spreadsheet using "File Save As"?
Another technique I have had some success with is to insert a "dummy" first data row in a spreadsheet and put characters in columns intended to be character and numbers where they are intended to be numbers.
However both options do require you to manually intervene.
If there was only one such Excel file, I could open it and change it to CSV. This is not the case. There are many Excel files, and I want a program to run unattended, overnight, on a regular schedule. I am not aware of a programmatic way to change Excel files to CSV.
PaigeMiller wrote:
There are many Excel files, and I want a program to run unattended, overnight, on a regular schedule. I am not aware of a programmatic way to change Excel files to CSV.
/*input*/ filename workbook "C:\Documents and Settings\&sysuserid\My Documents\Book1.xls";
/*output*/ filename txtout "C:\Documents and Settings\&sysuserid\My Documents";
data Sheets(keep=indx csvpath);
/* gather info */
length workbook txtout script filevar command $256;
workbook = pathname('WORKBOOK');
txtout = pathname('TXTOUT');
script = catx('\',pathname('WORK'),'SHEETS2TXT.vbs');
filevar = script;
/* write the script */
file dummy1 filevar=filevar;
put 'Const ' workbook=$quote256.;
put 'Const ' txtout=$quote256.;
put 'Const xlCSV = 6';
put 'Set objExcel = CreateObject("Excel.Application")';
*put 'Set objExcel = GetObject(, "Excel.Application")';
put 'With objExcel';
put +3 '.Visible = False';
put +3 '.DisplayAlerts = False';
put +3 'Set objWorkbook = .Workbooks.Open(workbook)';
put +3 'i = 0';
put +3 'Set colSheets = .WorkSheets';
put +3 'For Each objSheet In colSheets';
put +6 'i = i + 1';
put +6 'WScript.echo i & " " & txtout & "\" & objsheet.name & ".csv"';
put +6 'objSheet.SaveAs txtout & "\" & objSheet.name & ".csv", xlCSV';
put +6 'Next';
put +3 '.Application.Quit';
put +3 'End With';
/* close the script file by opening another, not used */
filevar = catx('\',pathname('WORK'),'DUMMY.vbs');
file dummy1 filevar=filevar;
/* look at the script, not necessary but may be useful */
infile dummy2 filevar=script end=eof;
do while(not eof);
input;
putlog _infile_;
end;
/* call the script */
command = catx(' ','cscript',quote(strip(script)),'//nologo');
infile dummy3 pipe filevar=command end=eof truncover;
do while(not eof);
input indx csvPath $256.;
output;
putlog _infile_;
end;
stop;
run;
filename workbook clear;
filename txtout clear;
proc print;
run;
Thanks, data _null_; as always you have a great answer.
I'm sure your solution works, however I think I will stick to a language that I am familiar with, and write a macro in SAS that converts certain variables from char to num, and other variables from num to char. A macro based on your code, which uses Excel macro code (I assume) means that I have a language that I cannot debug or enhance if needed, nor could I explain it to others, so this is a less desirable solution.
This is an opportunity to learn something new.
Have you considered the excel libname engine. I believe there are data set options that you can use to specify how an excel column should be interpreted by SAS.
Yes, I looked at the help for the Excel Libname option, but I didn't see anything that clearly let me do what I wanted. (That doesn't mean the features don't exist, they might be there, I just didn't see anything that looked like what I wanted in the help files)
when you run proc setinit do you see this.
---SAS/ACCESS Interface to PC Files
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.