Good afternoon everyone! I have an excel file with 150 sheets/tabs. I want to import all the data into SAS under one dataset. I only need 6 columns (A through F) from all sheets and they are all named the same. I tried playing around with the following code but its not giving me much result.
%macro pim(sheet);
proc import out= payment
datafile = 'E:\SAS DOC\project\Credit Banking.xls'
dbms = Excel;
sheet = "&sheet";
getnames = yes;
run;
Thanks so much in advance!
Thanks guys, but after some digging I realized its much easier to solve this issue at excel level. In case anyone is interested, I used the following macro in excel to combine all sheets into one combined sheet.
Sub Combine()
Dim J As Integer
On Error Resume Next
Sheets(1).Select
Worksheets.Add
Sheets(1).Name = "Combined"
Sheets(2).Activate
Range("A1").EntireRow.Select
Selection.Copy Destination:=Sheets(1).Range("A1")
For J = 2 To Sheets.Count
Sheets(J).Activate
Range("A1").Select
Selection.CurrentRegion.Select
Selection.Offset(1, 0).Resize(Selection.Rows.Count - 1).Select
Selection.Copy Destination:=Sheets(1).Range("A65536").End(xlUp)(2)
Next
End Sub
Can you convert your file to xlsx? Or are you stuck with an xls file?
Can you set up a libname to your excel file? If so, you can use something like the following to move the sheets over.
*This code will create a libname reference to an Excel file and list all Sheets in the results window;
libname sample excel 'C:\Temp\Sample_v2.xls'; *your statement will depend on your OS/Excel version;
proc contents data=sample._all_;
run;
proc datasets lib=work nodetails nolist;
copy in=sample out=work;
run;quit;
libname sample;
Thanks. I have a xlsx file. So i should just run the code you gave to get what I want? I mean i dont see any conditons regarding getting only 6 columns from each of 150 sheets-A through F. Also, what if the column names are not the same in some sheets-meaning a little spelling difference or extra space?
PROC IMPORT doesn't provide a way to modify those, except as data step options. Because Excel doesn't restrict types in columns there's no definite way a field will be read. It's a guessing procedure.
You're best off just appending all the data and renaming afterwards, and yes the difference in names will through things off.
You're guaranteed to have issues with types so you'll have to manage that as well.
Hi Reeza, just following up-
I managed to have only first 6 columns in each sheet now. The column names are mostly the same except may be 5-10 sheets out of 150 sheets have column names little different. Unfortunately there is no way of knowing that. That surely throws off the code and I get so many errors. What if I tell SAS the exact column names myself, define them and have sas only import from second row onwards?
Just to summarize again what I want. I have an xlsx file with 155 sheets. Each sheet has 6 columns and some data (character or date) below these columns. Column names are mostly the same, but few sheets have column names bit different (additional character or space). I want a SAS dataset where all the data is appended below these 6 columns.
Thanks so much!
In Excel you can't control what gets imported at that level.
You can use GETNAMES=NO + DATAROW/FIRSTROW in your PROC IMPORT and then everything gets imported with default names. Those should be the same across all sheets. And then you can append them all, and RENAME afterwards.
Thanks Reeza. So, just to be clear, what should the code look like?
I modified this macro like this-
%macro pim(sheet);
proc import out= payment
datafile = 'O:\dev\test.xlsx'
dbms = Excel;
sheet = "&sheet";
getnames = NO + DATAROW/FIRSTROW ;
run;
%macro pim(sheet);
What should the code after this be?
No...DATAROW/FIRSTROW are options. Check the documentation please.
You could also write some code using SASHELP.VCOLUMN to dynamically name all variables are the same.
Importing all tabs/sheets from single excel file into one single SAS dataset
If you do not have named ranges, which is the default when you create
a sheet with the libname engine, you can use "&sheet"n where sheet =class0$
https://goo.gl/57cALI
https://communities.sas.com/t5/SAS-Procedures/Importing-all-tabs-sheets-from-single-excel-file-into-one-single/m-p/347878
HAVE a excel workbook with three sheets (three splits od sashelp.class)
d:/xls/tree.xlsx
CLASS0 sheet
+-------------------------+------+-----------+------------
| | A | B | C | D | E |
+------+-----------+------+------+-----------+-----------+
| | | | | | |
| 1 | NAME | AGE| SEX | HEIGHT | WEIGHT |
| 2 | Alfred | 14 | M | 55 | 96 |
| 3 | Alice | 13 | F | 44 | 87 |
| ... | ... | ... | ... | .. | ... |
+------------------+------+------+-----------+-----------+
[CLASS0]
....
CLASS2 sheet
+-------------------------+------+-----------+------------
| | A | B | C | D | E |
+------+-----------+------+------+-----------+-----------+
| | | | | | |
| 1 | NAME | AGE| SEX | HEIGHT | WEIGHT |
| 2 | Ronald | 14 | M | 55 | 96 |
| 3 | William | 13 | F | 44 | 87 |
| ... | ... | ... | ... | .. | ... |
+------------------+------+------+-----------+-----------+
[CLASS2]
WANT (one SAS dataset)
====
Up to 40 obs from allsheets total obs=19
Obs NAME SEX AGE HEIGHT WEIGHT
1 Barbara F 13 65.3 98.0
2 James M 12 57.3 83.0
3 Jeffrey M 13 62.5 84.0
4 Judy F 14 64.3 90.0
5 Philip M 16 72.0 150.0
6 Thomas M 11 57.5 85.0
7 Alfred M 14 69.0 112.5
8 Carol F 14 62.8 102.5
9 Jane F 12 59.8 84.5
10 John M 12 59.0 99.5
11 Louise F 12 56.3 77.0
12 Robert M 12 64.8 128.0
13 William M 15 66.5 112.0
14 Alice F 13 56.5 84.0
15 Henry M 14 63.5 102.5
16 Janet F 15 62.5 112.5
17 Joyce F 11 51.3 50.5
18 Mary F 15 66.5 112.0
19 Ronald M 15 67.0 133.0
* _ _ _
_ __ ___ __ _| | _____ __| | __ _| |_ __ _
| '_ ` _ \ / _` | |/ / _ \_____ / _` |/ _` | __/ _` |
| | | | | | (_| | < __/_____| (_| | (_| | || (_| |
|_| |_| |_|\__,_|_|\_\___| \__,_|\__,_|\__\__,_|
;
* create a excel workbook with three sheets
* just in case you rerun;
%symdel sheets sheet / nowarn;
proc datasets library=work kill;
run;quit;
%utlfkil(d:/xls/three.xlsx);
libname xel "d:/xls/three.xlsx";
data xel.class0 xel.class1 xel.class2;
set sashelp.class;
select (mod(_n_,3));
when (0) output xel.class0;
when (1) output xel.class1;
when (2) output xel.class2;
*leave off otherise to force error in not inclusive;
end;
run;quit;
libname xel clear;
/*
NOTE: There were 19 observations read from the data set SASHELP.CLASS.
NOTE: The data set XEL.class0 has 6 observations and 5 variables.
NOTE: The data set XEL.class1 has 7 observations and 5 variables.
NOTE: The data set XEL.class2 has 6 observations and 5 variables.
NOTE: DATA statement used (Total process time):
real time 0.02 seconds
cpu time 0.03 seconds
*/
* _ _ _
___ ___ | |_ _| |_(_) ___ _ __
/ __|/ _ \| | | | | __| |/ _ \| '_ \
\__ \ (_) | | |_| | |_| | (_) | | | |
|___/\___/|_|\__,_|\__|_|\___/|_| |_|
;
libname xel "d:/xls/three.xlsx";
data _null_;
if _n_=0 then do;
rc=%sysfunc(dosubl('
proc sql;
select
quote(trim(memname))
into
:sheets separated by ","
from
sashelp.vtable
where
libname="XEL"
and index(memname,"$")=0
;quit;
'));
end;
x=resolve('&sheets');
put x=;
length sheet $32;
do sheet=&sheets.;
call symputx('sheet',sheet);
rc=dosubl('
/* handles length issues better than proc append */
data allsheets;
set xel.&sheet;
run;quit;
');
end;
run;quit;
SYMBOLGEN: Macro variable SHEETS resolves to "class0","class1","class2"
X="class0","class1","class2"
SYMBOLGEN: Macro variable SHEET resolves to class0
NOTE: There were 7 observations read from the data set XEL.class0.
NOTE: The data set WORK.ALLSHEETS has 7 observations and 5 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
user cpu time 0.00 seconds
system cpu time 0.00 seconds
memory 3663.53k
OS Memory 17644.00k
Timestamp 04/06/2017 03:42:39 PM
Step Count 177 Switch Count 0
SYMBOLGEN: Macro variable SHEET resolves to class1
NOTE: There were 6 observations read from the data set XEL.class1.
NOTE: The data set WORK.ALLSHEETS has 6 observations and 5 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
user cpu time 0.00 seconds
system cpu time 0.00 seconds
memory 3663.53k
OS Memory 17644.00k
Timestamp 04/06/2017 03:42:39 PM
Step Count 177 Switch Count 0
SYMBOLGEN: Macro variable SHEET resolves to class2
NOTE: There were 6 observations read from the data set XEL.class2.
NOTE: The data set WORK.ALLSHEETS has 6 observations and 5 variables.
NOTE: DATA statement used (Total process time):
real time 0.02 seconds
user cpu time 0.00 seconds
system cpu time 0.01 seconds
memory 3663.53k
OS Memory 17644.00k
Timestamp 04/06/2017 03:42:39 PM
Step Count 177 Switch Count 0
So you have a large XLSX file with multiple sheets?
You can point a libref at the file and use PROC CONTENTS (or other tools) to get the names of the datasets and variables.
libname mydata xlsx 'mydata.xlsx';
proc contents data=mydata._all_ noprint out=contents(where=(varnum <= 6)); run;
proc sort;
by memname varnum ;
run;
Do you know what is supposed to be in the first 6 columns?
Perhaps you can just first make a skeleton dataset with the right format.
data want ;
attrib var1 length=$30 label='Name';
attrib var2 length=8 format=yymmdd10. label='Date';
....
stop;
run;
Then you could use you contents dataset to write code to append the datasets together.
Perhaps using PROC SQL INSERT statement?
filename code temp;
data _null_;
set contents;
by memname ;
file code ;
if first.memname then put
'insert into want select ' name @
;
else put ',' name @ ;
if last.memname then put 'from ' libname +(-1) '.' memname ';' ;
run;
proc sql;
%include code / source2;
quit;
Thanks guys, but after some digging I realized its much easier to solve this issue at excel level. In case anyone is interested, I used the following macro in excel to combine all sheets into one combined sheet.
Sub Combine()
Dim J As Integer
On Error Resume Next
Sheets(1).Select
Worksheets.Add
Sheets(1).Name = "Combined"
Sheets(2).Activate
Range("A1").EntireRow.Select
Selection.Copy Destination:=Sheets(1).Range("A1")
For J = 2 To Sheets.Count
Sheets(J).Activate
Range("A1").Select
Selection.CurrentRegion.Select
Selection.Offset(1, 0).Resize(Selection.Rows.Count - 1).Select
Selection.Copy Destination:=Sheets(1).Range("A65536").End(xlUp)(2)
Next
End Sub
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.