Hi mate,
This code works as you need:
Use range option to specify the range cells and get the first table and the second, the following example i've made:
TABLE1 | ||||
COL1 | COL2 | COL3 | COL4 | COL5 |
A | B | C | D | E |
A | B | C | D | E |
A | B | C | D | E |
A | B | C | D | E |
A | B | C | D | E |
A | B | C | D | E |
A | B | C | D | E |
TABLE2 | ||||
COL1 | COL2 | COL3 | COL4 | COL5 |
A | B | C | D | E |
A | B | C | D | E |
A | B | C | D | E |
A | B | C | D | E |
A | B | C | D | E |
A | B | C | D | E |
A | B | C | D | E |
PROC IMPORT DATAFILE="/PATH/test.xls"
OUT = UPSTREAM
DBMS=XLS REPLACE;
SHEET="Sheet1";
GETNAMES=YES;
DATAROW=2;
RANGE="Sheet1$A1:E9";
RUN;
PROC IMPORT DATAFILE="/PATH/test.xls"
OUT = DOWNSTREAM
DBMS=XLS REPLACE;
SHEET="Sheet1";
GETNAMES=YES;
DATAROW=2;
RANGE="Sheet1$A11:E19";
RUN;
Hope this helps
Sorry, I should have been more clear. The first section, Table1, contains 'n' observations. I am importing from roughly 30 workbooks and n is rarely the same. So, using range= does not help me in this situation.
Ok,
I changed to use macro to get the values and separate the datasets:
PROC IMPORT DATAFILE="/path/test.xls"
OUT = FIRSTSTREAM
DBMS=XLS REPLACE;
SHEET="Sheet1";
GETNAMES=YES;
DATAROW=3;
RUN;
DATA HAVE;
SET FIRSTSTREAM;
X = _N_;
RUN;
PROC SQL;
SELECT X+2 INTO: FIRSTLINE FROM HAVE WHERE TABLE1 = "";
SELECT MAX(X) INTO: LASTLINE FROM HAVE;
QUIT;
PROC IMPORT DATAFILE="/path/test.xls"
OUT = UPSTREAM
DBMS=XLS REPLACE;
SHEET="Sheet1";
GETNAMES=YES;
DATAROW=2;
RANGE="Sheet1$A1:E&FIRSTLINE.";
RUN;
PROC IMPORT DATAFILE="/path/test.xls"
OUT = DOWNSTREAM
DBMS=XLS REPLACE;
SHEET="Sheet1";
GETNAMES=YES;
DATAROW=2;
RANGE="Sheet1$A&FIRSTLINE.:E&LASTLINE.";
RUN;
Could you post one of your Excel files? The proper approach depends a lot on data structure and layout.
Excellent point Mr Respected Advisor! I have a dummy workbook and I will place an image here as well.
Which shows my example nicely. What you have posted is a best case scenario, all the columns are text, and the same structure. Is your data really like that, I wouldn't think so. So what will you do if:
- col1 is missing for any reason within the data?
- numbers where previous table is characters
- different numbers of columns
These are a few things that spring to mind, not to mention hidded data, special characters, no fixed setup, so could change at further attempts etc. Why are you importing this data in the first place, why not access the Raw data this was produced from? I mentioned this in another topic, but one of the big reasons to avoid Excel as an output format is the general misconception that it is a data transfer format and can be used as such (unlike Word/PDF), hence we end up trying to read in output review files, which is what you are trying to do here, rather than go through a strcutured validated robust approach of:
- create data import specifcation
- get agreement on file format (using proper data transfer file formats = CSV, XML etc.)
- create import program based on specifcations
- run program on test data and validate
- run at each import occasion and validate
PROC IMPORT DATAFILE="/path/test.xls"
OUT = FIRSTSTREAM
DBMS=XLS REPLACE;
SHEET="Sheet1";
GETNAMES=YES;
DATAROW=3;
RUN;
DATA HAVE;
SET FIRSTSTREAM;
X = _N_;
RUN;
PROC SQL;
SELECT X+2 INTO: FIRSTLINE FROM HAVE WHERE TABLE1 = "";
SELECT MAX(X) INTO: LASTLINE FROM HAVE;
QUIT;
PROC IMPORT DATAFILE="/path/test.xls"
OUT = UPSTREAM
DBMS=XLS REPLACE;
SHEET="Sheet1";
GETNAMES=YES;
DATAROW=2;
RANGE="Sheet1$A1:E&FIRSTLINE.";
RUN;
PROC IMPORT DATAFILE="/path/test.xls"
OUT = DOWNSTREAM
DBMS=XLS REPLACE;
SHEET="Sheet1";
GETNAMES=YES;
DATAROW=2;
RANGE="Sheet1$A&FIRSTLINE.:E&LASTLINE.";
RUN;
This code do exactly what you need, and you don't need to specify which one is the line to breakup the tables.
You only have to change the where clause in proc sql.
Hope this helps
I wouldn't import it as is. The reason is that you will have text (downstrem) in a numeric field. Its messy at best, and assuming a missing value is the break could be misleading. Its another example of Excel being a problem when dealing with real data. I would suggest you break the table yourself into two sheets, and preferentially save each sheet as CSV and write an import program for each. This avoids the numerous issues when using Excel .
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.