I have a large excel file that I am importing to SAS; Row 1 has the variable names but rows 2-4 of the file have additional descriptive information (about the variables) that I do not want in my SAS dataset. Is there a way to exclude those 3 rows while importing or after importing the data onto SAS?
@rsuresh wrote:
I have a large excel file that I am importing to SAS; Row 1 has the variable names but rows 2-4 of the file have additional descriptive information (about the variables) that I do not want in my SAS dataset. Is there a way to exclude those 3 rows while importing or after importing the data onto SAS?
You are in luck. The quirky way the DATAROW= statement works in PROC IMPORT with the XLSX DBMS engine will do exactly that: you can tell it which row to start reading data from, but it always uses the first row to guess the variable names.
proc import datafile='large.xlsx' dbms=xlsx
  out=want replace;
  datarow=5;
run;
If the extra information were in rows 1 to 3 and row 4 had the column headers, you would have to work harder.
Example program:
proc import datafile='c:\downloads\skip.xlsx' dbms=xlsx
  out=skip replace;
  datarow=3;
run;
Result:
Obs    X    Y             Z
  1    1    12/31/2021    Height
(Picture of the XLSX file omitted.)
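For the harder layout mentioned above (extra information in rows 1 to 3, column headers in row 4), one possible sketch uses the RANGE= option of PROC IMPORT with DBMS=XLSX so that reading starts at the header row. The sheet name and cell range here are assumptions; adjust them to your file:

```sas
/* Sketch: junk in rows 1-3, headers in row 4, data from row 5 on.      */
/* "Sheet1" and A4:D100 are assumptions -- change to match your file.   */
proc import datafile='c:\downloads\skip.xlsx' dbms=xlsx
  out=want replace;
  range="Sheet1$A4:D100";  /* headers are read from row 4 */
run;
```

With GETNAMES=YES (the default), the first row of the range supplies the variable names.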
Best thing is to remove these rows in Excel.
If you are going to have multiple files with the same structure, one approach is to save each file in CSV format and write a data step to read it.
The INFILE statement used by the data step to identify the file and its characteristics, such as delimiters and logical record length, includes a FIRSTOBS= option that tells SAS to skip to that row of the file before attempting to read anything.
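A minimal sketch of such a data step, assuming a CSV whose headers are in row 1 and whose data start in row 5 (the file name, variable names, and informats are assumptions, not taken from your file):

```sas
/* Sketch: skip the header row plus three descriptive rows (rows 1-4). */
data want;
  infile 'c:\downloads\large.csv' dlm=',' dsd truncover
         firstobs=5 lrecl=32767;
  length x 8 y 8 z $20;          /* you control every type and length */
  input x y :mmddyy10. z $;
  format y mmddyy10.;
run;
```

Because the data step names the columns in order, every file with the same layout is read identically.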
One big reason to use the CSV approach with multiple files is that you control the type and length of every variable. PROC IMPORT, or any wizard that uses it, makes separate decisions for each file imported, which means that variables of the same name may change type or length from file to file; in a poor data-source environment the variable names themselves may even change. Data step code that reads the CSV treats the columns, in order, the same way for every file.
If you save your file to CSV and run PROC IMPORT on that file (the GUESSINGROWS=MAX; option is recommended), SAS will generate data step code that appears in the log. You can copy that code to the editor, clean it up, save it, and reuse it.
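That PROC IMPORT call might look like the following sketch (the file name is an assumption):

```sas
/* Sketch: import the CSV once so SAS writes data step code to the log. */
proc import datafile='c:\downloads\large.csv' dbms=csv
  out=want replace;
  guessingrows=max;  /* scan the whole file before picking types/lengths */
run;
```

After it runs, look in the log for the generated DATA step and adapt that code for reuse.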
This works better with a file that does not contain the extra information you describe. Once you have a clean program that reads the file, you can add the FIRSTOBS= option to skip the header rows for the next file.