BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
rsuresh
Calcite | Level 5

I have a large excel file that I am importing to SAS; Row 1 has the variable names but rows 2-4 of the file have additional descriptive information (about the variables) that I do not want in my SAS dataset. Is there a way to exclude those 3 rows while importing or after importing the data onto SAS?

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

@rsuresh wrote:

I have a large excel file that I am importing to SAS; Row 1 has the variable names but rows 2-4 of the file have additional descriptive information (about the variables) that I do not want in my SAS dataset. Is there a way to exclude those 3 rows while importing or after importing the data onto SAS?


You are in luck.  The quirky way that the DATAROW= statement works in PROC IMPORT with the XLSX db engine will do exactly that.  You can tell it what row to start reading the data, but it always uses the first row to guess what names to give to the variables.  

proc import datafile='large.xlsx' dbms=xlsx
  out=want replace
;
  datarow=5;
run;

If the extra information was in rows 1 to 3 and row 4 had the column headers then you would have to work harder.

Example program:

proc import datafile='c:\downloads\skip.xlsx' dbms=xlsx
  out=skip replace
;
  datarow=3;
run;

Result:

Obs    X             Y    Z

 1     1    12/31/2021    Height

Picture of XLSX file:

Screenshot 2022-01-07 215026.jpg

View solution in original post

5 REPLIES 5
PaigeMiller
Diamond | Level 26

Best thing is to remove these rows in Excel.

--
Paige Miller
ballardw
Super User

If you are going to have multiple files with the same structure then one approach is to save the file to a CSV file format and write a data step to read the file.

The Infile statement that is used by the data step to identify the file and characteristics such as delimiters and logical record length include a FIRSTOBS option which means skip to the row of the file before attempting to read anything.

 

One big reason to use the CSV approach with multiple files is so you control the type and length of all the variables. Proc Import, or any wizard that uses such, makes separate decisions for each file imported. Which means that variables of the same name may change type or length from different files. Or in a poor data source environment may even mean that the variable names change. The code to read CSV would treat the columns in order the same for each file.

 

If you save your file to CSV and use Proc Import for that file (recommend the guessingrows=max; option) then SAS will create data step code that appears in the log. You can copy that code to the editor, clean it up save and reuse.
This will work better with a file without extra information as you describe. Once you have a nice program to read the file you can add the option for Firstobs to skip the header rows for the next file.

rsuresh
Calcite | Level 5
Unfortunately, I have multiple files and most of them have varying formats for themselves and the sheets within as well.
Tom
Super User Tom
Super User

@rsuresh wrote:

I have a large excel file that I am importing to SAS; Row 1 has the variable names but rows 2-4 of the file have additional descriptive information (about the variables) that I do not want in my SAS dataset. Is there a way to exclude those 3 rows while importing or after importing the data onto SAS?


You are in luck.  The quirky way that the DATAROW= statement works in PROC IMPORT with the XLSX db engine will do exactly that.  You can tell it what row to start reading the data, but it always uses the first row to guess what names to give to the variables.  

proc import datafile='large.xlsx' dbms=xlsx
  out=want replace
;
  datarow=5;
run;

If the extra information was in rows 1 to 3 and row 4 had the column headers then you would have to work harder.

Example program:

proc import datafile='c:\downloads\skip.xlsx' dbms=xlsx
  out=skip replace
;
  datarow=3;
run;

Result:

Obs    X             Y    Z

 1     1    12/31/2021    Height

Picture of XLSX file:

Screenshot 2022-01-07 215026.jpg

rsuresh
Calcite | Level 5
This worked perfectly, thank you so much!

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 4384 views
  • 1 like
  • 4 in conversation