DATA Step, Macro, Functions and more

Importing a .xlsx file into SAS with header in 2 row and data in 4 row

Reply
Occasional Contributor
Posts: 17

Importing a .xlsx file into SAS with header in 2 row and data in 4 row

How can i import excel having .XLSX Extension  using proc import where header start in 2 row and data start in 4 row.I know keywords namerow and startrow used in xls but same is not working in xlsx.i dont want to use range option because i have diffrent excel and want to make code generic.

Trusted Advisor
Posts: 1,610

Re: Importing a .xlsx file into SAS with header in 2 row and data in 4 row

Gosh, I hate Excel. I understand why the documentation for R strongly discourages the use of Excel for data import.

 

Anyway, to solve your problem, in Excel, delete rows 1 and 3, and then use SAS PROC IMPORT.

Occasional Contributor
Posts: 17

Re: Importing a .xlsx file into SAS with header in 2 row and data in 4 row

Thank you but i dont want to delete anythng manually . I need a keyword that makes any row as header and data row.For ex-->for xls extension in proc import we use namerow=2 ,datarow=4 to make 2 row as header and 4 row from where data starts.

In my case there are many files and data is jumbled up.so i need keyword that i can use.

Trusted Advisor
Posts: 1,610

Re: Importing a .xlsx file into SAS with header in 2 row and data in 4 row

Sorry to inform you that I am not aware of a simple solution that will do what you ask.

 

Perhaps the good news is that some expert here has a solution and I will learn the solution as well as you.

Frequent Contributor
Posts: 93

Re: Importing a .xlsx file into SAS with header in 2 row and data in 4 row

https://communities.sas.com/t5/SAS-Procedures/variable-names-in-quot-row-2-quot-how-to-let-PROC-IMPO...

 

"- you can use import procedure with GETNAMES=NO, DATAROW=3, GUESSINGROWS= (default 20 - you may increase it depending on your file size) and after importing it use a data step or proc datasets to change the variables name."

 

So, to sum:

  1. Create two sas macro variables, &namerow = 2 and &datarow=4
  2. Import the data WITHOUT GETNAMES, so that the variable names will consistently be temp names
  3. Create an ID on your table using monotonic() or an incrementing data step
  4. Write a macro, or an elegant SAS INTO: statement, to rename the variables based on &namerow
  5. Write a data step that overwrites the current dataset, where the ID is less than &datarow
  6. Profit
Trusted Advisor
Posts: 1,610

Re: Importing a .xlsx file into SAS with header in 2 row and data in 4 row

In other words, exactly what I said, there is no easy way to do this.

Frequent Contributor
Posts: 93

Re: Importing a .xlsx file into SAS with header in 2 row and data in 4 row

 

Perhaps you are right, Paige.  

 

Tarun, if you find that this is not an easy solution and you would like some assistance I would be happy to assist!  I have conducted similar work before and it would be a pleasure for me to help you out.

 

 

Patrick

Occasional Contributor
Posts: 17

Re: Importing a .xlsx file into SAS with header in 2 row and data in 4 row

thank you thomp,yes i need some easy way like we do in xls so that i need to just change number for header row and start row
Frequent Contributor
Posts: 93

Re: Importing a .xlsx file into SAS with header in 2 row and data in 4 row

For your perusal.  I am sure there may be some efficiencies you could add but I only took about 20 min coding time with it.

 

Patrick

 

/*IMPORT YOUR DATA, WITH GETNAMES = NO AND MIXED=YES*/
PROC IMPORT OUT= WORK.HAVE 
            DATAFILE= "C:\Users\USERNAME\Documents\have.xls" 
            DBMS=EXCEL REPLACE;
     RANGE="Sheet1$";  
     GETNAMES=NO;
     MIXED=YES;
     SCANTEXT=YES;
     USEDATE=YES;
     SCANTIME=YES;
RUN;

/*SET YOUR MACRO VARIABLES*/
%LET NAMEROW = 2;
%LET DATAROW = 4;

/*INSERT THE ROWID*/
PROC SQL;
CREATE TABLE HAVE_NEW AS
SELECT MONOTONIC() AS ID, * FROM HAVE;
QUIT;

/*GET ALL OF THE COLUMN NAMES IN THE HAVE SET*/
PROC SQL;
CREATE TABLE COLNAMES AS
SELECT NAME FROM DICTIONARY.COLUMNS WHERE MEMNAME = 'HAVE_NEW' AND NAME NOT IN ('ID');
QUIT;

/*SET THE LENGTH OF THE BLANK NEWNAME COLUMN*/
DATA COLNAMES;
SET COLNAMES;
LENGTH NEWNAME $ 80;
FORMAT NEWNAME $8.; 
RUN;

/*CREATE A MACRO THAT WE WILL USE TO UPDATE NEWNAME WITH THE SPECIFIED CRITERIA*/
%MACRO ROWNAMES(COLNAME);
PROC SQL;
UPDATE COLNAMES SET NEWNAME = (SELECT &COLNAME FROM HAVE_NEW WHERE ID = &NAMEROW) WHERE NAME = '%STR(&COLNAME)';
QUIT;
%MEND;

/*EXECUTE THE MACRO FOR EACH OF THE OLD NAMES*/
DATA COLNAMES;
SET COLNAMES;
CALL EXECUTE("%ROWNAMES("||TRIM(NAME)||")");
RUN;

/*CREATE A MACRO VARIABLE WITH A LIST OF EACH OF THE OLD AND NEW COLUMN NAMES
, WHICH WE WILL USE IN THE RENAME STATEMENT*/
PROC SQL;
SELECT TRIM(NAME)||"="||TRIM(NEWNAME) INTO :NEWVAR SEPARATED BY " " FROM COLNAMES;
QUIT;

/*CREATE A MACRO VARIABLE WITH A LIST OF EACH OF THE NEW COLUMN NAMES
, WHICH WE WILL USE IN THE LABEL STATEMENT*/
PROC SQL;
SELECT TRIM(NEWNAME)||"="||TRIM(NEWNAME) INTO :NEWLABEL SEPARATED BY " " FROM COLNAMES;
QUIT;
 
/*CHANGE THE NAMES, AND THE LABELS, AND ONLY SELECT DATA THAT ARE GE OUR DATAROW*/
DATA HAVE_NEW;
SET HAVE_NEW(RENAME=(&NEWVAR));
WHERE ID >= &DATAROW;
LABEL &NEWLABEL;
RUN;
Super User
Posts: 9,671

Re: Importing a .xlsx file into SAS with header in 2 row and data in 4 row

How about .

 

 

proc import datafile='/folders/myfolders/have.xlsx' out=have1 dbms=xlsx replace;
run;

proc import datafile='/folders/myfolders/have.xlsx' out=want dbms=xlsx replace;
datarow=4;
run;

proc transpose data=have1(obs=1) out=temp;
 var _all_;
run;

data _null_;
 set temp end=last;
 if _n_=1 then call execute('proc datasets library=work nolist nodetails;modify want;rename ');
 call execute(catx('=',_name_,col1));
 if last then call execute(';quit;');
run;
Ask a Question
Discussion stats
  • 9 replies
  • 119 views
  • 0 likes
  • 4 in conversation