Desktop productivity for business analysts and programmers

append all datasets from 1 library with new variable per set

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 6
Accepted Solution

append all datasets from 1 library with new variable per set

Dear all,

 

I'm trying to do something smart but I'm failing and do not know how to fix it. I have (several) xlsx files with multimple sheets. For this example I'm importing 1 xlsx file and only to sheets ( in reality it has 5).

 

Alfter the import I want to append them so that I can transpose them later from 1 table. I first tried the set option but I do not know how to make it flexible, since I do not want to manually add the sheet names one more (OPTION 1).

 

Then I've found the code for OPTION 2, but then I get a warning that not all my variables are added  due to : Variable was not found on BASE file. The variable will not be added to the BASE file. (see attachment)

 

Any suggestions?

/* Import all sheets from excel file. For this example only 2 */


%macro imp(sheet = );

PROC IMPORT OUT=SAS_TMP.&sheet

 datafile = "\\DWCPMCA\Data\1_INPUTDOCS\IMPDOC\IL.xlsx"

DBMS=xlsx REPLACE;

 sheet="&sheet.";

GETNAMES=YES;

run;

%mend imp;

%imp(sheet=V100_IL)

%imp(sheet=V350_IL)




/* OPTION 1: append using set */


data SAS_TMP.want;

set sas_tmp.V100_IL sas_tmp.V350_IL;

run; 




/* OPTION 2: using proc append */


proc sql;

select count(*) into : obs from sashelp.vtable

where libname = 'SAS_TMP';

%LET OBS=&OBS;

SELECT catx('.', 'SAS_TMP', memname) INTO : TAB1-:TAB&OBS FROM sashelp.vtable

where libname = 'SAS_TMP';

QUIT;

%MACRO append;

%DO i=1 %to &obs;

proc append base=SAS_TMP.tot_comb data=&&tab&i force;run;

%end;

%mend;

%append;

proc print data=SAS_TMP.tot_comb;

run; 

Accepted Solutions
Solution
‎01-10-2017 03:24 PM
Esteemed Advisor
Esteemed Advisor
Posts: 7,249

Re: append all datasets from 1 library with new variable per set

Hi,

 

So:

Opt1:

/* Assumes all dataset in library sas_tmp with prefix V are to be set together - note the ":"

data SAS_TMP.want;

  set sas_tmp.V:;       

run;

 

For your code:

%macro imp(sheet = );
  proc import out=sas_tmp.&sheet. datafile="\\dwcpmca\data\1_inputdocs\impdoc\il.xlsx" dbms=xlsx replace;
    sheet="&sheet.";
    getnames=yes;
  run;
%mend imp;
%imp(sheet=v100_il);
%imp(sheet=v350_il);

You could also replace this with

libname tmp excel "\\dwcpmca\data\1_inputdocs\impdoc\il.xlsx";

ASsuming you have SAS 9.4.  Then you could access the sheets directly from tmp library.

 

For this code:

proc sql;
  select  count(*) 
  into    :OBS from SASHELP.VTABLE
  where   LIBNAME='SAS_TMP';
  select  catx('.','SAS_TMP',MEMNAME)
  into    :TAB1-:TAB&OBS.
  from    SASHELP.VTABLE
  where   LIBNAME='SAS_TMP';
quit;
 
%macro append;
  %do i=1 %to &obs.;
    proc append base=sas_tmp.tot_comb data=&&tab&i force;
    run;
  %end;
%mend;
%append;

You can simplify this by:

data _null_;
  set sashelp.vtable (where=(libname="SAS_TMP"));
  call execute(cats('proc append base=sas_tmp.tot_comb data=',memname,' force;run;'));
run;

However, this still doesn't cover the inherent problems with a) Excel, b) Import methods.  

View solution in original post


All Replies
Esteemed Advisor
Esteemed Advisor
Posts: 7,249

Re: append all datasets from 1 library with new variable per set

Seriously, I mean this in the best possible way, how am I supposed to read that mess of text?  Follow some sort of good programming practice, lower case code, new line for each item, indent where necessary etc.

 

First question, how will you know that each sheet is the same strcuturally and after import?  I can guarantee you that importing data from Excel - which is a terrible data medium - will give you different results every time you do it.  Also using proc import - which is a guessing procedure - may guess your data differently each time.  Hence there is 99% chance that each sheet at every import will have a different strcuture.  This will cause you problems appending the data.

 

My suggestion, get your data out into a proper data transfer format - csv/delimted text, xml etc.  Then write a datastep import and specify the data import structure - that you know as its your data.  That way you can reduce the possiblilties that the data is different (either content or structural). 

 

Once you have sorted that part of the process out, then you can look at appending all the data, simply querying the sashelp.vtable view you can identify all the datasets, then loop over them and append them.  you could also at the import stage read all the files at once using wildards for instance.

Super User
Posts: 1,243

Re: append all datasets from 1 library with new variable per set

I don't see any ERROR message in your log you posted.

What is the problem ? 

You used the FORCE option in PROC APPEND so all new variables should apear in the agregated output dataset.

 

The only problem can be that text variables could be trancated to their first length occirence.

Super User
Super User
Posts: 6,365

Re: append all datasets from 1 library with new variable per set

The FORCE option just let's you append from dataset that have extra variables. It does NOT add the variables, that would totally go against the point of PROC APPEND it that it would require re-writing the BASE dataset.  If you want to add new variables just use a data step instead of PROC APPEND.

 

data want ;
   set BASE DATA ;
run;
Grand Advisor
Posts: 10,229

Re: append all datasets from 1 library with new variable per set


Tom wrote:

The FORCE option just let's you append from dataset that have extra variables. It does NOT add the variables, that would totally go against the point of PROC APPEND it that it would require re-writing the BASE dataset.  If you want to add new variables just use a data step instead of PROC APPEND.

 

data want ;
   set BASE DATA ;
run;

And if you have imported all of the sets then

data want ;
   set BASE DATA1 data2 data3 ... datan ;
run;

And when you get the almost inevitable errors or warnings about mismatched data types or lengths of varaibles refer to @RW9's post.

 

Occasional Contributor
Posts: 6

Re: append all datasets from 1 library with new variable per set

Hi RW9,

I do not know what happend with my formatting yesterday. Here again the code. Hopefully this time it is readable. I will meanwhile read the proposed solutions.

 

/* Import all sheets from excel file. For this example only 2 */

%macro imp(sheet = );

PROC IMPORT OUT=SAS_TMP.&sheet

datafile = "\\DWCPMCA\Data\1_INPUTDOCS\IMPDOC\IL.xlsx"

DBMS=xlsx REPLACE;

sheet="&sheet.";

GETNAMES=YES;

run;

%mend imp;

%imp(sheet=V100_IL)

%imp(sheet=V350_IL)

 

/* OPTION 1: append the imported sheets using set */ << HOW can I make the set more automated?>>

data SAS_TMP.want;

set sas_tmp.V100_IL sas_tmp.V350_IL;

run;

 

 

/* OPTION 2: using proc append and reading all the imported sheets into my SAS_TMP lib */

proc sql;

select count(*) into : obs from sashelp.vtable

where libname = 'SAS_TMP';

%LET OBS=&OBS;

SELECT catx('.', 'SAS_TMP', memname) INTO : TAB1-:TAB&OBS FROM sashelp.vtable

where libname = 'SAS_TMP';

QUIT;

 

%MACRO append;

%DO i=1 %to &obs;

proc append base=SAS_TMP.tot_comb data=&&tab&i force;run;

%end;

%mend;

%append;

 

Solution
‎01-10-2017 03:24 PM
Esteemed Advisor
Esteemed Advisor
Posts: 7,249

Re: append all datasets from 1 library with new variable per set

Hi,

 

So:

Opt1:

/* Assumes all dataset in library sas_tmp with prefix V are to be set together - note the ":"

data SAS_TMP.want;

  set sas_tmp.V:;       

run;

 

For your code:

%macro imp(sheet = );
  proc import out=sas_tmp.&sheet. datafile="\\dwcpmca\data\1_inputdocs\impdoc\il.xlsx" dbms=xlsx replace;
    sheet="&sheet.";
    getnames=yes;
  run;
%mend imp;
%imp(sheet=v100_il);
%imp(sheet=v350_il);

You could also replace this with

libname tmp excel "\\dwcpmca\data\1_inputdocs\impdoc\il.xlsx";

ASsuming you have SAS 9.4.  Then you could access the sheets directly from tmp library.

 

For this code:

proc sql;
  select  count(*) 
  into    :OBS from SASHELP.VTABLE
  where   LIBNAME='SAS_TMP';
  select  catx('.','SAS_TMP',MEMNAME)
  into    :TAB1-:TAB&OBS.
  from    SASHELP.VTABLE
  where   LIBNAME='SAS_TMP';
quit;
 
%macro append;
  %do i=1 %to &obs.;
    proc append base=sas_tmp.tot_comb data=&&tab&i force;
    run;
  %end;
%mend;
%append;

You can simplify this by:

data _null_;
  set sashelp.vtable (where=(libname="SAS_TMP"));
  call execute(cats('proc append base=sas_tmp.tot_comb data=',memname,' force;run;'));
run;

However, this still doesn't cover the inherent problems with a) Excel, b) Import methods.  

Occasional Contributor
Posts: 6

Re: append all datasets from 1 library with new variable per set

Thanks. This helps me further. The XLS en Import problems will be tackled in a next phase.
☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 7 replies
  • 235 views
  • 3 likes
  • 5 in conversation