DATA Step, Macro, Functions and more

Append multiple dataset

Reply
Contributor
Posts: 60

Append multiple dataset

base data set and append data set having different variable length and force option is not working.Is there any solution to get proper variable length?

Super User
Posts: 10,018

Re: Append multiple dataset

Posted in reply to india2016

proc sql;

create table want as

 select *

  from one 

union all corr

 select *

  from two;

quit;

Super User
Posts: 19,770

Re: Append multiple dataset

Posted in reply to india2016

What does not working mean? 

Contributor
Posts: 60

Re: Append multiple dataset

Posted in reply to india2016

Book1

 

Gender
female
male

 

 

Book2

 

Gender
female
transgender
male

 

 

 I have this two datasets in excel format.After importing these two excel sheets I want to append book1 & book2.

If I use append table in SAS EG then I got  "transg"  which takes the length of "female" from book2.

 

What should I do to get the right result?("transg" as transgender)

 

Super User
Posts: 19,770

Re: Append multiple dataset

Posted in reply to india2016

Set dataset 2 as the base and dataset 1 as the data. 

 

Ideally, make your base table with desired characteristics (length, type, format) first and use that in proc append. 

Contributor
Posts: 60

Re: Append multiple dataset

but what if there are many tables and if we don't know characteristics?
Super User
Posts: 19,770

Re: Append multiple dataset

Posted in reply to india2016

1. Compare all datasets and determine max length required using Dictionary tables. Create a Base dataset with these characteristics that's empty to start. 

2. Append using a data step or SQL union. You may still run into truncation issues. 

3. Create Base data set with arbitrarily large lengths and truncate later. 

 

I would suggest #1. 

Contributor
Posts: 60

Re: Append multiple dataset

we have 128 excel files in each file there are 20 variables? 

Super User
Posts: 11,343

Re: Append multiple dataset

Posted in reply to india2016

india2016 wrote:

we have 128 excel files in each file there are 20 variables? 


This tells me that you are likely using Proc Import or similar that lets EXCEL controll the reported lengths of variables.

One approach if you know, or have a reasonable guess as to the lengths the values should take is to make a dummy set with the maximum length and append each one to that.

 

Data dummy;

   length var1 $ 25 var2 var3 $ 10 ;

   output;

run;

will create a data set with 3 variables with the lengths indicated.

 

I you have many data sets and the structures are the same you may want to use a SET statement to combine all of them at the same time:

 

data want;

     set DUMMY

           dataset1

           datasetq

           datasetabc;

run;

 

If you name the sets nicely you can even use a short cut list:

 

data want;

    set dummy dataset1 - dataset20;

run;

Valued Guide
Posts: 505

Re: Append multiple dataset

Taking care to get the maximum length of a excel string before importing

Taking care to get the maximum length of a excel string before importing

inspired by
https://goo.gl/fiZXXR
https://communities.sas.com/t5/Base-SAS-Programming/Append-multiple-dataset/m-p/344558

Issue
I have this two datasets in excel format.After importing these
two excel sheets I want to append book1 & book2.
If I use append table in SAS EG then I got  "transg"
which takes the length of "female" from book2.


You may sometimes get away without specifying character length, but it is
better to passthru and check the lengths. I suspect
odbc passthru may only support length upto 255 bytes.
No sure of OLE-DB. You ned to passthru for this issue?

HAVE ( two sheets in one workbook )
====================================


d:/xls/books.xlsx sheet BOOK1

    +------------------+
    |        A         |
    +------------------+
1   |      BOOK        |
    +------------------+
2   |     Gender       |
    +-------------------
3   |     female       |
    +-------------------
4   |      male        |
    +-------------------

[BOOK1]


d:/xls/books.xlsx sheet BOOK2

    +------------------+
    |        A         |
    +------------------+
1   |      BOOK        |
    +------------------+
2   |     Gender       |
    +-------------------
3   |     female       |
    +-------------------
4   |  transgender     |
    +-------------------
4   |      male        |
    +-------------------

[BOOK2]

WANT   (following SAS dataset )
================================

p to 40 obs from want total obs=7

bs    BOOK

1     Gender
2     female
3     male
4     Gender
5     female
6     transgender
7     male

WORKING CODE

      connect to excel (Path="d:\xls\books.xlsx");
      select max(len(book)) as lenmax

*                _                  _       _
 _ __ ___   __ _| | _____        __| | __ _| |_ __ _
| '_ ` _ \ / _` | |/ / _ \_____ / _` |/ _` | __/ _` |
| | | | | | (_| |   <  __/_____| (_| | (_| | || (_| |
|_| |_| |_|\__,_|_|\_\___|      \__,_|\__,_|\__\__,_|

;


%utlfkil(d:/xls/books.xlsx); * delete if exists;
libname xel "d:/xls/books.xlsx";

data xel.book1;
  informat book $11.;
  input Book ;
cards4;
Gender
female
male
;;;;
run;quit;


data xel.book2;
  informat book $11.;
  input Book ;
cards4;
Gender
female
transgender
male
;;;;
run;quit;

*          _       _   _
 ___  ___ | |_   _| |_(_) ___  _ __
/ __|/ _ \| | | | | __| |/ _ \| '_ \
\__ \ (_) | | |_| | |_| | (_) | | | |
|___/\___/|_|\__,_|\__|_|\___/|_| |_|

;

* notes
   you need bckslashes for this to work d:\xls\books.xlsx
   if uncomfortable with DOSUBL just pull the code out
   and run outside the datastep
;


%symdel maxlen;
libname xel "d:/xls/books.xlsx";
data want;

  if _n_=0 then do;
    %let rc=%sysfunc(dosubl(%nrbquote(
        proc sql dquote=ansi;
           connect to excel (Path="d:\xls\books.xlsx");
             select max(lenmax) into :maxlen separated by ' ' from connection to Excel
                 (
                  Select
                       max(len(book)) as lenmax
                  from
                       book1
                  union
                  select
                       max(len(book)) as lenmax
                  from
                       book2
                 );
             disconnect from Excel;
        quit;
        libname xel "d:/xls/books.xlsx";
        run;quit;
       )));
   end;

   set
     xel.book1(dbsastype=(book="char(&maxlen.)"))
     xel.book2(dbsastype=(book="char(&maxlen.)"))
   ;
run;quit;

libname xel clear;



                Variables in Creation Order

#    Variable    Type    Len    Format    Informat    Label

1    BOOK        Char     11    $6.       $6.         BOOK



Valued Guide
Posts: 505

Re: Append multiple dataset

Posted in reply to rogerjdeangelis
If you do not have named ranges book1, book2 use
'[shee1$]' within the passthru
Ask a Question
Discussion stats
  • 10 replies
  • 241 views
  • 3 likes
  • 5 in conversation