BookmarkSubscribeRSS Feed
kjowers
Fluorite | Level 6

Hi All,

 

I'm a relatively new SAS 9.3 user, so please bear with me.

 

I have the following library structure:

 

 

libname jc_&jcname 'W:\JCs\JC Analyses 2016-2017\JC Data 2016-2017\Received from JCs\&jcname\JC_2016-2017_&jcname.xls';

There are 63 &jcname values, so it follows that there are 63 folders and 63 files, all with the same variables in the same order.

 

Can anyone please help me with extracting those files and stacking them into a single file? Had they been in the same location, this would have worked:

 

data all;
     set JC_2016_jcA
         JC_2016_jcB
         JC_2016_jcC
         JC_2016_jcD
         JC_2016_jcE
         JC_2016_jcF  ...
    ...  JC_2016_jcBJ;
run;

but, yeah, they're not...

 

Thanks in advance!!!

 

Doctor J
6 REPLIES 6
ballardw
Super User

Don't you get an error with that filename? Such as file or path doesn't exist?

Or did you actually mean to post:

 

libname jc_&jcname "W:\JCs\JC Analyses 2016-2017\JC Data 2016-2017\Received from JCs\&jcname.\JC_2016-2017_&jcname..xls";

 

The macro variable wouldn't resolve inside single quotes and without the periods added in red above you would get a macro error because it would think that you want to use &jcnameJC_2016-2017_ as a macro variable and without the second red period the period denoting the start of the file extension would have been resolved as concatenating &jcnamexls.

 

You likely also need something on the libname to indicate the engine to read the file.

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Well, firstly, no matter how properly you have defined the Excel files, or how similar they are, if you use a guessing import procedure (i.e. you don't specify data structure) from a bad data file format such as Excel, you will end up with data which is not the same. I am just mentioning this as you will encounter it going forward.

As for doing the code:

/* Import first one to get a base */
libname tmp excel "c:\folder\',JC_2016_jcA,'.xlsx"; 
data base'
  set tmp.JC_2016_jcA;
run;

data _null_;
  do f="JC_2016_jcb","JC_2016_jcc"...;
    call execute(cats('libname tmp excel "c:\folder\',f,'.xlsx"; proc append base=base data=tmp.',f,' force; run;'));
  end;
run;
kjowers
Fluorite | Level 6

Thank you -- I think I had kind of subconsciously ignored the potential for format disagreement across files. Do you have a suggestion for how to account for that in a procedure similar to the one you propose below?

 

Thanks!

Doctor J
Kurt_Bremser
Super User

Excel files are messy, unstructured heaps of data. Just as proc import, libname excel has to make guesses about the structure, and you will end up with all kinds of WARNINGs because of possibly truncated data.

Imagine that your first Excel file contains strings in column 'D' that have a maximum length of 12.

Now, your third Excel file has a maximum string in 'D' that's 25 bytes long, you'll lose 13 bytes of that, because the first guess used that.

Or the Excel engine goes the safe path and makes all strings 255 bytes long, wasting lots of space for nothing.

 

Therefore you should dump the Excel file format for such operations and switch to something that works. Have the data delivered in csv files, for instance. There you will define the structure in the data step, and it will work. Period.

 

Excel files are NOT, I repeat NOT, suited for such a task. Never ever. They may be nice for a one-shot test, but that'ts it.

kjowers
Fluorite | Level 6

Your first sentence could not be truer! I can't stand using Excel for these purposes. Unfortunately, I work for an agency that doesn't exactly embrace change/efficiency/my sanity. Thank you, though!

Doctor J
Patrick
Opal | Level 21

@kjowers

The following code doesn't circumvent all the issues with Excel sources. What it does:

1. Create a list of all Excel files under a parent folder (files can be in sub-folders)

2. Import first sheet of all these Excels into separate SAS tables in Work

3. Use a SQL OUTER UNION JOIN to combine all the tables into a single table

    - Using a SQL OUTER UNION will first analyse the length of all variables and choose the longest length per variable from all tables.      

   - This will help to avoid truncation issues.

 

%let sourcePath=C:\temp\JCs\JC Analysis 2016-2017\KC Data 2016-2017\Received from JCs;

%macro collectTable(file=,no=);
  proc import 
    file="&file"
    dbms=XLSX
    replace
    out=collectedTable&no
    ;
  run;quit;
%mend;

/* cleanup Work from previous runs */
proc datasets lib=work nolist nowarn;
  delete collectedTable:;
  run;
quit;

/* import first tab from all Excel files found under &sourcePath into WORK */
filename files pipe "dir /b/s/a-d ""&sourcePath\*.xlsx""";
data _null_;
  infile files truncover;
  input thisFile $500.;
  call execute(cats( '%collectTable(file=',thisFile,',no=',_n_,')' ));
run;
filename files;
 

/* consolidate imported data */
%let union_list=;
proc sql noprint;
  /* 1. create macro var with SQL command for UNION Set operation for all tables */
  select catx(' ','select * from',memname) into :union_list separated by ' outer union corr '
  from dictionary.tables
  where libname='WORK' and memname like 'COLLECTEDTABLE%' 
  ;
  
  /* 2. execute generated UNION SET operation */
  create table consolidated as
  &union_list;

quit;

 

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 2224 views
  • 1 like
  • 5 in conversation