BookmarkSubscribeRSS Feed
ManoharNath
Obsidian | Level 7

Issues is: file sits under multiple folder.

For example parent path is: 'C:\users\manohar\downloads\data' then we have multiple folder with name of date as.

example 01may2023, 02may2023,03may2023, under this folder. I have files named as 

Each folder have 1 file for example for folder 01may2023 file name is Trans 250423.xls

02may2023 file name is trans 260423.xls

03may 2023 folder file name is trans 270423.xls

 

Can some one help me here. how to import and read this file, I have to read from 2018 onwards.

 

4 REPLIES 4
Shmuel
Garnet | Level 18

1) what is the first file name you have to read? and in what folder ist it?

2) Do the folders represent all possible dates since '01jan2018' in sequence?

3) can you show a formula to calculate the excel filename using the folder name as input argument?

4) Having formula, as asked above, enables generate a SAS program to import each file separately, then to keep them as separate sas data sets or concatenate them to one big data set.

 

A_Kh
Lapis Lazuli | Level 10

This question is discussed in the following threads before, please check:
https://communities.sas.com/t5/SAS-Programming/Convert-all-XLS-files-in-directory-to-CSV-using-SAS/t...
https://communities.sas.com/t5/SAS-Programming/Get-names-of-all-excel-files-from-multiple-folders/td...

Here you will use Windows command/path to the main folders(which includes subfolders)/file extension (.xls) to read all files regardless the name and subfolder location of the file. 

Tom
Super User Tom
Super User

First step it to get the list of files into a dataset.  Since you have multiple directories you need something that can handle that.

 

For example you could use this macro:  https://github.com/sasutils/macros/blob/master/dirtree.sas

 

%dirtree(C:\users\manohar\downloads\data,out=files)

 

Which will create a dataset with these variables.

Output dataset structure
--NAME-- Len  Format      Description
FILENAME $256             Name of file in directory
TYPE       $1             File or Directory? (F/D)
SIZE        8 COMMA20.    Filesize in bytes
DATE        4 YYMMDD10.   Date file last modified
TIME        4 TOD8.       Time of day file last modified
DEPTH       3             Tree depth
PATH     $256             Directory name

 

So say you end up with this dataset:

data files;
  infile cards dsd ;
  input filename :$256. path :$256. ;
cards;
trans 260423.xls,C:\users\manohar\downloads\data\02may2023
trans 270423.xls,C:\users\manohar\downloads\data\03may 2023
;

 

You could then try to convert the directory name into an actual date.

It also looks like the XLS filename might also have a date value in DDMMYY style, although perhaps not.

data xls_files;
  set files;
  if upcase(scan(filename,-1,'.')) = 'XLS' ;
  dirdate = input(scan(path,-1,'\'),date11.);
  trandate = input(scan(filename,-2,' .'),ddmmyy.);
  format dirdate trandate yymmdd10.;
run;

So now you have this:

Obs        filename                           path                          dirdate      trandate

 1     trans 260423.xls    C:\users\manohar\downloads\data\02may2023     2023-05-02    2023-04-26
 2     trans 270423.xls    C:\users\manohar\downloads\data\03may 2023    2023-05-03    2023-04-27

Now you can filter by one of those calculated DATE values to find the files that represent the date you want.

And then use that list to generate the code to convert the XLS sheets into SAS datasets.

filename code temp;
data _null_;
  file code;
  set xls_files;
  where trandate >= '01JAN2018'd ;
  put 'proc import file="' path +(-1) '\' filename +(-1) '" dbms=xls'
      ' out=trans_' trandate yymmddn8. ' replace;'
      'run;'
  ;
run;
%include code / source2;

Example:

529 +proc import file="C:\users\manohar\downloads\data\02may2023\trans 260423.xls" dbms=xls out=trans_20230426 replace;
529!+                                                                                                                  run;

530 +proc import file="C:\users\manohar\downloads\data\03may 2023\trans 270423.xls" dbms=xls out=trans_20230427 replace;
530!+                                                                                                                   run;
ballardw
Super User

Be prepared for much heartbreak if you expect to combine any of these data sets after importing into SAS. If these XLS are supposed to have the same content you may be surprised to find that IMPORT creates variables with different type and lengths, and possibly names, as decisions as to the properties of each set are made without regards to content of other files.

 

You may want to investigate creating CSV files from XLS and then using a data step to read each so that the sets all have the same properties.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 576 views
  • 2 likes
  • 5 in conversation