Issues is: file sits under multiple folder.
For example parent path is: 'C:\users\manohar\downloads\data' then we have multiple folder with name of date as.
example 01may2023, 02may2023,03may2023, under this folder. I have files named as
Each folder have 1 file for example for folder 01may2023 file name is Trans 250423.xls
02may2023 file name is trans 260423.xls
03may 2023 folder file name is trans 270423.xls
Can some one help me here. how to import and read this file, I have to read from 2018 onwards.
1) what is the first file name you have to read? and in what folder ist it?
2) Do the folders represent all possible dates since '01jan2018' in sequence?
3) can you show a formula to calculate the excel filename using the folder name as input argument?
4) Having formula, as asked above, enables generate a SAS program to import each file separately, then to keep them as separate sas data sets or concatenate them to one big data set.
This question is discussed in the following threads before, please check:
https://communities.sas.com/t5/SAS-Programming/Convert-all-XLS-files-in-directory-to-CSV-using-SAS/t...
https://communities.sas.com/t5/SAS-Programming/Get-names-of-all-excel-files-from-multiple-folders/td...
Here you will use Windows command/path to the main folders(which includes subfolders)/file extension (.xls) to read all files regardless the name and subfolder location of the file.
First step it to get the list of files into a dataset. Since you have multiple directories you need something that can handle that.
For example you could use this macro: https://github.com/sasutils/macros/blob/master/dirtree.sas
%dirtree(C:\users\manohar\downloads\data,out=files)
Which will create a dataset with these variables.
Output dataset structure --NAME-- Len Format Description FILENAME $256 Name of file in directory TYPE $1 File or Directory? (F/D) SIZE 8 COMMA20. Filesize in bytes DATE 4 YYMMDD10. Date file last modified TIME 4 TOD8. Time of day file last modified DEPTH 3 Tree depth PATH $256 Directory name
So say you end up with this dataset:
data files;
infile cards dsd ;
input filename :$256. path :$256. ;
cards;
trans 260423.xls,C:\users\manohar\downloads\data\02may2023
trans 270423.xls,C:\users\manohar\downloads\data\03may 2023
;
You could then try to convert the directory name into an actual date.
It also looks like the XLS filename might also have a date value in DDMMYY style, although perhaps not.
data xls_files;
set files;
if upcase(scan(filename,-1,'.')) = 'XLS' ;
dirdate = input(scan(path,-1,'\'),date11.);
trandate = input(scan(filename,-2,' .'),ddmmyy.);
format dirdate trandate yymmdd10.;
run;
So now you have this:
Obs filename path dirdate trandate
1 trans 260423.xls C:\users\manohar\downloads\data\02may2023 2023-05-02 2023-04-26
2 trans 270423.xls C:\users\manohar\downloads\data\03may 2023 2023-05-03 2023-04-27
Now you can filter by one of those calculated DATE values to find the files that represent the date you want.
And then use that list to generate the code to convert the XLS sheets into SAS datasets.
filename code temp;
data _null_;
file code;
set xls_files;
where trandate >= '01JAN2018'd ;
put 'proc import file="' path +(-1) '\' filename +(-1) '" dbms=xls'
' out=trans_' trandate yymmddn8. ' replace;'
'run;'
;
run;
%include code / source2;
Example:
529 +proc import file="C:\users\manohar\downloads\data\02may2023\trans 260423.xls" dbms=xls out=trans_20230426 replace; 529!+ run; 530 +proc import file="C:\users\manohar\downloads\data\03may 2023\trans 270423.xls" dbms=xls out=trans_20230427 replace; 530!+ run;
Be prepared for much heartbreak if you expect to combine any of these data sets after importing into SAS. If these XLS are supposed to have the same content you may be surprised to find that IMPORT creates variables with different type and lengths, and possibly names, as decisions as to the properties of each set are made without regards to content of other files.
You may want to investigate creating CSV files from XLS and then using a data step to read each so that the sets all have the same properties.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.