DATA Step, Macro, Functions and more

How do I access multiple datasets with similar names in multiple folders ?

Reply
New Contributor
Posts: 3

How do I access multiple datasets with similar names in multiple folders ?

Hi everyone,

 

I'm using SAS 9.4.

In a folder called 'test' there are multiple folders labeled test_ with the date of the folder (for example test_20160525). Each of these folders contains one sas dataset with the same name as the folder it is in (so dataset 'test_20160525' is in a folder called 'test_20160525'). Is there a way to write a libname statement that would be able to include all of these datasets? Does this require macro?

 

Thanks!

 

 

Super User
Posts: 23,244

Re: How do I access multiple datasets with similar names in multiple folders ?

You can direct a libname to multiple directories, but what are you trying to do with these datasets? Are they SAS data sets or text files?

 

You may not need a macro but again, it depends on what you want to do with each file...

 

New Contributor
Posts: 3

Re: How do I access multiple datasets with similar names in multiple folders ?

They are all SAS datasets with the same variables. I am trying to combine all of these datasets in order to be able to compare a certain variable over time. I have about 50 datasets over a two year period. 

Super User
Super User
Posts: 7,932

Re: How do I access multiple datasets with similar names in multiple folders ?

[ Edited ]

If  you had them all in the same folder it would be easy.  You could make a libref that points to the folder and then use : wildcard in the SET statement.

data want ;
  set mylib.test_: ;
run;

You could even do that with your current structure, but you would need to list all of the folders when defined the libref.

libname mylib 
('/dir/test_20160525'
 '/dir/test_20160625'
....
);

The easiest way to find the list of files is if you are allowed to use operating system commands. Especially on Unix. So you might do something like this to make a dataset with all of the names.

data files;
  infile 'ls -d /dir/test_*/test_*.sas7bdat' pipe truncover ;
  input filename $200.;
run;

If you cannot call operating system commands to get the list of filenames then perhaps you can just generate all possible dates and check if the file exists?

data files ;
  length filename $200;
  do date='01MAY2016'd to today();
    filename=cats('/dir/test_',put(date,yymmddn8.),'/test_',put(date,yymmddn8.),'.sas7bdat);
    if fileexist(filename) then output;
  end;
run;

 

Once you have a dataset with all of the filenames you could just use that directly to generated references to the datasets.  You can use a quoted physical filename to reference a dataset without having to create a libref.  So if the list is short enough you could do something like this to generate a macro variable with all of the names and then use that macro variable in a SET statement.

proc sql noprint;
  select quote(trim(filename)) 
    into :dslist separated by ' '
  from files
  ;
quit;
data want ;
  set &dslist ;
run;

 If the list is too long then generate the code to a file and %INCLUDE it.

filename code temp;
data _null_;
  set files end=eof;
  file code ;
  if _n_=1 then put 'set ';
  put ' ' filename :$quote. ;
  if eof then put ';'
run;
data want ;
  %include code;
run;
New Contributor
Posts: 3

Re: How do I access multiple datasets with similar names in multiple folders ?

I was trying to do it without having to put them all in the same folder. This is very helpful though! Thanks!!

Super User
Posts: 23,244

Re: How do I access multiple datasets with similar names in multiple folders ?

I think you want Tom's second approach.

 

1. Use SAS to navigate the folder and return all file paths with a SAS7BDAT extension

2. Create those paths into a macro variable/list and then use them in a single SET statement to combine them. 

 

 

Contributor
Posts: 54

Re: How do I access multiple datasets with similar names in multiple folders ?

Is there a "period" column in your each dataset ? If there is no, then you must create one from the dataset name, before joining all datasets. If you dont do this
you cannot compare data each other. Because all of them mixes.
If you dont want to understand complex issues or if you do this work for one time, you can create libname statements in excel and then paste it to sas. Then join all and use proc sql drop table statement... Or if there is awant batch work or automatically process i used to write a macro which contains do-loop.and compare and export result.
Super User
Posts: 23,244

Re: How do I access multiple datasets with similar names in multiple folders ?


Yavuz wrote:
Is there a "period" column in your each dataset ? If there is no, then you must create one from the dataset name, before joining all datasets. If you dont do this you cannot compare data each other. Because all of them mixes.

 

 

There's the INDSNAME option that can be used in this situation, assuming SAS 9.3+. It will capture the name of the source file and then you can parse it to get the date from the file name. Here's a useless example of how that works:

 

data example;
set sashelp.class sashelp.cars indsname=source;

file_input = source;

run;

 

@maryam1 you can use the methods I've illustrated in the link below, or Tom's code above which is a bit cleaner than mine for sure. 

 

https://github.com/statgeek/SAS-Tutorials/blob/master/Import_all_files_one_type

 

1. Call the macro with 

%list_files(C:\_localdata\temp, sas7bdat);

This will create a data set called LIST that has the list of file names. 


Then you can either proceed with the option from Tom's earlier post, use CALL EXECUTE to generate your Data Step combine statement or use those to assign a single libname to all folders. 

 

 

Ask a Question
Discussion stats
  • 7 replies
  • 173 views
  • 2 likes
  • 4 in conversation