10-27-2017 04:21 PM
I'm using SAS 9.4.
In a folder called 'test' there are multiple folders labeled test_ with the date of the folder (for example test_20160525). Each of these folders contains one sas dataset with the same name as the folder it is in (so dataset 'test_20160525' is in a folder called 'test_20160525'). Is there a way to write a libname statement that would be able to include all of these datasets? Does this require macro?
10-27-2017 05:08 PM
You can direct a libname to multiple directories, but what are you trying to do with these datasets? Are they SAS data sets or text files?
You may not need a macro but again, it depends on what you want to do with each file...
10-27-2017 05:50 PM
They are all SAS datasets with the same variables. I am trying to combine all of these datasets in order to be able to compare a certain variable over time. I have about 50 datasets over a two year period.
10-27-2017 05:19 PM - edited 10-27-2017 05:27 PM
If you had them all in the same folder it would be easy. You could make a libref that points to the folder and then use : wildcard in the SET statement.
data want ; set mylib.test_: ; run;
You could even do that with your current structure, but you would need to list all of the folders when defined the libref.
libname mylib ('/dir/test_20160525' '/dir/test_20160625' .... );
The easiest way to find the list of files is if you are allowed to use operating system commands. Especially on Unix. So you might do something like this to make a dataset with all of the names.
data files; infile 'ls -d /dir/test_*/test_*.sas7bdat' pipe truncover ; input filename $200.; run;
If you cannot call operating system commands to get the list of filenames then perhaps you can just generate all possible dates and check if the file exists?
data files ; length filename $200; do date='01MAY2016'd to today(); filename=cats('/dir/test_',put(date,yymmddn8.),'/test_',put(date,yymmddn8.),'.sas7bdat); if fileexist(filename) then output; end; run;
Once you have a dataset with all of the filenames you could just use that directly to generated references to the datasets. You can use a quoted physical filename to reference a dataset without having to create a libref. So if the list is short enough you could do something like this to generate a macro variable with all of the names and then use that macro variable in a SET statement.
proc sql noprint; select quote(trim(filename)) into :dslist separated by ' ' from files ; quit; data want ; set &dslist ; run;
If the list is too long then generate the code to a file and %INCLUDE it.
filename code temp; data _null_; set files end=eof; file code ; if _n_=1 then put 'set '; put ' ' filename :$quote. ; if eof then put ';' run; data want ; %include code; run;
10-27-2017 06:06 PM
I think you want Tom's second approach.
1. Use SAS to navigate the folder and return all file paths with a SAS7BDAT extension
2. Create those paths into a macro variable/list and then use them in a single SET statement to combine them.
10-27-2017 07:46 PM
10-27-2017 08:35 PM
Is there a "period" column in your each dataset ? If there is no, then you must create one from the dataset name, before joining all datasets. If you dont do this you cannot compare data each other. Because all of them mixes.
There's the INDSNAME option that can be used in this situation, assuming SAS 9.3+. It will capture the name of the source file and then you can parse it to get the date from the file name. Here's a useless example of how that works:
data example; set sashelp.class sashelp.cars indsname=source; file_input = source; run;
@maryam1 you can use the methods I've illustrated in the link below, or Tom's code above which is a bit cleaner than mine for sure.
1. Call the macro with
This will create a data set called LIST that has the list of file names.
Then you can either proceed with the option from Tom's earlier post, use CALL EXECUTE to generate your Data Step combine statement or use those to assign a single libname to all folders.