Good Morning,
The goal of our new project is to build a list of all SAS files, and perhaps text files, contained in the main directories and all their subdirectories. Then, from that list (stored in a new dataset), we must obtain the variables of each dataset in the list, including the text file variables, and put those variables into another dataset (a global list). Subsequently, using this global list, we must scan for certain key variables and keep the files containing them in another table (a new SAS dataset).
The final goal is to trace any dataset that contains policy information such as expired_date, expired_dt and so on, and maybe other key variables. After that, the policies that expired more than five years ago are to be deleted from the actual datasets.
I have in mind a mix of Unix commands and SAS scripts.
The first step is really to get the file listing, with each file's path and extension.
From there, I can use a macro function to get the variables in each dataset, eliminate duplicates, and so on.
How do I get the variable list from a text file?
Seems like a homework assignment. We won't do your homework for you, but if you have created some code and it doesn't work, we'd be happy to help.
You could search these forums for posts about how to list all files in a folder, it has been discussed many times. You could search these forums for how to find all SAS datasets in a library, and how to find all variables in those SAS data sets.
Regarding combination of UNIX and SAS, I think the whole thing can be done in SAS.
What is a "text file variable"?
Hello,
See the two pinned topic threads on the Programming board (they are at the top because they are pinned).
You probably also want to consult the dictionary tables!
Paper 070-30.
(From SUGI 30. SUGI 30 conference was held on April 10 - 13, 2005, in Pennsylvania Convention Center, Philadelphia, Pennsylvania)
Exploring DICTIONARY Tables and Views
Kirk Paul Lafler, Software Intelligence Corporation
https://support.sas.com/resources/papers/proceedings/proceedings/sugi30/070-30.pdf
Koen
Note that I have used Roger's trick from the first of those threads, using MODIFY to implement a recursive directory search, in this macro, which is available on GitHub.
https://github.com/sasutils/macros/blob/master/dirtree.sas
You've asked a lot of questions, but for this one:
How to get the variables list from a text file ?
It really depends on the text file. Some text files start with a row that has the variable names, some do not. So I would start by opening some of your text files to see what is in them. If they have variable names in the first row, then you use SAS or whatever script to read the first row of each text file to get the names.
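As a minimal sketch of that last case, the data step below reads only the first row of one delimited text file and writes one observation per column name. The file path, the pipe delimiter, and the output dataset name work.txt_vars are all hypothetical placeholders, and the approach only makes sense if the file really does carry its variable names in row one.

```sas
/* Hypothetical file path and delimiter: adjust both to your data. */
%let txt_file = /folder1/folder2/sample.txt;
%let delim    = |;

data work.txt_vars (keep=source_file position name);
  length source_file $512 name $64 header $32767;
  /* obs=1 stops after the first record, i.e. the assumed header row */
  infile "&txt_file" obs=1 lrecl=32767 truncover;
  input header $char32767.;
  source_file = "&txt_file";
  /* The 'm' modifier keeps empty fields between consecutive delimiters */
  do position = 1 to countw(header, "&delim", 'm');
    name = strip(scan(header, position, "&delim", 'm'));
    output;
  end;
run;
```

Run over every text file in the listing, the results could be appended to the same global variable list as the SAS dataset columns.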
If you're good at Unix commands, it's reasonable to use find or a similar command to get a list of all of the .sas7bdat and .txt files in a directory, recursively. If you pipe that list to a file, you can then read that file and use it as a driver to process each SAS dataset / text file. Assuming your SAS has XCMD enabled, you can use SAS to run the Unix command.
You could do it purely from SAS, but in my experience while the recursive directory search is an interesting exercise, it's not the easiest thing to code up. For a pure-SAS example, see e.g.: https://communities.sas.com/t5/SAS-Programming/listing-all-files-within-a-directory-and-subdirectori...
If you've got a SAS environment that allows for XCMD (OS commands) then creating a file listing is really simple with the Unix find command.
Based on some code I had "lying around", the code below should be close to what you need to create a table with all the *.sas7bdat files under a folder and all its sub-folders. I had to amend the find command from my original version, but the syntax should be at least close to what you need and to what will actually work.
%let root_folder=/folder1/folder2;

data work.sas_files;
  /* Run find through a pipe; -printf writes directory and file name separated by | */
  infile "find &root_folder -type f -name ""*.sas7bdat"" -printf %nrstr('%h|%f\n')" pipe end=done truncover dlm='|';
  do until(done);
    /* One row per file: directory path and file name */
    input path :$512. filename :$41.;
    output;
  end;
run;
Once you've got all the SAS files in a table, you can then use the distinct list of paths to create libnames, either using SAS macro coding or, as I would do, a data step with the LIBNAME function.
For example you could define librefs like f_01 to f_<nn> (one per distinct path).
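A minimal sketch of the data step approach, assuming the work.sas_files table with its path column already exists. The dataset names work.paths and work.librefs are hypothetical, and the numeric suffix is widened to four digits here so that more than 99 directories still yield valid librefs of at most eight characters.

```sas
/* One row per distinct directory found by the find step */
proc sort data=work.sas_files(keep=path) out=work.paths nodupkey;
  by path;
run;

data work.librefs;
  set work.paths;
  length libref $8 msg $256;
  /* f_0001, f_0002, ... (assumes fewer than 10,000 directories) */
  libref = cats('f_', put(_n_, z4.));
  /* LIBNAME function: a return code of 0 means the libref was assigned */
  rc = libname(libref, strip(path));
  if rc ne 0 then msg = sysmsg();
run;
```

Keeping rc and msg in work.librefs makes it easy to review afterwards which directories could not be assigned, for example because of missing read permissions.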
Once you've got all these librefs a simple query of dictionary.tables and/or dictionary.columns provides all the info you're after.
proc sql;
create table column_metadata as
select *
from dictionary.columns
where libname like 'F^_%' escape '^'
;
quit;
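To narrow this down to the datasets that hold the key variables, a query along the following lines could work. It is only a sketch: the two variable names come from the original question, the output table name work.key_datasets is hypothetical, and the list of names would need extending for any other key variables.

```sas
proc sql;
  create table work.key_datasets as
  select distinct libname, memname, name as key_variable
  from dictionary.columns
  where libname like 'F^_%' escape '^'
    and memtype = 'DATA'
    /* upcase() makes the match independent of how the name was cased */
    and upcase(name) in ('EXPIRED_DATE', 'EXPIRED_DT')
  order by libname, memname, key_variable;
quit;
```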
It's a totally different ball game with text files. Sure you could create a list of all the .txt files but then how are you going to analyse them? If you can't be sure that they all follow some defined structure then you're basically dealing with free-text and all the challenges this entails.
If the .txt files don't follow some common pre-defined structure then you need some sort of text analytics to answer your questions.