Hello,
In attempt to tidy up / create space on our database I am wanting to create a list of all our SAS datasets, their size and their location. Ideally I would export this list to Excel so that our team can review the list and state whether they still use the dataset or whether it can be deleted / compressed. It would perhaps therefore be useful to include the 'created date' or 'last updated date' or even the 'last accessed date' or 'last referenced date', if possible.
Is there some SAS code I can use to do this? While trying to work it out, I have come across some proc sql codes (including terms such as memname and dictionary) but I am not too familiar with proc sql, and have become confused and stuck so any help would be appreciated please.
Also, if anyone has had to do something similar, perhaps they have advice on how to go about this. I am a bit nervous about deleting / compressing a SAS dataset that we believe is no longer in use but could still have some hidden / forgotten code linked to it/ using it.
Thanks in anticipation,
Laura
It's already done for you, this is stored in SASHELP.VTABLE. You can view this or print this just as any other SAS data set.
Thanks very much, very helpful!
@leoconnell wrote:
Hello,
In attempt to tidy up / create space on our database I am wanting to create a list of all our SAS datasets, their size and their location.
Please be a bit more accurate: are you talking about sas7bdat-files or tables in a database?
Hi,
Thanks for the reply and sorry that I was not specific enough for you to understand my problem. My query has been resolved in another answer now.
Thanks again,
Laura
@leoconnell - @PaigeMiller 's solution will only work for SAS libraries that are defined in the SAS session you are querying the dictionary table in. What about SAS datasets that are not currently defined in a SAS library but are still stored on disk somewhere?
Ah that is good to know, thank you. Have you a solution to obtaining the information for datasets not currently defined in a SAS library?
Thanks,
Laura
@leoconnell wrote:
Have you a solution to obtaining the information for datasets not currently defined in a SAS library?
You could write a command in SAS that calls to your operating system to find all files named *.sas7bdat. This macro works on Windows.
%macro list_files(dir=,outdsn=files,all_subdirs=N);
data _null_;
if upcase("&all_subdirs")='N' then call symputx('str',cat("'dir ",'"',"%trim(&dir)"" /b'"));
else if upcase("&all_subdirs")='Y' then call symputx('str',cat("'dir ",'"',"%trim(&dir)"" /b /s'"));
run;
filename tmp pipe &str;
data &outdsn;
infile tmp dlm="¬" missover;
input filenames :$200.;
run;
%mend;
/* Example of use: */
/*%list_files(dir=G:\folderpath\myfolder\*.sas7bdat,all_subdirs=Y)*/
@SASKiwi wrote:
@leoconnell - @PaigeMiller 's solution will only work for SAS libraries that are defined in the SAS session you are querying the dictionary table in. What about SAS datasets that are not currently defined in a SAS library but are still stored on disk somewhere?
Good point.
I had to do an analysis once on the volume of sas datasets our team managed and how many of those were compressed. I wrote a program that traversed the directory tree of our team's space on the sas grid. Basically, for each directory a libref was created. Then the corresponding dataset info was extracted from dictionary.tables and appended to a master dataset.
The following macro gets certain information about all of the sas datasets in a particular directory.
%macro get_dsn_info (dir);
libname _lib "&dir";
proc sql;
create table _tmp as
select "&dir" as dir length=200,
memname, modate, filesize, compress, pcompress, nobs, nvar
from dictionary.tables
where libname = '_LIB'
and memtype = 'DATA';
quit;
* append dataset info to master dataset;
proc append base=master data=_tmp;
run;
libname _lib clear;
%mend;
This is the main code that repeatedly calls the macro. I work in a Linux environment, but this could easily be adapted to Windows. in the code i offer a suggested command to get a list of directories.
%let hi_lvl_dir = /some/dir;
proc datasets lib=work nolist nowarn;
delete master;
quit;
* linux - get a list of all directories and subdirectories;
filename cmd pipe "find &hi_lvl_dir -not \( -path &hi_lvl_dir/.snapshot
-prune \) -type d 2>/dev/null";
* windows (?);
*filename cmd pipe "dir &hi_lvl_dir /ad /b /s";
* loop through all directories in directory tree;
data _null_;
infile cmd;
input dir &:$200.;
* repeatedly call the macro for each directory;
cmd = cats('%get_dsn_info(', dir, ');');
call execute (cmd);
run;
Works like a charm.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.