DATA Step, Macro, Functions and more

Loop through thousands of SAS datasets for metadata

Accepted Solution Solved
Reply
Contributor
Posts: 55
Accepted Solution

Loop through thousands of SAS datasets for metadata

Hi,
I'm in a situation that I've thousands of SAS datasets scattered all over both the LAN and the SAS Server.
These datasets are not necessarily in SAS libraries.
Is there a way to get their metadata in a loop ?
In a huge table I need information per dataset such as variable, type, length, varnum, format
Thanks !
Regards
B.


Accepted Solutions
Solution
‎02-01-2018 04:26 AM
Super User
Posts: 9,903

Re: Loop through thousands of SAS datasets for metadata

Posted in reply to Billybob73

On UNIX, I'd run

find / -name \*.sas7bdat > /tmp/dataset.list

From the list file, read the physical filenames. Use string functions to extract the pathname and the dataset name.

From dictionary.libnames (SQL) get the physical path names and logical libnames.

Merge those datasets by physical path.

Create a macro that retrieves the information you want (for a single dataset):

%macro get_meta(libname,memname,physical);
%if "&physical" ne ""
%then %do;
  libname &libname "&physical";
%end;

proc contents
  data=&libname..&memname
  out=temp
  noprint
;
run;

proc append data=temp base=target.mymetadata;
run;

%if "&physical" ne ""
%then %do;
  libname &libname clear;
%end;

%mend;

Run that macro off your dataset of datasets with call execute:

data _null_;
set mydatasets;
/* insert a where condition to exclude predefined libraries like WORK, SASUSER, SASHELP etc */
if libname = ""
then call execute('%nrstr(%get_meta)(__TEMP,' !! trim(memname) !! ',' !! trim(physical_path) !! ');');
else call execute('%nrstr(%get_meta)(' !! trim(libname) !! ',' !! trim(memname) !! ');');
run;

Be aware that you need to have the necessary permissions; it's best to run the find as superuser, so you get no error messages for inaccessible directories that are not of interest anyway (like /etc/security, where you will not find any SAS datasets). The same goes for all directories where datasets are found.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code

View solution in original post


All Replies
Super User
Posts: 23,296

Re: Loop through thousands of SAS datasets for metadata

Posted in reply to Billybob73

You can assign a library, then you can grab the data from SASHELP.VCOLUMN. 

 

I would probably first check my library locations, scan my LAN for SAS7BDAT files using an OS approach (or see SAS Macro Appendix for a macro that does this). Check if any folders don't match your libraries from (SASHELP.VLIBNAM) and assign those libraries. Once that's done you can pull all your data from SASHELP.VCOLUMN. 

Super User
Posts: 13,321

Re: Loop through thousands of SAS datasets for metadata

Posted in reply to Billybob73

For added fun are any of these sets in older SAS formats such as SSD or SD2 extensions?

Solution
‎02-01-2018 04:26 AM
Super User
Posts: 9,903

Re: Loop through thousands of SAS datasets for metadata

Posted in reply to Billybob73

On UNIX, I'd run

find / -name \*.sas7bdat > /tmp/dataset.list

From the list file, read the physical filenames. Use string functions to extract the pathname and the dataset name.

From dictionary.libnames (SQL) get the physical path names and logical libnames.

Merge those datasets by physical path.

Create a macro that retrieves the information you want (for a single dataset):

%macro get_meta(libname,memname,physical);
%if "&physical" ne ""
%then %do;
  libname &libname "&physical";
%end;

proc contents
  data=&libname..&memname
  out=temp
  noprint
;
run;

proc append data=temp base=target.mymetadata;
run;

%if "&physical" ne ""
%then %do;
  libname &libname clear;
%end;

%mend;

Run that macro off your dataset of datasets with call execute:

data _null_;
set mydatasets;
/* insert a where condition to exclude predefined libraries like WORK, SASUSER, SASHELP etc */
if libname = ""
then call execute('%nrstr(%get_meta)(__TEMP,' !! trim(memname) !! ',' !! trim(physical_path) !! ');');
else call execute('%nrstr(%get_meta)(' !! trim(libname) !! ',' !! trim(memname) !! ');');
run;

Be aware that you need to have the necessary permissions; it's best to run the find as superuser, so you get no error messages for inaccessible directories that are not of interest anyway (like /etc/security, where you will not find any SAS datasets). The same goes for all directories where datasets are found.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
Contributor
Posts: 55

Re: Loop through thousands of SAS datasets for metadata

Posted in reply to KurtBremser

Hi Kurt,

This is the way to do it ! Thanks !!

Rgds

B

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 4 replies
  • 111 views
  • 3 likes
  • 4 in conversation