Help using Base SAS procedures

Get a list of dataset names in a directory/library

Accepted Solution Solved
Reply
Super Contributor
Posts: 398
Accepted Solution

Get a list of dataset names in a directory/library

This post is related to https://communities.sas.com/t5/SAS-Procedures/Get-a-list-of-dataset-names-in-a-directory-library/td-..., but that post is solved and locked.

 

SAS 9.3

 

Run this:

 

 

proc datasets lib=work kill nowarn nolist;
quit;

/*data one;x=1;run;*/

ods results off;
ods output Members=Members (keep=name);
proc datasets lib=work memtype=data;
run;
quit;
ods output close;
ods results on;

proc print data=members;
run;

 

You'll get this log:

 

29         ods results off;
30         ods output Members=Members (keep=name);
31         proc datasets lib=work memtype=data;
WARNING: No matching members in directory.
32         run;

33         quit;

WARNING: Output 'Members' was not created.  Make sure that the output object name, label, or path is spelled correctly. 
          Also, verify that the appropriate procedure options are used to produce the requested output object.  For 
         example, verify that the NOPRINT option is not used.
NOTE: PROCEDURE DATASETS used (Total process time):
      real time           0.06 seconds
      user cpu time       0.00 seconds
      system cpu time     0.00 seconds
      memory              519.78k
      OS Memory           16676.00k
      Timestamp           07/11/2018 04:58:35 PM
      

34         ods output close;
35         ods results on;
36         
37         proc print data=members;
ERROR: File WORK.MEMBERS.DATA does not exist.
38         run;

If you uncomment the data step, it works.

 

Is there any way to get this to create an empty members dataset when the library is empty, without issuing warnings?

 

Also, as an FYI, it also fails if the nolist option is specified for proc datasets, which I guess makes sense.

 

I know about dictionary.tables but I have a lot of RBDMS libraries allocated and using dictionary.tables is slow.

 

 


Accepted Solutions
Solution
Monday
Super Contributor
Posts: 398

Re: Get a list of dataset names in a directory/library

[ Edited ]
Posted in reply to KurtBremser

Thanks @KurtBremser.

 

I've had a chance to play with this a bit more.

 

First of all, I'm on SAS 9.3 and using SAS EG 6.1.  Also, the code I inherited was using sashelp.vcolumn, and was what I was using when I first posted.  Finally, the purpose of this is error checking code to see if an RBDMS (SQL Server) library has tables that already exist in the library before SAS tries to recreate them (which will cause an error).

 

Here is some test code:

 

* set desired SAS options ;
options sastrace=',,,d' sastraceloc=saslog nostsuffix;
options dbidirectexec;
options msglevel=I;
options fullstimer;
options mprint mrecall;
options compress=binary;
options ls=max;
options nocenter;
options lognumberformat=1;

* NOTE I have a number of RBDMS (SQL Server) libraries assigned ;
* some via direct ODBC libnames and some via metadata server ODBC libraries ;
* one has particularly slow connectivity ;

%libname_sqlsvr(libref=TMP,server=MYSERVER,database=MYDB,schema=tmp)

* a bunch of ODBC libraries allocated via the metadata engine ;
libname foo meta library="foo";
libname bar meta library="bar";

* empty dataset ;
data empty;
stop;
run;

* empty library ;
libname empty '.';

* columns ;
* sashelp view ;
%bench(start)
data cols1;
   set sashelp.vcolumn;
   where libname="SASHELP" and memname="DEMOGRAPHICS";
run;
%bench(end)

* dictionary table ;
%bench(start)
proc sql;
   create table cols2 as
   select * from dictionary.columns
   where libname="SASHELP" and memname="DEMOGRAPHICS";
quit;
%bench(end)

* proc contents (not empty) ;
%bench(start)
proc contents data=sashelp.demographics out=cols3 noprint;
run;
%bench(end)

* proc contents (empty) ;
%bench(start)
proc contents data=work.empty out=cols4 noprint;
run;
%bench(end)

* tables ;
* sashelp view ;
%bench(start)
data tables1;
   set sashelp.vtable;
   where libname="SASHELP" and memtype="DATA";
run;
%bench(end)

* dictionary table ;
%bench(start)
proc sql;
   create table tables2 as
   select * from dictionary.tables
   where libname="SASHELP" and memtype="DATA";
quit;
%bench(end)

* proc datasets (not empty) (wrong results);
%bench(start)
ods results off;
ods output Members=tables3;
proc datasets lib=sashelp memtype=data;
run;
quit;
ods output close;
ods results on;
%bench(end)

* proc datasets (not empty) (correct results);
%bench(start)
ods results off;
ods output Members=tables4 (where=(memtype="DATA"));
proc datasets lib=sashelp memtype=data;
run;
quit;
ods output close;
ods results on;
%bench(end)

* proc datasets (empty) (barfs);
%bench(start)
ods results off;
ods output Members=tables5;
proc datasets lib=empty;
run;
quit;
ods output close;
ods results on;
%bench(end)

 

 

The macros are located in https://github.com/scottbass/SAS/tree/master/Macro.

 

And my results:

 

For columns:

 

1) Using sashelp.vcolumn is horribly slow, and generates thousands of lines in the SAS log such as:

 

ODBC: AUTOCOMMIT is NO for connection 11
 
ODBC_3075: Prepared: on connection 11
SELECT * FROM "dbo"."DEMOGRAPHIC"
 
ODBC: AUTOCOMMIT is NO for connection 11
 
ODBC_3077: Prepared: on connection 11
SELECT * FROM "dbo"."MASTER_HISTORY"
 
ODBC: ROLLBACK performed on connection 11.
ODBC: AUTOCOMMIT is NO for connection 11
 
ODBC_3078: Prepared: on connection 11
SELECT * FROM "dbo"."OPERATIONAL"

before finally failing because one of the metadata libraries is out of sync with the RBDMS schema:

 

ERROR: Incorrect length in SAS Metadata Repository for column recid.
ERROR: SQL View SASHELP.VCOLUMN could not be processed.
NOTE: The DATA step has been abnormally terminated.
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.COLS1 may be incomplete.  When this step was stopped there were 0 observations and 18 variables.

 

2) I got correct results and good performance for dictionary.columns and proc contents (non-empty and empty).  The best results (by a tiny margin) is proc contents.

 

For tables:

 

3) Using sashelp.vtable is even more horribly slow, and generates even more thousands of lines (8000+) in the SAS log such as those above, but at least didn't fail, and generated 128 SAS dataset names from SASHELP.

 

4) Use dictionary.tables was very fast, did not pollute the SAS log, did not fail, and also generates 128 SAS dataset names from SASHELP.

 

5) Using proc datasets + ODS:

 

tables3 is not the correct results, as it returns 158 records, and includes both datasets and indexes, even though memtype=data was specified for proc datasets.

 

tables4 is the correct results, but requires an additional output dataset where statement.

 

tables5 generates a warning, and does not create the correct results, in that it does not create an output dataset.

 

My conclusions (feel free to disagree, but these are my conclusions):

 

1) The sashelp.vtable and sashelp.vcolumn views are buggy, or at least undesireable, in that they are slow, pollute the SAS log when sastrace and/or msglevel=I is on, and fail if there is a problem with the metadata when the meta engine is used.

 

2) My preference will be to use proc sql + dictionary.tables and dictionary.columns.

 

3) For column metadata, proc contents is a good alternative.

 

4) The proc datasets + ods combination is klunky and prone to error.  As far as I'm concerned, it's a kludge to overcome a shortcoming in the product.  It also returns much less useful metadata than the dictionary tables.

 

5) Finally, I would think after 40+ years, someone in R&D, QA, or Product Management would compare the functionality of PROC DATASETS vs. PROC CONTENTS and conclude:

 

A) Hmmm, there's some symmetry here between these two procedures...

B) PROC CONTENTS allows the end user to specify a SAS dataset (or view), allows the user to specify an output dataset (or libref._all_), and that output dataset contains metadata about the SAS dataset (specifically column metadata).

C) PROC DATASETS allows the end user to specify a SAS library, does NOT allow the end user to specify an output dataset (at least for member names), and using ODS output is klunky, hard to remember, not symmetrical with PROC CONTENTS, and often doesn't generate the correct results.

D) Hey, let's make these two procedures as consistent as possible.  And while we're at it, let's make NOPRINT an alias for NOLIST, to make it consistent with PROC CONTENTS, instead of having two different options that do the same thing.

 

But that's just me.

 

But at least I have a workaround for the performance problems with sashelp.vcolumn and sashelp.vtable by using the dictionary tables instead.

View solution in original post


All Replies
Super User
Posts: 10,278

Re: Get a list of dataset names in a directory/library

Posted in reply to ScottBass

Try this, without the ODS detour:

proc datasets library=work kill noprint;
quit;

proc datasets library=work noprint;
contents data=_all_ memtype=data out=members (keep=memname rename=(memname=name));
run;
quit;

proc sort data=members nodupkey;
by name;
run;

The sort nodupkey is needed once there are datasets present, as the contents statement delivers one observation per variable in the dataset.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
Solution
Monday
Super Contributor
Posts: 398

Re: Get a list of dataset names in a directory/library

[ Edited ]
Posted in reply to KurtBremser

Thanks @KurtBremser.

 

I've had a chance to play with this a bit more.

 

First of all, I'm on SAS 9.3 and using SAS EG 6.1.  Also, the code I inherited was using sashelp.vcolumn, and was what I was using when I first posted.  Finally, the purpose of this is error checking code to see if an RBDMS (SQL Server) library has tables that already exist in the library before SAS tries to recreate them (which will cause an error).

 

Here is some test code:

 

* set desired SAS options ;
options sastrace=',,,d' sastraceloc=saslog nostsuffix;
options dbidirectexec;
options msglevel=I;
options fullstimer;
options mprint mrecall;
options compress=binary;
options ls=max;
options nocenter;
options lognumberformat=1;

* NOTE I have a number of RBDMS (SQL Server) libraries assigned ;
* some via direct ODBC libnames and some via metadata server ODBC libraries ;
* one has particularly slow connectivity ;

%libname_sqlsvr(libref=TMP,server=MYSERVER,database=MYDB,schema=tmp)

* a bunch of ODBC libraries allocated via the metadata engine ;
libname foo meta library="foo";
libname bar meta library="bar";

* empty dataset ;
data empty;
stop;
run;

* empty library ;
libname empty '.';

* columns ;
* sashelp view ;
%bench(start)
data cols1;
   set sashelp.vcolumn;
   where libname="SASHELP" and memname="DEMOGRAPHICS";
run;
%bench(end)

* dictionary table ;
%bench(start)
proc sql;
   create table cols2 as
   select * from dictionary.columns
   where libname="SASHELP" and memname="DEMOGRAPHICS";
quit;
%bench(end)

* proc contents (not empty) ;
%bench(start)
proc contents data=sashelp.demographics out=cols3 noprint;
run;
%bench(end)

* proc contents (empty) ;
%bench(start)
proc contents data=work.empty out=cols4 noprint;
run;
%bench(end)

* tables ;
* sashelp view ;
%bench(start)
data tables1;
   set sashelp.vtable;
   where libname="SASHELP" and memtype="DATA";
run;
%bench(end)

* dictionary table ;
%bench(start)
proc sql;
   create table tables2 as
   select * from dictionary.tables
   where libname="SASHELP" and memtype="DATA";
quit;
%bench(end)

* proc datasets (not empty) (wrong results);
%bench(start)
ods results off;
ods output Members=tables3;
proc datasets lib=sashelp memtype=data;
run;
quit;
ods output close;
ods results on;
%bench(end)

* proc datasets (not empty) (correct results);
%bench(start)
ods results off;
ods output Members=tables4 (where=(memtype="DATA"));
proc datasets lib=sashelp memtype=data;
run;
quit;
ods output close;
ods results on;
%bench(end)

* proc datasets (empty) (barfs);
%bench(start)
ods results off;
ods output Members=tables5;
proc datasets lib=empty;
run;
quit;
ods output close;
ods results on;
%bench(end)

 

 

The macros are located in https://github.com/scottbass/SAS/tree/master/Macro.

 

And my results:

 

For columns:

 

1) Using sashelp.vcolumn is horribly slow, and generates thousands of lines in the SAS log such as:

 

ODBC: AUTOCOMMIT is NO for connection 11
 
ODBC_3075: Prepared: on connection 11
SELECT * FROM "dbo"."DEMOGRAPHIC"
 
ODBC: AUTOCOMMIT is NO for connection 11
 
ODBC_3077: Prepared: on connection 11
SELECT * FROM "dbo"."MASTER_HISTORY"
 
ODBC: ROLLBACK performed on connection 11.
ODBC: AUTOCOMMIT is NO for connection 11
 
ODBC_3078: Prepared: on connection 11
SELECT * FROM "dbo"."OPERATIONAL"

before finally failing because one of the metadata libraries is out of sync with the RBDMS schema:

 

ERROR: Incorrect length in SAS Metadata Repository for column recid.
ERROR: SQL View SASHELP.VCOLUMN could not be processed.
NOTE: The DATA step has been abnormally terminated.
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.COLS1 may be incomplete.  When this step was stopped there were 0 observations and 18 variables.

 

2) I got correct results and good performance for dictionary.columns and proc contents (non-empty and empty).  The best results (by a tiny margin) is proc contents.

 

For tables:

 

3) Using sashelp.vtable is even more horribly slow, and generates even more thousands of lines (8000+) in the SAS log such as those above, but at least didn't fail, and generated 128 SAS dataset names from SASHELP.

 

4) Use dictionary.tables was very fast, did not pollute the SAS log, did not fail, and also generates 128 SAS dataset names from SASHELP.

 

5) Using proc datasets + ODS:

 

tables3 is not the correct results, as it returns 158 records, and includes both datasets and indexes, even though memtype=data was specified for proc datasets.

 

tables4 is the correct results, but requires an additional output dataset where statement.

 

tables5 generates a warning, and does not create the correct results, in that it does not create an output dataset.

 

My conclusions (feel free to disagree, but these are my conclusions):

 

1) The sashelp.vtable and sashelp.vcolumn views are buggy, or at least undesireable, in that they are slow, pollute the SAS log when sastrace and/or msglevel=I is on, and fail if there is a problem with the metadata when the meta engine is used.

 

2) My preference will be to use proc sql + dictionary.tables and dictionary.columns.

 

3) For column metadata, proc contents is a good alternative.

 

4) The proc datasets + ods combination is klunky and prone to error.  As far as I'm concerned, it's a kludge to overcome a shortcoming in the product.  It also returns much less useful metadata than the dictionary tables.

 

5) Finally, I would think after 40+ years, someone in R&D, QA, or Product Management would compare the functionality of PROC DATASETS vs. PROC CONTENTS and conclude:

 

A) Hmmm, there's some symmetry here between these two procedures...

B) PROC CONTENTS allows the end user to specify a SAS dataset (or view), allows the user to specify an output dataset (or libref._all_), and that output dataset contains metadata about the SAS dataset (specifically column metadata).

C) PROC DATASETS allows the end user to specify a SAS library, does NOT allow the end user to specify an output dataset (at least for member names), and using ODS output is klunky, hard to remember, not symmetrical with PROC CONTENTS, and often doesn't generate the correct results.

D) Hey, let's make these two procedures as consistent as possible.  And while we're at it, let's make NOPRINT an alias for NOLIST, to make it consistent with PROC CONTENTS, instead of having two different options that do the same thing.

 

But that's just me.

 

But at least I have a workaround for the performance problems with sashelp.vcolumn and sashelp.vtable by using the dictionary tables instead.

Super User
Posts: 10,278

Re: Get a list of dataset names in a directory/library

Posted in reply to ScottBass

sashelp.vtable is a SQL view of dictionary tables. I have observed that in some situations, that view does not hand over the where condition to the SQL optimizer, and so the whole dictionary table is read to feed the view. That is why I always prefer to use the dictionary table in SQL.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 3 replies
  • 99 views
  • 2 likes
  • 2 in conversation