09-27-2012 12:03 PM
I have a SAS table, say A, with two columns with library names and sas datasets names
corresponding to some selected datasets on my SAS installation.
so, for instance, lib1.tab2 is a valid table name on my installation.
I wish to retrieve metadata for those tables using a proc sql similar to the one below :
SELECT A.LIB, A.TABLE, B.NAME, B.TYPE, B.LENGTH
LEFT JOIN dictionary.columns B
ON B.LIBNAME=strip(upcase(A.LIB)) and B.MEMNAME=strip(upcase(A.TABLE));
It works but it seems quite slow. Also, the log contains many messages such as
NOTE: Data file XXX.DATA is in a format native to another
host or the file encoding does not match the session encoding. Cross Environment Data
with XXX being data files names.
I understand the message since I work with windows SAS and some
data files have been created under AIX.
What I do not understand is that it is displayed for data files that
are not referred to in the table A.
It seems that this query systematically parses every dataset from every libname
before taking into account the condition in the join.
Is ther a way to optimize this query in order to filter the data sets
before reading the associated data ?
09-27-2012 12:08 PM
Use PROC CONTENTS or the CONTENTS statement of DATASETS.
As you can see dictionary.columns wants too gather meta data from "everywhere" and then filter it.
PROC CONTENTS will be much faster.
09-27-2012 04:16 PM
Thank you, I will do that.
It is strange though that when I use the following query with
a table that has been created under AIX
SELECT NAME, TYPE, LENGTH
WHERE LIBNAME="somelib" and MEMNAME="somedata";
I only get the previous message for the selected table.
It is like SAS prefilters the data only if a unique dataset is selected.