Hello,
I have created a dataset that holds the names of all the datasets which are stored in a unix library. the number of datasets in that location will increase daily.
This is the code that I have to create that master dataset:
ods output Members=Members;
proc datasets library=abc memtype=data;
run;
quit;
I also have a sample dataset:
63735723627 |
774648684 |
45648558585 |
85658676876 |
5865867568 |
What I need to do is check the library to see if any of the library datasets has any of the sample dataset values in there. But I'm clueless on how to even start querying the library datasets. How can I query all teh library datasets to look for sample dataset values?
Appreciate everyone's help and time and helping me figure this out.
Thank you.
You want to inspect ALL the dataset in libname ABC to see if any of the datasets have variable ACC with a value you have in your work.sample dataset.
Now if your are lucky enough such that all the datasets in ABC have names beginning with D or T, you can do this:
data mysample;
input acc;
datalines;
63735723627
774648684
45648558585
85658676876
5865867568
run;
data acc_found;
if _n_=1 then do;
declare hash h (dataset:'mysample');
h.definekey('acc');
h.definedone();
end;
set abc.d: abc.t: indsname=indsn;
if h.find()=0;
dsname=indsn;
run;
which will generate a dataset with two variables: ACC and DSNAME. It will have all the found ACC values, and the data set name(s) in which it was found.
Do you at least have a variable name that you expect the value(s) to be in?
Do you know if the values are character or numeric?
Do any of the data sets have the same structure? Perhaps they should be appended so that the search involves fewer sets.
Do you expect this to finish in any given time frame?
Show what you expect the output for this exercise to look like.
And SAS may not be the best tool for this.
@AJ_Brien wrote:
Thank you for your reply.
I do know the name of that variable, it's of num type. All the datasets have the same structure too. They are pretty huge in size and number that's the reason it was adviced not to append them.
Just how big is "huge"? People have quite different standards and one person's "huge" is another's "miniscule".
If the sets are truly largish, such as terabyte sized then it is going to take a lot of time to search regardless.
If there is only one or two variables that you are going to look for them perhaps you might want to investigate indexing the data sets on those variables. That can often improve some sorts of data retrieval.
If the need is to search among arbitrary variables then be prepared to wait. A lot.
And you haven't indicated what you expect the output to look like. What you actually need can seriously effect the approach. Do you need a list of data sets that have any of the values, a list of data sets by value, a count of the number of times each value appears in each set, a list of all data sets with the names of all of the variables that may have any of the values ...
You want to inspect ALL the dataset in libname ABC to see if any of the datasets have variable ACC with a value you have in your work.sample dataset.
Now if your are lucky enough such that all the datasets in ABC have names beginning with D or T, you can do this:
data mysample;
input acc;
datalines;
63735723627
774648684
45648558585
85658676876
5865867568
run;
data acc_found;
if _n_=1 then do;
declare hash h (dataset:'mysample');
h.definekey('acc');
h.definedone();
end;
set abc.d: abc.t: indsname=indsn;
if h.find()=0;
dsname=indsn;
run;
which will generate a dataset with two variables: ACC and DSNAME. It will have all the found ACC values, and the data set name(s) in which it was found.
Hello,
Did you try dictionary tables, you can find lot metadata information. I would query dictionary.tables if I were you.
proc sql;
select * from dictionary.tables
where libname="SASHELP";
quit;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.