"Set" all datasets in a particular library

Here's my code:

 

 

libname i_50401 "c:\0_sas_1\i_50401";

data sas_1.i_50401;
  set i_50401._all_;
run;

 

All the datasets in the library have the same variables, etc.

 

Error from SAS:

 

ERROR: File I_50401._ALL_.DATA does not exist. 

 

Can I not use _all_ in the above way??

 

I thought I could avoid having to list all the files (40,000).

 

Please advise.

 

Thanks,

Nicholas Kormanik

 

 


Accepted Solutions
Solution
‎04-28-2017 04:09 AM
Super User
Posts: 6,928

Re: "Set" all datasets in a particular library

If those datasets all begin with a certain letter, you could use (e.g.)

set i_50401.x:;

Otherwise you could do 27 steps (a-z and _), and finally concatenate all 27 intermediate results into one.

 

Keep in mind that the : wildcard for datasets was added with SAS 9.2, so it won't work with older versions.

 

Although I have to question what logic got you to a point where you have 40,000 identically structured datasets.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers



Regular Contributor
Posts: 212

Re: "Set" all datasets in a particular library

Phew!  Appears to be working.  I forgot about that way of indicating wildcard.

 

Thanks much!

 

Just to clarify, though, in case others want to know, cannot _ALL_ be used in the case at hand??

 

 

Super User
Posts: 6,928

Re: "Set" all datasets in a particular library



No, _all_ does not work in a SET statement. Your situation is a very unusual one, to say the least. Typically you have a small, manageable number of datasets that need concatenating, and they are expected to share a common prefix if the concatenation needs to be automated.

How did you arrive at 40,000 datasets, if I may ask?

Regular Contributor
Posts: 212

Re: "Set" all datasets in a particular library

No doubt I'm going about this particular problem solving in a pretty inefficient way.  Thank goodness for tolerant computer and software.

 

If I were to hire you for a consultation, you certainly would set me on a better path.

 

Included in each file is a "Mean" which I need.  That's one of the variables.  Arrived at through Proc Univariate.

 

I need to find which "Means" are highest, and which are lowest, of all the files.

 

There is no direct way to ask SAS: Among these 40,000 files, which have the largest "Mean"?  Which the smallest?

 

Instead I must concatenate into one dataset.  Then sort.

 

But, that approach is probably inefficient as well.

 

Good grief.

 

Super User
Posts: 6,928

Re: "Set" all datasets in a particular library

I'd look hard at whether the "40,000 files" issue could be avoided by using BY-group processing first, so that you end up with one dataset holding 40,000 results instead of 40,000 datasets holding one result each.
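
A minimal sketch of what that could look like. The names work.all_raw, series_id and x are hypothetical: the thread does not show the raw data, so this assumes the input can be held in one dataset with a variable that identifies each group.

```sas
/* Assumption: work.all_raw holds all raw data, series_id identifies   */
/* each group, and x is the analysis variable. All three names are     */
/* placeholders, not taken from the original question.                 */
proc sort data=work.all_raw;
  by series_id;
run;

/* One PROC UNIVARIATE call; the output has one row per BY group */
proc univariate data=work.all_raw noprint;
  by series_id;
  var x;
  output out=work.means mean=mean;
run;

/* Highest means first; the lowest means end up at the bottom */
proc sort data=work.means;
  by descending mean;
run;
```

With this layout, work.means already is the single table of means, so no concatenation step is needed.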

Regular Contributor
Posts: 212

Re: "Set" all datasets in a particular library

But... if the result is the same???

 

One large concatenated dataset.

 

I'm miffed that SAS does not state exactly how many datasets it concatenated in the run -- i.e., 40,000.

 

Duh.

 

One just has to assume that SAS got 'em all.

 

Sad.

 

 

Super User
Posts: 6,928

Re: "Set" all datasets in a particular library

That's because 40,000 similar datasets in one library is so inefficient that nobody at SAS would even contemplate it, and for good reason.

40,000 members in a single directory is enough to start affecting the performance of operating systems, even heavy-duty ones (think z/OS, AIX or similar).

And you do have information in the log:

data class1 class2 class3;
set sashelp.class;
run;

data final;
set work.c:;
run;

produces this log:

24         data class1 class2 class3;
25         set sashelp.class;
26         run;

NOTE: There were 19 observations read from the data set SASHELP.CLASS.
NOTE: The data set WORK.CLASS1 has 19 observations and 5 variables.
NOTE: The data set WORK.CLASS2 has 19 observations and 5 variables.
NOTE: The data set WORK.CLASS3 has 19 observations and 5 variables.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds
      

27         
28         data final;
29         set work.c:;
30         run;

NOTE: There were 19 observations read from the data set WORK.CLASS1.
NOTE: There were 19 observations read from the data set WORK.CLASS2.
NOTE: There were 19 observations read from the data set WORK.CLASS3.
NOTE: The data set WORK.FINAL has 57 observations and 5 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds

So you have a listing of all contributing datasets in the log. To make sure, save the log to a file, extract the "NOTE: There were ..." lines from it and compare that to a directory listing.
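
As a rough cross-check that does not depend on an operating-system directory listing, the number of datasets in the library can also be counted from the DICTIONARY.TABLES view. This is a sketch that assumes the libref i_50401 from the original question is assigned; librefs are stored in upper case in the dictionary views.

```sas
/* Count the SAS datasets (not views or catalogs) in the library */
proc sql;
  select count(*) as n_members
    from dictionary.tables
    where libname = 'I_50401'
      and memtype = 'DATA';
quit;
```

Comparing n_members with the number of "NOTE: There were ..." lines in the saved log shows whether any member was skipped.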

But you should start at square one and check whether using a BY statement in PROC UNIVARIATE would solve your problem more easily.

Or at least run the procedure repeatedly in a macro (or from a dataset with call execute) and have it append the (temporary) results to a common output dataset.
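
A sketch of the CALL EXECUTE variant, under stated assumptions: the inputs live in a library with the hypothetical libref RAW, the analysis variable is called x, and the macro name one_mean and the table work.all_means are made up for illustration.

```sas
/* Hypothetical macro: summarize one dataset and append the result */
%macro one_mean(ds);
  proc univariate data=&ds noprint;
    var x;                              /* placeholder variable name */
    output out=work._one mean=mean;
  run;

  /* Record which dataset the mean came from */
  data work._one;
    length source $41;
    source = "&ds";
    set work._one;
  run;

  /* PROC APPEND creates work.all_means on the first call */
  proc append base=work.all_means data=work._one;
  run;
%mend one_mean;

/* Generate one macro call per dataset found in the library.          */
/* %nrstr delays macro execution until the generated code is run.     */
data _null_;
  set sashelp.vtable(where=(libname='RAW' and memtype='DATA'));
  call execute(cats('%nrstr(%one_mean)(', libname, '.', memname, ')'));
run;
```

The result is one row per input dataset in work.all_means, which can then be sorted by mean without concatenating the inputs.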

Regular Contributor
Posts: 212

Re: "Set" all datasets in a particular library

The log gives the number of observations, and variables.

 

But not the number of imported files.  Surprisingly.

 

One has to examine the log after the fact, in some text edit program, to see which files may have been excluded by SAS.  Likely an exhausting process.

 

Super User
Super User
Posts: 6,498

Re: "Set" all datasets in a particular library

You can add code to count the datasets, but it will count only those that contribute observations.

data class1 class2 class3(where=(age=0));
  set sashelp.class;
run;

data final;
  length dsn $41 ;
  /* eof is still 1 on the iteration after the last read, so report before SET stops the step */
  if eof then put 'NOTE: ' n_nonzero :comma12. 'datasets contributed observations.';
  set work.class: indsname=dsn end=eof;
  /* add 1 whenever the source dataset name changes */
  n_nonzero + (dsn^=lag(dsn));
run;

This writes the following to the log:

NOTE: 2 datasets contributed observations.
NOTE: There were 19 observations read from the data set WORK.CLASS1.
NOTE: There were 19 observations read from the data set WORK.CLASS2.
NOTE: There were 0 observations read from the data set WORK.CLASS3.
NOTE: The data set WORK.FINAL has 38 observations and 6 variables.
Regular Contributor
Posts: 212

Re: "Set" all datasets in a particular library

Seems super cool.  At least potentially.  Not following exactly, though.

 

Here's my simple program.  Please edit to add your suggestion:

 

data sas_1.i_50501; 
set i_50501.i:; 
run;

Thanks a zillion.
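
One possible way to fold the counting logic from the earlier reply into this program is sketched below. It assumes every dataset of interest in i_50501 starts with the letter i, as in the SET statement above; the DROP statement is an addition that keeps the counter out of the output dataset.

```sas
data sas_1.i_50501;
  length dsn $41;                       /* libref.member name of the current source */
  /* Runs on the iteration after the last observation was read */
  if eof then put 'NOTE: ' n_nonzero :comma12. 'datasets contributed observations.';
  set i_50501.i: indsname=dsn end=eof;
  n_nonzero + (dsn ne lag(dsn));        /* +1 each time the source dataset changes */
  drop n_nonzero;                       /* counter is for the log only */
run;
```

As in the earlier reply, datasets with zero observations are not counted, so the number in the log can be lower than the number of members in the library.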

 

 

Respected Advisor
Posts: 3,777

Re: "Set" all datasets in a particular library


NicholasKormanik wrote:

But... if the result is the same???

 

One large concatenated dataset.

 

I'm miffed that SAS does not state exactly how many datasets it concatenated in the run -- i.e., 40,000.

 

Duh.

 

One just has to assume that SAS got 'em all.

 

Sad.

 

 


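If you are going to concatenate 40,000 data sets I'm pretty sure you will need to use OPEN=DEFER, which opens each data set only when the step starts reading it instead of opening all of them at once. It requires every data set to supply the same variables, hence the KEEP= in the example.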

 

data cl1 cl2 cl3;
   set sashelp.class;
   run;
data cl4 cl5 cl6;
   set sashelp.class;
   keep weight;
   run;
data weight;
   set cl:(keep=weight) open=defer;
   run;
Super User
Super User
Posts: 6,498

Re: "Set" all datasets in a particular library

SAS did not implement the use of _ALL_ for specifying a list of members.

I suspect that is because it could cause confusion for existing programs, since _ALL_ is a valid member name.

data _all_;
 x=1;
run;

Of course, if you did create a member with that name it would cause trouble with PROC CONTENTS, since data=libref._all_ has been valid syntax for PROC CONTENTS for a long time.
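
For comparison, a minimal sketch of the PROC CONTENTS usage mentioned above, using the i_50401 libref from the original question. The NODS option limits the output to the library directory instead of printing the contents of every member.

```sas
/* List the members of the library without per-dataset details */
proc contents data=i_50401._all_ nods;
run;
```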

☑ This topic is SOLVED.
