I use following code to delete datasets if its observation less than 30, but how could I change the criterion? For exampe, deleting dataset if stock_price (one variable in all datasets) less than 15.
proc sql noprint; select memname into : dellist separated by ' ' from dictionary.tables where libname= work and nobs<30; quit; proc datasets nolist; delete &dellist; quit;
It is easy for IML. data a; input id stock_price; cards; 1 23 2 35 ; run; data b; input cusid stock_price; cards; 2 214 2 34 4 76 ; run; data c; input cusid stock_price; cards; 2 14 2 34 4 3 ; run; proc iml; x=datasets('work'); dsn='work.'+datasets('work'); do i=1 to nrow(dsn); use (dsn[i]); read all var{stock_price}; close; if any(stock_price < 15) then call delete('work',x[i]); end; quit;
1. The code you show would not work. You need
where libname='WORK'
2. A slightly more compact way would be to use the DROP TABLE statement in PROC SQL:
proc sql noprint;
select distinct memname into :dsnames_with_few_obs separated by ','
from dictionary.tables where libname='WORK' and nobs<30 ;
drop table &dsnames_with_few_obs ;
quit;
3. I suppose some tricky progamming in a macro could do your new task in a single PROC SQL, but I recommend the following, which uses PROC SQL only to idenifiy candidates for later assessmment (I use variable AGE). Then a data step uses a hash object recycled over each candidate. Candidates not meeting the cutoff criterion would leave the hash object empty.. And the where clause and "OBS=1" clause tells the object to stop building as soon as a single observation has a value below the cut off - no need to go further to find the minimum value of the "successful" candidates:.
data class; set sashelp.class; run;
%let var=AGE;
%let cutoff=12;
proc sql noprint;
select distinct memname into :dsnames_with_selected_var separated by ' '
from dictionary.columns where libname='WORK' and upcase(name)="&var";
quit;
data _null_;
if 0 then set &dsnames_with_selected_var;
declare hash h;
do d=1 to &sqlobs;
length dsn $32;
dsn=scan("&dsnames_with_selected_var",d,' ');
h=_new_ hash (dataset:cats(dsn,'(obs=1 keep=&var where=(&var<&cutoff))'));
h.definekey('age');
h.definedone();
if h.num_items > 0 then call execute(cats("proc delete data=",dsn,';run;'));
h.clear();
end;
stop;
run;
.
Modified (new statements are underlined Removed statement has strike-through.
Regards,
Mark
You want delete that table if any one of stock price is less than 15 ? data a; input id stock_price; cards; 1 23 2 35 ; run; data b; input cusid stock_price; cards; 2 14 2 34 4 76 ; run; data c; input cusid stock_price; cards; 2 14 2 34 4 76 ; run; data _null_; set sashelp.vmember(where=(libname='WORK' and memtype='DATA')) end=last; if _n_=1 then call execute('data temp;length dsn dsname $ 60;set '); call execute(cats(libname,'.',memname,'(keep=stock_price)')); if last then call execute(' indsname=dsn; if stock_price lt 15 then do;dsname=dsn;output;end;run;'); run; proc sql noprint; select distinct dsname into : list separated by ',' from temp; drop table &list; quit;
It is easy for IML. data a; input id stock_price; cards; 1 23 2 35 ; run; data b; input cusid stock_price; cards; 2 214 2 34 4 76 ; run; data c; input cusid stock_price; cards; 2 14 2 34 4 3 ; run; proc iml; x=datasets('work'); dsn='work.'+datasets('work'); do i=1 to nrow(dsn); use (dsn[i]); read all var{stock_price}; close; if any(stock_price < 15) then call delete('work',x[i]); end; quit;
Deleting a data set is easy. Defining the problem is harder. What are you really trying to accomplish here?
(a) Delete a SAS data set if the average of STOCK_PRICE is less than 15?
(b) Delete a SAS data set if any value for STOCK_PRICE is less than 15?
(c) Search through an entire folder, and delete all data sets that meet criterion (a) or perhaps criterion (b)?
(d) Search through a bunch of folders and delete all data sets that meet criterion (a) or perhaps criterion (b)?
If you want the program to search through a folder (or through many folders), is it guaranteed that every data set in the folder(s) contains a numeric variable named STOCK_PRICE?
While the answers to these questions are important, the posts so far have at least answered (in general) one key question that you had. No, you cannot use your existing program. The existing program looks at the header information ... metadata about a SAS data set. Your new requirement needs to look within the data set to examine the data values.
Another symptomatic problem which has arisen by having many datasets for the same data. Advice, as always, put your data into one dataset, the problem is vastly simplefied then.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.