dapenDaniel
Obsidian | Level 7

Hello. I have 100 SAS data sets (all in the same folder), and every one of them contains a variable called group. Is there an efficient way to delete this variable from all of these data sets rather than deleting it one file at a time? Thanks.

ACCEPTED SOLUTION
Ksharp
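Super User

/* Sample data: two data sets that share the variable ID */
data a;
 set sashelp.class;
 id=1;
run;
data b;
 set sashelp.air;
 id=1;
run;

/* Collect variable metadata for every data set in WORK.        */
/* TEMP and WANT are also written to WORK, so delete them       */
/* before rerunning this code.                                  */
proc contents data=work._all_ out=temp varnum noprint;
run;

proc sql noprint;
/* Variables that appear in more than one data set */
select  name into : list separated by ','
 from temp
  group by name
   having count(*)>1;

/* Data sets that contain at least one of those variables */
create table want as
select  distinct memname
 from temp
  group by name
   having count(*)>1;
quit;

/* Generate one ALTER TABLE ... DROP statement per data set */
data _null_;
 set want end=last;
 if _n_=1 then call execute('proc sql; ');
 call execute(cat('alter table ',memname));
 call execute(cat('drop ',"&list",';'));
 if last then call execute('quit;');
run;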
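
The code above drops every variable that occurs in more than one data set in WORK. For the original question, where only the variable group should be removed, the same CALL EXECUTE pattern can be narrowed to one variable and one library. The following is an untested sketch, not part of the accepted answer; the libref MYLIB, its path, and the table name HAS_GROUP are placeholders.

```sas
/* Assumption: MYLIB and the path are placeholders for the real folder */
libname mylib '/path/to/folder';

/* List every data set in MYLIB that contains a variable named GROUP */
proc sql noprint;
  create table has_group as
  select memname
    from dictionary.columns
    where libname = 'MYLIB'
      and memtype = 'DATA'
      and upcase(name) = 'GROUP';
quit;

/* Generate one ALTER TABLE ... DROP statement per data set */
data _null_;
  set has_group end=last;
  if _n_ = 1 then call execute('proc sql;');
  call execute(catx(' ', 'alter table', cats('mylib.', memname), 'drop group;'));
  if last then call execute('quit;');
run;
```

Filtering on DICTIONARY.COLUMNS first means data sets that do not contain group are never touched, so no ALTER TABLE statement fails on a missing column.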




11 REPLIES
PaigeMiller
Diamond | Level 26

You could write a macro to do this across all datasets in a libname (folder).

 

Or you could use CALL EXECUTE to do the same.

 

I suppose that would be "efficient" compared to doing it one-by-one.
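
A macro along these lines might look as follows. This is an untested sketch: the macro name DROP_VAR, its parameters, and the libref MYLIB are hypothetical and would need to be adapted to the real folder.

```sas
/* Hypothetical macro: drop one variable from every data set in a library */
%macro drop_var(lib=, var=);
  %local i n;

  /* Store the names of the data sets that contain the variable in DS1, DS2, ... */
  proc sql noprint;
    select memname into :ds1-
      from dictionary.columns
      where libname = "%upcase(&lib)"
        and memtype = 'DATA'
        and upcase(name) = "%upcase(&var)";
  quit;
  %let n = &sqlobs;   /* number of data sets found */

  /* Issue one ALTER TABLE ... DROP per data set */
  proc sql;
    %do i = 1 %to &n;
      alter table &lib..&&ds&i drop &var;
    %end;
  quit;
%mend drop_var;

/* Assumption: MYLIB is already assigned to the folder with the 100 data sets */
%drop_var(lib=mylib, var=group)
```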

 

I am not at work and don't have SAS available to test any code right now.

 

But let me ask a very basic question: why do you need to delete this variable anyway? It isn't hurting anything if you keep it in there; the data sets are a little larger, but other than that it doesn't matter. It could be that the simplest thing to do is to change your mindset and leave the variable in there.

--
Paige Miller
dapenDaniel
Obsidian | Level 7

Thanks for your advice. I am going to run these data sets on a supercomputer, and space on the supercomputer is limited, so I would like to delete the variable. Moreover, I will give each observation a new number when I run the data sets on the supercomputer.

PaigeMiller
Diamond | Level 26

"Moreover, I will give each observation a new number when I run the data sets on the supercomputer."

 

I'm afraid I don't understand this. It does not fit with the rest of the topic.

--
Paige Miller
Tom
Super User

@dapenDaniel wrote:

Thanks for your advice. I am going to run these data sets on a supercomputer, and space on the supercomputer is limited, so I would like to delete the variable. Moreover, I will give each observation a new number when I run the data sets on the supercomputer.


Just don't pass the variable over. How are you moving the SAS data sets to the supercomputer?

dapenDaniel
Obsidian | Level 7

I can access the supercomputer remotely.

Tom
Super User

@dapenDaniel wrote:

I can access the supercomputer remotely.


I would assume so. They probably keep the room where the computer is very cold.

 

Are you running SAS on the supercomputer?  If not then what? Some type of database? Some other language?

Are you connecting to the supercomputer using a SAS program running on your normal computer?

 

How are you MOVING the data files to the supercomputer? 

If the files are already on the supercomputer, then just tell it to ignore that variable.

 

Are the files actual SAS datasets or some other format, like delimited text files?

 

For example, if your supercomputer is running SAS and your data sets are actual SAS data sets, then use the DROP= data set option.
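
As a small illustration of the DROP= data set option, here is a sketch in which SASHELP.CLASS stands in for one of the 100 data sets and AGE stands in for group; WORK.CLASS_SMALL is a made-up output name.

```sas
/* SASHELP.CLASS stands in for one of the 100 data sets, AGE for GROUP */

/* Exclude the variable while reading; the stored data set is unchanged */
proc print data=sashelp.class(drop=age obs=5);
run;

/* Or leave the variable out of a copy written to another library */
data work.class_small;
  set sashelp.class(drop=age);
run;
```

With this approach nothing has to be deleted up front: each step that reads a data set simply never sees the variable.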

VDD
Ammonite | Level 13

If it is a supercomputer or a server, are you sure that other processes or users are not using the variable you want to drop?

Don't create issues for yourself or others just by dropping things.

 

VDD
Ammonite | Level 13

Read the names of all the data set members in the directory into a file.

Use that file to call each of the data sets in a data step, and include a DROP statement that drops group from the output data set.
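
One way this could be sketched is shown below. It is untested; the libref MYLIB and the table name MEMBERS are placeholders, and the member list is taken from DICTIONARY.COLUMNS rather than from an external file.

```sas
/* Assumption: MYLIB is a placeholder libref for the folder */

/* Member list: data sets in MYLIB that contain GROUP */
proc sql noprint;
  create table members as
  select distinct memname
    from dictionary.columns
    where libname = 'MYLIB'
      and memtype = 'DATA'
      and upcase(name) = 'GROUP';
quit;

/* Generate one data step per member that rewrites it without GROUP */
data _null_;
  set members;
  call execute(catx(' ',
    'data', cats('mylib.', memname, '(drop=group);'),
    'set',  cats('mylib.', memname, ';'),
    'run;'));
run;
```

Each generated step reads every observation and overwrites the data set in place, so it is worth trying on a copy of the folder first.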

PaigeMiller
Diamond | Level 26

@VDD wrote:

Read the names of all the data set members in the directory into a file.

Use that file to call each of the data sets in a data step, and include a DROP statement that drops group from the output data set.


I'd recommend that, instead of using a data step to do this (which causes SAS to read every record and could be slow if the data sets are large), you use PROC DATASETS to delete the variable.

--
Paige Miller
dapenDaniel
Obsidian | Level 7

Thanks. I will look up that code.

