Extracting the rows and writing them back to the same table (really a SAS data set, I assume) won't help you because in order to make the file smaller it must be recreated using a SAS DATA step, PROC APPEND, etc. This removes the unused space in the file. If you recreate the file using COMPRESS=yes and REUSE=yes then space will be reused. But as Linus points out, it is a good idea to periodically recreate the file to clean the space up.
The referential integrity can be dropped while maintenance is performed on the table. Once the work is done, the referential integrity can be reapplied. Keep in mind, if the data violates the constrains it will cause an error when reapplied.
I have been playing around with some code that shows how to make a file smaller. Hopefully, it will help somebody. I plan to take it and create a communities article on it at some point. In DI Studio you can use the APPEND transform to do a lot of this. I assume there is a way to rename a SAS data set; I don't know which transform will do that.
libname prod "C:\sasdata\prod";
libname stage "C:\sasdata\stage";
/* Create a 1GB SAS data set */
data prod.slackspace;
do i = 1 to 10000000;
x='abcdefghijklmnopqrstuvwxyz0123456789';
y='abcdefghijklmnopqrstuvwxyz0123456789';
z='abcdefghijklmnopqrstuvwxyz0123456789';
output;
end;
run;
/* DELETE all the observations from the 1GB SAS data set to see if it gets smaller */
proc sql;
delete from prod.slackspace;
select count(*) from prod.slackspace;
quit;
/* At this point the select count(*) returns 0 */ /* The file is still 1GB */
/* CREATE TABLE AS to see if the new SAS data set is smaller - it will be 128K */
proc sql;
create table stage.noslack_sql as select * from prod.slackspace;
run;
/* Create new table using PROC APPEND to see if the new SAS data set is smaller */
proc append base=stage.noslack_append data=prod.slackspace;
run;
/* Create new table using a DATA step to see if the new SAS data set is smaller */
data stage.noslack_datastep;
set prod.slackspace;
run;
/* Verify the sizes */
proc datasets lib=prod;
run;
proc datasets lib=stage;
run;
/* For production... */
/* Remove wasted space */
/* Use two libraries pointing to different devices for performance.*/ /* This prevents simultaneous reads and writes to the same device. */ /* Ignore the fact that I am on a laptop;)
proc append base=stage.slackspace_small data=prod.slackspace;
run;
/* move smaller file back into production - give it the original name */
proc datasets lib=prod;
change slackspace=slackspace_old;
copy in=stage out=prod move;
change slackspace_small=slackspace;
run;
Best wishes,
Jeff
... View more