qkaiwei
Calcite | Level 5

I created a data set named T with an index on DATE:

data t(index=(date));
  /* one observation per day, starting the day after 01JAN2010 */
  do i=1 to 1000000;
    date=intnx('day','01jan2010'd,i);
    x=1; y=2222222; z=33333;
    output;
  end;
  format date yymmdd10.;
run;

The following two pieces of code are meant to delete observations efficiently using the index, but after submitting them I find that the size of the physical file for table T doesn't change (perhaps this is called a logical delete).

I use these approaches because they neither create a new copy of the data set nor delete the indexes, which saves the time of re-creating them. But if the deletes are only logical, the table will grow bigger and bigger, and the I/O time will increase rapidly.

How to balance?

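/* Option 1: DATA step MODIFY with REMOVE (updates T in place) */
data t;
  modify t;
  if date<='01feb3050'd then remove t;
run;

/* Option 2: PROC SQL DELETE (also updates T in place) */
proc sql;
  delete from t where date<='01feb3050'd;
quit;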

LinusH
Tourmaline | Level 20

Yes, the deletes are logical.

If you wish to save space, use the REUSE= data set option (it is only valid together with COMPRESS=YES, for some reason...?).
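As a rough sketch of that suggestion, the table from the question could be created with both options set, so that space freed by deleted observations becomes available to later inserts. This assumes the options are given when the data set is created, since the REUSE= attribute cannot be changed afterwards. The inserted row is hypothetical and only there to illustrate the idea. SAS may also decline to compress a data set whose observations are very short, so it is worth checking the log and the PROC CONTENTS output.

```sas
/* Sketch: the table from the question, created with COMPRESS=YES and REUSE=YES */
data t(index=(date) compress=yes reuse=yes);
  do i=1 to 1000000;
    date=intnx('day','01jan2010'd,i);
    x=1; y=2222222; z=33333;
    output;
  end;
  format date yymmdd10.;
run;

/* Logical delete, as in the question */
proc sql;
  delete from t where date<='01feb3050'd;
quit;

/* Hypothetical new row: with REUSE=YES it may be written into freed space
   rather than appended at the end of the file */
proc sql;
  insert into t (i, date, x, y, z)
    values (1, '02jan2010'd, 1, 2222222, 33333);
quit;

/* Check the Compressed and Reuse Space attributes of the data set */
proc contents data=t;
run;
```

With REUSE=YES, new observations are no longer guaranteed to land at the end of the data set, so this fits best when physical observation order does not matter.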

Be aware that the table can be fragmented and less efficient for querying.
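One possible way to balance the two concerns, offered here only as a sketch and not as something proposed in the thread, is to keep using in-place deletes day to day and rewrite the table occasionally, once the share of deleted observations gets large. The example assumes T lives in the WORK library. The rewrite does create a new copy and rebuilds the index, which is exactly the cost the in-place deletes avoid, so it only makes sense as an infrequent maintenance step.

```sas
/* How many observations are physically stored (NOBS), still logically
   present (NLOBS), and marked as deleted (DELOBS)? */
proc sql;
  select memname, nobs, nlobs, delobs
    from dictionary.tables
    where libname='WORK' and memname='T';
quit;

/* Rewrite the table: deleted observations are not copied,
   and the index on DATE is rebuilt on the new copy */
data t(index=(date));
  set t;
run;
```

If the table was created with COMPRESS= and REUSE=, those options would need to be repeated on the DATA statement of the rewrite.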

Data never sleeps
