BookmarkSubscribeRSS Feed
juanvg1972
Pyrite | Level 9

Hi,

I am trying to make a delete in the most quickest way.

I have a delete with a simple where in which I delete 50% of the table.

I have testes two ways:

1)

data ventas_tot;

  set ventas_tot;

  if base = 'N' then delete;

  run;

2)

proc sql;

delete from ventas_tot where base = 'N';

quit;

The 1) way is the faster, but os there any other way yo make a delete as fast as possible.

I don't hace any index in the table

Thanks

5 REPLIES 5
Patrick
Opal | Level 21

With SAS tables your option 1) doesn't actually delete the table but it re-creates it with records where your "delete" condition isn't met. You could also write this like:

data ventas_tot;

     set ventas_tot(where=(base ne 'N'));

run;

Your option 2) is an actual delete. The catch here: It doesn't really physically delete the records but only marks them as deleted (logical delete). This process is not only slower it also doesn't reduce the physical storage space of your table. So option 1) is "better".

ods listing;

data test;

  set sashelp.class;

run;

proc sql;

  delete from test where name='Alfred';

quit;

proc contents data=test;

run;quit;

                            Data Set Name        WORK.TEST                     Observations          18

                            Member Type          DATA                          Variables             5

                            Engine               V9                            Indexes               0

                            Created                                            Observation Length    40

                            Last Modified                                      Deleted Observations  1

RW9
Diamond | Level 26 RW9
Diamond | Level 26

A brief few tests on SAS 9.3 shows your first way seems to be the quickest, test code below.  TBH I was quite suprised, I would have thought the SQL would have been quicker, or even the where clause on the read in, but no, slower.

data test1 test2 test3;

  do i="a","b","c","d","e";

    do j=1 to 1000000;

      output;

    end;

  end;

run;

data test1;

  set test1;

  if i="b" then delete;

run;

data test2;

  set test2 (where=(i ne "b"));

run;

proc sql;

  delete from test3 where i="b";

quit;

Ksharp
Super User

Tye this one, which use randomly access way .

Code: Program

data test;
  set sashelp.class;
run;
data test;
modify test;
if sex='F' then remove;
run;

LinusH
Tourmaline | Level 20

Works, but probably not very efficient when you have 50% deletion rate.

Also, I'm not surprised that PROC SQL DELETE is slower than a table scan/rewrite. I don'r know in details what DELETE does but I got a feeling it updates in place, at least logically, to set the delete flag. A rewrite using WHERE does not have to consider marking records for deletion, AND does not even have to read observations to be deleted into the PDV, and off course not to write them back.

Data never sleeps
Kurt_Bremser
Super User

Your #1 is already the best solution. Depending on the layout and size of your dataset, compressing it may help (trade CPU cycles for reduced I/O).

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 1363 views
  • 0 likes
  • 6 in conversation