Help using Base SAS procedures

Proc sql - long duration time of deleting data from table

Reply
Occasional Contributor
Posts: 5

Proc sql - long duration time of deleting data from table

Hi,

 

I have one little problem. Namely I have connection with database (Teradata) through SAS and I am processing sql query on Teradata side. I would like delete data from table, but I have figured out only one solution below:

 

proc sql noprint;

delete from DDHDP01P.HP_KHD4491_PESEL;

quit;

 

The problem is that is completely inefficient and lasts ages. Do you probably know any other (quicker) way to do that.

 

Thank you very much.

Super User
Posts: 5,430

Re: Proc sql - long duration time of deleting data from table

I don't know really what hapends with a SQL like this and implict pass thru.

options sastrace=',,,d' sastraceloc=saslog nostsuffix msglevel=i;

will might give you some hints on what's going on.

By a quick Google search, the Teradata DELETE ALL should be the quickest way to empty a table. If SAS doesn't give you this, try that in an explicit SQL pass thru step.

Data never sleeps
Occasional Contributor
Posts: 5

Re: Proc sql - long duration time of deleting data from table

Yes, of course I know that DELETE TABLE is fastest way to do this kind of think but I am creating the automation process in which I need delete data from table before next prod append.

 

Anyway thank you for your response and I see how helpfull it will be when I would follow your advise.

Super User
Posts: 7,808

Re: Proc sql - long duration time of deleting data from table

[ Edited ]

The quickest way (edit: in SAS!) is to delete the table and recreate it.

Using "delete from" forces ths DBMS to scan the table record by record and delete every single one; no wonder that it lasts ages, I would be hard pressed to find a less efficient way.

I'd try something like

- rename the table

- recreate the original table from the renamed table with obs=0

- delete the renamed table

 

Or, if you have a library defined for the DBMS:

data library.table;
set library.table (obs=0);
run;

if the DBMS supports that kind of operation.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Super User
Posts: 5,430

Re: Proc sql - long duration time of deleting data from table

I wrote "DELETE ALL", not DELETE TABLE, which I don't think exists, you may think of DROP TABLE?

Se this thread that elaborates between the two:

http://stackoverflow.com/questions/27818798/teradata-delete-all-vs-dropcreate

Data never sleeps
Occasional Contributor
Posts: 5

Re: Proc sql - long duration time of deleting data from table

Yes I meant delete all. I wrote it without previous checkicng and that was result of this inaccuracy. 

Super User
Super User
Posts: 7,970

Re: Proc sql - long duration time of deleting data from table

Why are you doing this in SAS.  The most efficient way to do this would be to use the database to upload new data.  Adding a third party in, even with the best coding in the world, will slow down the overall process.

Occasional Contributor
Posts: 5

Re: Proc sql - long duration time of deleting data from table

I have taken over the procces of automation and I am still trying to improve few things. We have changed the tool for automation in my company and that's why I am doing this in SAS. Moreover I am fully aware that SAS  is not most efficient way to uploading data.

Generally everything is being worked well, but the most significant problem turning out beceouse of this thing.

Super User
Posts: 11,343

Re: Proc sql - long duration time of deleting data from table

As a variation of Kurt's suggestion, create a table with the structure you need and no records. You could then overwrite the table with empty one.

Super Contributor
Posts: 441

Re: Proc sql - long duration time of deleting data from table

[ Edited ]

Hi @Hb21Pzn,

 

I think the performance question is still not answered. Why does a "delete from table"  takes so long for an RDBMS? First of all this is not a SAS issue. If you do this in Teradata admin or BTEQ it will take equally long. For starters it is because the RDBMS does journalling.  When a transaction fails it needs to be rolled back. Creating such a rollback log is very expensive and caqn potentially overflow buffers and bring a system down. Secondly an index may need rework for every deleted row. Also very expensive indeed.  Another cause can be the repeated validation of referential integrity (RI) constraints. If another table has a integrity dependency on your table every delete will require a validation of that constraint. These checks can be disabled or postponed but sooner or later they will cost you. Due to al this a "delete from table" is considered bad practice and will lead to a visit and a fatherly speach from your trusty DBA if you do it often (meaning more than once).

 

Oracle has a TRUNCATE function. TD has the ALL keyword: DELETE  FROM TABLE ALL;.  In TD the ALL keyword avoids transaction logging of each row's deletion. It wipes the cyliner headers and marks the tables as empty. It is called a fastpath delete. This means a brute force deletion of rows neglecting indexes and RI constraints. Messy but fast. Maybe just what you need

 

The DI Studio data loader has options for you to choose how to clear data from a table because of all this. Hope you can now  tell why.

 

So that's why dropping and re-creating the table altogether is often your best bet. Or maybe, if you use explicit passthrough, you can code the ALL keyword in your SQL.

 

Hope this helps,

- Jan.

 

* Edited for typos.

Occasional Contributor
Posts: 5

Re: Proc sql - long duration time of deleting data from table

Posted in reply to jklaverstijn
Thank you very much. I will try it and maybe it would that what I am looking for so long Smiley Happy
Ask a Question
Discussion stats
  • 10 replies
  • 338 views
  • 2 likes
  • 6 in conversation