PROC SQL - is overwriting generation datasets totally safe (so that nowarnrecurs can be used)?

Reply
New Contributor
Posts: 3

PROC SQL - is overwriting generation datasets totally safe (so that nowarnrecurs can be used)?

I have read here and there (can't find a reference right now, sorry!) that in PROC SQL, when overwriting a dataset that has a generation group, we could safely use the nowarnrecurs option. I don't see any such indication in SAS documentation though, so I'm hesitant.

 

Here's a (bogus) example to illustrate:

 

proc sql nowarnrecurs;
/* initiate a generation group */
create table work.retail(genmax=5) as
select *
from sashelp.retail;

/* overwrite the table */
create table work.retail as
select *
from work.retail
where year > 1989;
quit;

By doing that, the usual WARNING...

    WARNING: This CREATE TABLE statement recursively references the target table. A consequence of this is a possible data integrity problem.

 

Becomes a simple NOTE...

   NOTE: This CREATE TABLE statement recursively references the target table. A consequence of this is a possible data integrity problem.

 

Now my question is: is it absolutely safe to do that, or could there be a risk of data integrity problems even when the dataset has a generation group?

 

Version: SAS Server 9.3

 

 

Respected Advisor
Posts: 4,820

Re: PROC SQL - is overwriting generation datasets totally safe (so that nowarnrecurs can be used)?

I don't know the answer, but I do like the question!

PG
Super User
Super User
Posts: 7,720

Re: PROC SQL - is overwriting generation datasets totally safe (so that nowarnrecurs can be used)?

Hi,

 

I had not heard of that option in proc sql before.  Whenever I need to overwrite the same dataset i use:

proc sql undopolicy=none;
...

It should be fine within SAS proc SQL, not sure about the interactions when pass-through is used and how the DB handles it.  You could of course just do it in Base SAS and avoid the whole issue.

Trusted Advisor
Posts: 1,116

Re: PROC SQL - is overwriting generation datasets totally safe (so that nowarnrecurs can be used)?

Hi @dominic,

 

In section "Generation Data Sets" of PROC SQL by Example: Using SQL within SAS by Howard Schreier (2008), it says: "Generation data sets ... do not work particularly well with SQL" (p.199) and on p. 203, after experiencing the warning about a recursive table reference: "... with generation data sets in effect, the two references ... actually pertain to two different tables, so the warning seems particularly unfounded. However, you might prefer to avoid such usage ..." (to avoid the log message). The book doesn't mention the (NO)WARNRECURS option because it's based on SAS 9.2 and the option was only introduced in SAS 9.3.

 

Personally, I hardly ever use generation datasets (for no particular reason), so I can't give you a definitive answer. In Windows Explorer it seems that the previous version of the dataset is just renamed, while its last modification date and in fact its binary content remain unchanged. Nevertheless, the update doesn't work with the read-only attribute set. (These are just my observations based on a single ad-hoc example I created using your code!)

 

PROC Star
Posts: 1,673

Re: PROC SQL - is overwriting generation datasets totally safe (so that nowarnrecurs can be used)?

[ Edited ]

The real question is : what on earth does " A consequence of this is a possible data integrity problem" mean in a SAS data storage context? Saying it's better not to use this syntax without further explanations is meaningless.

 

That's a question for tech support, but many people here are keen to know the answer.

 

 

New Contributor
Posts: 3

Re: PROC SQL - is overwriting generation datasets totally safe (so that nowarnrecurs can be used)?

Good point. I will get in touch with tech support and update this thread when things get clearer.
Super User
Super User
Posts: 7,720

Re: PROC SQL - is overwriting generation datasets totally safe (so that nowarnrecurs can be used)?

This has been around since the dawn of time.  Basically databases have to be ACID compliant, that means if there is any problem with an SQL table operation then it can roll back to the previous state, thus say a insert which corrupts the whole table can be avoided as it rolls back.  ANSI SQL has to support this, but SAS doesn't have the rollback function.  The warning is there to tell you, if you take data from a table whilst writing to the same table you could corrupt your data and you cannot roll the operation back.  

To be honest, there is a couple of options to remove this warning as given above, and there is no real reason in the first place why you would a) not be able to create a new dataset hence avoiding the problem, b) use Base SAS in the first place, again the problem goes away.  It is the choices by the OP which have lead to getting that warning.

PROC Star
Posts: 1,673

Re: PROC SQL - is overwriting generation datasets totally safe (so that nowarnrecurs can be used)?

I undertand this, but we are recreating the table here, so this warning seems unjustified and out of place.

New Contributor
Posts: 3

Re: PROC SQL - is overwriting generation datasets totally safe (so that nowarnrecurs can be used)?

[ Edited ]

Ok so here's the answer I got from SAS Technical Support. (My highlights)

------------------------------------------------------------------------------------------------------------------------------------

Dominic,

 

The query at the bottom of the SAS Note you linked to is a query that was found by development to produce data integrity issues.  However, consistently reproducing this behavior is not possible.

 

   (here's the problematic code from Note 12062)

 

proc sql undo_policy=none;
  delete from one
  where id in (select id from one);
quit;

 

A query such as the one below, will not, to our knowledge, produce any data integrity issues:

 

proc sql;
 create table x as
 select * from x;
quit;

 

The PROC SQL developers continue to work to more closely align with the ANSI standards.  As mentioned in the Usage Note, the warning was added to notify users of possible data integrity issues as a result of using recursive table references which are only supported for SQL that is fully ANSI compliant.  Due to SAS not being fully compliant with the SQL ANSI standards, in queries which contain recursive table references, if the query were to fail due to a system error outside of PROC SQL, we cannot guarantee the integrity of the input table. Therefore, we are required to produce the Warning based on the query itself.

 

 

Thanks, 

(Tech name)

SAS Technical Support

Cary, NC

------------------------------------------------------------------------------------------------------------------------------------

 

So basically, what I get out of this is (there's some amount of deduction here so let me know if your interpretations differ):

 

    1. Having a generation dataset does not have any impact on the relevance of the warning -

        PROC SQL will not use the backups and so corrupt output tables would need to be rolled

        back manually => proc delete data=mydata(gennum=0);

 

    2. Overwriting by recursion does not pose any problems as long as

        i. you stay in PROC SQL (which I assume means: not connecting to a remote DBMS via 

           passthru and so on.) 

        ii. nothing is otherwise preventing your query to execute (power outage or computer crash

            maybe?).

 

    3. Updating / Inserting or removing data in a recursive manner might however

        fail, some of the time.

 

Hopefully this clarifies the issue at least to a certain extent.

PROC Star
Posts: 1,673

Re: PROC SQL - is overwriting generation datasets totally safe (so that nowarnrecurs can be used)?

The warning makes sense in the case of the documented Note .

 

SQL code performing table changes (delete, update, etc clauses) should trigger this warning when the table references itself for change conditions.

 

This warning seems unwarranted for a create clause however. The source table will not change until the output table is ready, and no confusion can occur.

 

 if the query were to fail due to a system error outside of PROC SQL, we cannot guarantee the integrity of the input table. Therefore, we are required to produce the Warning based on the query itself. Yes you can. The input table is not modified in any way when a create clause is used. The query can fail all it wants without any integrity issue.

 

There are no recursive table references when a new table is created, even if the table will bear the same name after all is complete.

 

The warning is erroneously displayed in the case of create statements and there is no valid justification for it it seems.

Ask a Question
Discussion stats
  • 9 replies
  • 1762 views
  • 11 likes
  • 5 in conversation