SAS Data Integration Studio, DataFlux Data Management Studio, SAS/ACCESS, SAS Data Loader for Hadoop and others

SAS query to SQL Server DB returns duplicate ID

Posts: 23

SAS query to SQL Server DB returns duplicate ID

[ Edited ]

We have a number of SAS DI jobs that query MS SQL Server (2008) via OLE DB.  I recently noticed that the result was returning duplicates from a field with a primary key contraint - so this shouldn't be possible.  


On further inspection I can recreate the error from the ETL job that is returning two fields from the source table.  However, these fields are being processed in SAS (number >> character and a data extraction) - so no 'raw' data is returned from the SQL query.  It appears that the duplicate primary keys are coming through because the source table is being modified by the source system during the SAS query execution (as shown by a record change datetime field).  I can recreate this error over and over again in a simple macro.


When I change the macro to first - Select * from sourceTable - and then perfrom the SAS conversions to return the two fields (see above) no duplicate primary keys come through.


So it would appear that running a SQL statement that does SAS specific conversions vs. running a SQL statement that (as far I know) uses SQL pass-thru - ends up returning duplicates even through there is a contraint on the source table.  So SAS must not be locking the table (or getting a lock from the DMBS).


Is anyone able to point me at some technical details that cover this issue and why it occurs.  The issue is a little perplexing in that I have had to investigate something that I've been told cannot occur.  Furthermore, this now appears to be a SAS specific issue when we have blamed the source system (MS SQL) because of being told that SAS would get a lock on the table and the cause cannot be a SAS issue.


Hope this makes sense but happy to elaborate further if more information is needed.

Super User
Posts: 9,919

Re: SAS query to SQL Server DB returns duplicate ID

We're using a completely different RDBMS (UDB or DB/2), but we run our extracts from snapshots, so we always get consistent states.

Freezing a snapshot is necessary to prevent not only inconsistencies in tables, but also between them. Consider running critical extracts during a time window where the DB is stopped, if you can't create snapshots.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
Super User
Posts: 5,852

Re: SAS query to SQL Server DB returns duplicate ID

Agree that you have a basic integrity issue in your extract logic.
But regardless it sounds odd that a record will duplicate during a select because of update...Perhaps you should involve SAS tech support to dig into this...?
Data never sleeps
Posts: 810

Re: SAS query to SQL Server DB returns duplicate ID



Just a very wild guess, have a look at READ_ISOLATION_LEVEL option



Ask a Question
Discussion stats
  • 3 replies
  • 1 like
  • 4 in conversation