03-02-2015 08:35 AM
Hello everyone. I have some sas code that writes to a sql table. I did not set up the sql table and I don't control the operations that are working on it.
It looks like the person who set up the table is also having an application write to and read from the table. His application is sometimes deadlocking the table, resulting in my attempted writes to the table failing.
Is there a way in base sas to test if a sql server is locked before running a transaction against it?
If no: if there a macro variable that is created that is different from &Sqlrc that would return a code unique to "Deadlock" so that I would know the cause of the error?
The main problem I am having is the code is failing on insert, and returning a sqlRc of 8. I need it to be more descriptive, as this is the same code that returns for numerous reasons.
03-02-2015 09:04 AM
First, SAS isn't my first choice of application to run RDBMS transaction processing. Or didn't you intend to do so?
What is the underlying requirement, and how does these deadlocks occur?
Without knowing these things,it sounds like you need tot do some redesign.
03-02-2015 09:11 AM
I can only use sas, as that is all I have access too. Is SAS not one of the most used ETL tools in the world? If it is, why would it be bad at RDBMS transcation processing? (i'm curious).
The requirements are to write to a set of tables after a lengthly ETL. However the tables are getting dead locked by a completely independent Application reading and writing to them.
I have no idea how the dead locks are happening for sure, but i'm 90% confident that when the Application is doing deletes and updates to these tables there is no transaction level processing within teh app code. THerefore the app transactions are locking each-other out. I "know" this because even before my ETL in SAS started the process was getting dead-locks (which was learned from the developer after the ETL was written).
As such, I just need a way to let the developer know his tables are deadlocked, and also to let the sas code know so it doesn't try to write to the tables but instead sends out an email when this occurs.
In terms of a re-deisgn, what would you do for the following situation.. WE have an app that reads/ writes to these tables. SAS is the base ETL that has to pull from Many different systems and write into these tables. We need the writes to never be blocked, and the application to never be blocked from reading... What would you recommend in that case? If the answer is NOT SAS please elaborate as to why
03-02-2015 09:18 AM
SAS and ETL - yes!
But have SAS ETL load the same tables, that an external application is updating as well, sounds like it's bad design, or am I misunderstanding?
You are sure that your own ETL jobs are not locking each other out? You need to control this in your scheduling/job flows.
03-02-2015 09:26 AM
Okay thanks for the clarification I was not sure what you were saying haha!
And yes, you are understanding it correctly, and yes I 100% agree it is a bad design. (I think it's a bad design. This is my first ETL that other applications are touching the table with..). Can you give me an outline of how you would switch this up and design it differently? The person who set it up argued that if I wrote to stage tables, we would have to write an SSIS etl that copied the stage table data to "other" final tables that the app touched, and it would be more problamatic than the Application touching the tables I was writing too. I know nothing about how the application works, so I assumed he knew what he was talking about (could have been an incorrect assumption).
I am pretty sure that my code isn't locking the table. Having said that my evidence is based upon 'belief' and not any strong evidence. My belief is because for a very long period my ETL wasn't touching these tables, and the persons code was still dead-locking when users were updating fields. What i'm almost confident is happening if that if 2 users have updates within 5 seconds of eachother, the table is locking itself while trying to do both updates (Because the app developer didn't specify the level of locking or the inherently transaction level of his events). Therefore it is defaulting to a full table lock, and then throwing deadlocks.
I am not sure how to test for A sql server table being dead-locked in a SAS etl, and that's kinda the point of my question. I want to "confirm" (or know) that my process isn't locking the table by setting the correct properties. Given that a re-design is entirely out of my hands (I will meet with the app designer and see if I can get them to redesign it....) I want to do whatever I can to make things work better that is within my power.
Does that help? (Also thanks for your help)
03-03-2015 03:03 AM
Manipulating a table in an external database is NOT ETL.
Extract, Transform, Load. After the Extract part, the external database should be history, and everything else should be local to SAS.
Even if you use a relational database as storage for SAS, this would be a private DB for SAS, and no dependencies with non-SAS processes (that could loclk tables) would exist.
If you need data created in the data warehouse to be exported to another database, set up a process that is synchronized with other operations (or uses a service window).
03-03-2015 04:51 AM
Stop using a database underneath SAS. When using base SAS tables, these are represented as individual files on the OS level, and can be deleted even if opened by another process (at least on UNIX). After deletion, recreating them is no problem.
If by "sql table" you actually mean a base SAS dataset that is just written by proc sql, I can show you my method of avoiding crashes because of locked (opened) files.
03-03-2015 06:00 AM
Can't say I agree with you @KurtBremser on the last one.
Even though it's easier from a SAS perspective to use SAS tables, there are many other concerns when chosing a data store. On of the fine things about SAS is you can read and update almost any database on the market. It's an architectural job to design these connections the right way.
Which brings me to the question.
Still, we know too little about your application to give some concrete advice.
What kind of data are you loading via ETL?
What kind of updates are done by the external application?
Perhaps it's different attributes, so a redesign of the data model could help...?
Sounds like you are guessing about the deadlocks. Go to your DBA's and have them help you to find the cause.
03-03-2015 09:57 AM
Sorry Kurt but I have to disagree with basically both of your posts completely. That's just my opinion, but I feel it for many many reasons. SAS it perfectly able to use a SQL SERVER database as it's underlying storage, and I don't see WHY you wouldn't do this. Especially if those tables are accessed from reporting systems that are independent of SAS.
Example: IF we use microstrategy as our BI reporting tool, but SAS as our ETL tool, then SAS tables already don't make sense, and SQL server (or oracle, or some other database system) is really the ONLY way to go. Obviously this isn't true if one uses SSIS for their ETL tool, but I don't.
I AM NOT MANIPULATING DATA IN THIS DATABASE TABLE. I am inserting data into this database table. Then another independent process is manipulating the data in this table.... I AM doing an ETL because I am EXTRACTING data from multiple sources (excel files, other databases, etc..), then TRANSFORMING the data through sas datasets, and then LOADING the results into this table in order for a user application to be able to read and display the data to users. This data wouldn't exist without the ETL by definition.
However I also do not agree with you that manipulating data in a database table is not an ETL. by definition you are extracting data from a source, transforming it and then loading it back into that source.. I'm not saying that's ever a good idea, but i'm saying it is still an ETL (at least the way I was taught).
Linus Hjorth, thanks for your continued support. We did have the DBA confirm yesterday that the other persons application is causing the deadlocks on the table. His application is doing read committed from the table, and also doing updates to the same tables within seconds of eachother, and he didn't set the locking level meaning it was defaulting to table locks. Therefore when two updates tried to run within 1/10 of a second of eachother the table was deadlock itself. He is going to work to fix the problems his application is causing, but the original question still remains. How would sas tell if a sql server database table was deadlock before trying to do anything against it?
As an fyi for you, my process writes every column to the database table except for two. His process then reads every column and runs Update clauses joining on a key (loannumber) and updates those two columns. however his application also has the ability to update the other columns I populate. Therefore he is 'updating' every column that I am loading (possibly).
I think I know where you were going tho. I imagine you were going to suggest that I only load specific columns to a specific table, and whatever columns can be updated by the application would be segregated into a table that only the application touches that way the two systems can't lock each-other out (I suggested that a while ago).
03-04-2015 04:35 AM
Sorry, I don't know how to check for a deadlock. If there are some SQL Server specific code to do that (check with your DBA), you can have that execute in a explicit SQL pass-thru block, and try to evaluate the result of that somehow.
I'm just have little more than a general knowledge of SQLS, but since you are just inserting, you could perhaps do it in a separate partition - don't know if that solves any deadlocks.
If you consider some kind of redesign I think you need an architect onsite that can evaluate IT and business requirements (for both you ETL and the external app) and propose a solution.
03-04-2015 07:46 AM
I wish we had an architect.... The app developer is basically the architect of his apps database.... and when I have my own projects I am an architect for my projects... Neither of us have any kind of training in database design.... It leads to interesting situations like this.
Alright well thank you for your help. I also don't know of a way to check for dead-locks so i'll chalk this up to impossible given your advice.