04-04-2014 04:02 AM
Hopefully this is in the right place as it's my first question.
We are developing an application where part of the data handling requirements is to read and write local SAS files (Server comes later).
We prefer not to have any specific SAS assembly references in our code.
Following the cookbook recipe, we can write SAS files using the IOM Provider in _Local_ mode.
However, the examples still use ADODB, which is only possible in 32-bit processes, as no 64-bit version of ADODB.dll exists (or ever will).
I have tried rewriting in ADO .NET using a DataAdapter but this is doomed to failure because the IOM provider does not support parameterised queries.
My only other option appears to be to repeatedly construct and execute single row SQL insert queries.
This works, but is, not surprisingly, very slow.
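For context, the slow approach amounts to building and executing one INSERT statement per row. A minimal Python sketch of the SQL being generated (table and column names are hypothetical; the real code is .NET and executes each statement through the IOM OLE DB provider):

```python
# Sketch of the single-row approach: one INSERT statement per row.
# Each statement means a separate round trip to the provider, which is
# why this is slow. Table/column names are hypothetical examples.

def single_row_inserts(rows):
    """Build one INSERT statement per (name, value) row."""
    stmts = []
    for name, value in rows:
        stmts.append(
            "insert into work.results (name, value) "
            f"values ('{name}', {value})"
        )
    return stmts

stmts = single_row_inserts([("a", 1), ("b", 2), ("c", 3)])
```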
Can anyone suggest a better method to achieve this?
04-06-2014 06:08 PM
There is no easy way to write to SAS datasets.
I guess when you talk about "single row SQL insert queries" your program inserts one row and waits for the OLE DB provider to finish before sending the next row. So you might get a performance boost if you insert multiple rows with a single SQL statement. You can insert multiple rows with a SET or VALUES clause (see http://support.sas.com/documentation/cdl/en/proc/61895/HTML/default/viewer.htm#a002294531.htm for example).
If you find a better way of writing a SAS dataset with OLE DB, please post your solution here. I've been looking for one for years. It's about time someone wrote an awesome ADO.NET Entity Framework provider for SAS tables :smileylaugh:
04-09-2014 03:37 AM
Andreas, many thanks.
Using multiple VALUES sets in the insert query does give a very significant performance boost - it reduces my test write time from 8s to 2s, writing 5 rows instead of 1 row in each insert.
The next question is, does anyone know the character size limit of the query text? I could then dynamically adjust the number of rows I write for each insert to be the maximum within this limit.
04-21-2014 03:03 PM
it reduces my test write time from 8s to 2s, writing 5 rows instead of 1 row in each insert.
That is still very slow. I guess most of that time is spent opening the dataset and initializing the driver. Can you try 1000 rows and measure the performance?
The next question is, does anyone know the character size limit of the query text?
There is no documented limit. Maybe 32K (32768) characters.
As a best practice you should try different lengths. If there is a limit, it may differ between SAS versions and between x86 and x64.
04-28-2014 05:15 AM
I didn't say how many rows I wrote in total, only how many I wrote in each operation! That was for 1000 rows.
I've now adjusted my code to dynamically adjust the number of rows per operation, based upon the query length as it is built up.
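That length-capped batching can be sketched as follows in Python (the original code is .NET, but the logic is the same; the table/column names and the 32K default are assumptions, since no limit is documented):

```python
# Greedily pack VALUES clauses into INSERT statements, starting a new
# statement whenever adding the next clause would exceed max_len.
# Table/column names are hypothetical; 32768 is only a guessed limit.

def batched_inserts(rows, max_len=32768):
    """Split (name, value) rows into length-capped INSERT statements."""
    prefix = "insert into work.results (name, value)"
    stmts, current = [], prefix
    for name, value in rows:
        clause = f" values ('{name}', {value})"
        # Flush the current statement if the next clause would not fit
        # (but always accept at least one clause per statement).
        if current != prefix and len(current) + len(clause) > max_len:
            stmts.append(current)
            current = prefix
        current += clause
    if current != prefix:
        stmts.append(current)
    return stmts

rows = [(f"r{i}", i) for i in range(50)]
stmts = batched_inserts(rows, max_len=200)
```

Each statement stays under the cap while every row is still written exactly once.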
Thanks for your input.
04-28-2014 06:18 AM
Sorry keV, you mentioned several issues and I was focused on your 32-bit/64-bit issue. That one is more common, and the usual approach is to have all components at the same bitness. That is what my link was for.
Rereading your question, I have more questions than answers. You are building something in .NET; since Enterprise Guide and AMO are also .NET applications, you are building some of that functionality in the same way. The behavior of IOM and COM will be similar. I know those tools better; I have never needed to build anything in ADO.NET myself, though Andreas has, as far as I know.
The "COM" option is used for a local SAS dataset. It starts a SAS process for that; you should be able to see it running.
In your .NET code you will see CreateWorkspaceByServer and RemoveWorkspace; those must be the calls doing that.
Since starting a SAS process takes a long time (many seconds), repeatedly starting and stopping it will cost a lot of time.
If the SAS process or the datasets behind it are badly tuned, you will get bad performance.
As far as I can see, you can submit SAS language code using ADO.
(SAS(R) 9.3 Integration Technologies: Windows Client Developer's Guide Using Local SAS with ADO - sample )
When you go for an IOM server, a SAS process also needs to run. I recognize both the pooled and the normal workspace servers as being started by an object spawner (service).
The normal workspace server normally runs (at the OS level, on Unix) under a personal account, and that password/login must be propagated. The start time of that process adds to the response time. The pooled workspace server runs under a shared account (like the STP server), and because of that sharing you will not see the process start-up overhead.