tl;dr; For a case-sensitive database in SQL Server, the ODBC engine properly handles column names, but not table names. Note this is related to https://communities.sas.com/t5/SAS-Data-Management/Case-sensitivity-issue-with-ODBC-engine-and-SQL-Server-tables/td-p/349681 Let's say a SAS/Access engine is meant to access an external RBDMS table, Excel file, etc. as though it were a "virtual SAS dataset". Perhaps a sweeping, overly generalized statement, but let's go with it for now. Furthermore, let's say that consistency in how that engine works is a good thing. In this case, I’m referring to consistency in treating both table names and column names in a case-insensitive manner, in the same way that SAS treats SAS datasets. I'm working with SQL Server. I need to work with both case-insensitive tables (Latin1_General_CI_AI collation), and case-sensitive tables (Latin1_General_BIN collation). Some preliminaries/background: Latin1_General_CI_AI: Table names and column names are case-insensitive (in SQL Server itself) Filtering (WHERE clause) is case-insensitive, eg. WHERE current_record='Y' will return Y, y, and various accented y chars (if present in the data) (differs from SAS). Data is sorted (ORDER BY) in a case-insensitive manner (not in ASCII code order, and not matching SAS's default collation/sortseq) Latin1_General_BIN: Table names and column names are case-sensitive (in SQL Server itself). For the SAS programmer, this can be a "pain". Filtering (WHERE clause) is case-sensitive, eg. WHERE current_record='Y' will return Y only (like SAS). Data is sorted (ORDER BY) in a case-sensitive manner (in ASCII code order, and matching SAS's default collation/sortseq). Some test code: Create two databases in SQL Server, say CASE_INS and CASE_SEN. Configure the collation for CASE_INS as Latin1_General_CI_AI. Configure the collation for CASE_SEN as Latin1_General_BIN. Run this code in SQL Server Management Studio for both databases: USE CASE_INS -- then change to CASE_SEN and re-run
GO
DROP TABLE TeSt
CREATE TABLE TeSt (
FOO VARCHAR(3)
,Bar VARCHAR(5)
,BlAh VARCHAR(7)
)
INSERT INTO TeSt VALUES ('foo','BAR','Blah')
SELECT foo, bar, blah FROM test WHERE bar='bar'
SELECT FOO, Bar, BlAh FROM TeSt WHERE Bar='BAR' Note that both SELECT queries work for CASE_INS, but only the second query works in CASE_SEN. Now let's work with these tables in SAS: * set desired SAS options ;
options sastrace=',,,d' sastraceloc=saslog nostsuffix;
options msglevel=I;
options fullstimer;
options mprint mrecall;
options compress=binary;
options ls=max;
options nocenter;
options lognumberformat=1;
libname CASE_INS ODBC NOPROMPT="Driver={SQL Server Native Client 10.0};Server=MYSERVER;Database=CASE_INS;Trusted_Connection=yes;" schema=dbo;
libname CASE_SEN ODBC NOPROMPT="Driver={SQL Server Native Client 10.0};Server=MYSERVER;Database=CASE_SEN;Trusted_Connection=yes;" schema=dbo;
* case insensitive SQL Server database ;
* Note that all these scenarios work as expected ; * (with the possible exception of the case-insensitive where clause results) ;
data test1;
set case_ins.test (keep=foo bar blah);
where bar='bar';
by bar;
run;
data test2;
keep foo bar blah;
set case_ins.test;
where bar='bar';
by bar;
run;
data test3;
format foo bar blah;
set case_ins.test;
where bar='bar';
by bar;
run;
* case sensitive SQL Server database ;
* the below two scenarios fail ;
data test4;
set case_sen.test (keep=foo bar blah);
where bar='bar';
by bar;
run;
data test5;
set case_sen.TEST (keep=foo bar blah);
where bar='bar';
by bar;
run;
* this scenario correctly knows that the table exists, ;
* creates the column list in the correct case, ;
* but does not return the row. ;
* this is fine, since this is a case-sensitive table, ;
* and in fact is more in line with a where clause in SAS itself ;
data test6;
set case_sen.TeSt (keep=foo bar blah);
where bar='bar';
by bar;
run;
* this scenario works as expected ;
data test7;
set case_sen.TeSt (keep=foo bar blah);
where Bar='BAR';
by Bar;
run;
* this scenario fails due to the invalid column name ;
data test8;
set case_sen.TeSt (keep=foo bar blah baz);
where Bar='BAR';
by Bar;
run; In all of the above scenarios, we get these sastrace messages in the SAS log: ODBC: AUTOCOMMIT is NO for connection 3
ODBC: AUTOCOMMIT turned ON for connection id 3
ODBC_1: Prepared: on connection 3
SELECT * FROM "dbo"."test"
39 data test1;
40 set case_ins.test (keep=foo bar blah);
41 where bar='bar';
42 by bar;
43 run;
ODBC_2: Prepared: on connection 3
SELECT "FOO", "Bar", "BlAh" FROM "dbo"."test" WHERE ( "Bar" = 'bar' ) ORDER BY "Bar"
ODBC_3: Executed: on connection 3
Prepared statement ODBC_2 However, for the case-sensitive scenarios, we get these results: ODBC: AUTOCOMMIT is NO for connection 3
ODBC: AUTOCOMMIT turned ON for connection id 3
ODBC_10: Prepared: on connection 3
SELECT * FROM "dbo"."test"
ODBC: ROLLBACK performed on connection 3.
61 data test4;
62 set case_sen.test (keep=foo bar blah);
ERROR: File CASE_SEN.test.DATA does not exist.
63 where bar='bar';
64 by bar;
65 run; While I don't know the internals of the ODBC engine, this is my conjecture: 1) The ODBC engine is using SELECT * from "tablename" to determine if the table exists. 2) The query executes quite quickly, so it's only returning limited (zero?) rows, functionally equivalent to WHERE 0=1. 3) This query is within a TRANSACTION block (or the equivalent) and perhaps a TRY/CATCH block (or the equivalent). This conjecture is due to the ROLLBACK statement above. 4) If the query is unsuccessful, the decision is made that the table does not exist. 5) If the query is successful, the results are saved in a resultset. 6) The header of that resultset is compared to the SAS statements, in a case-insensitive manner, to generate the correct case-sensitive column list. Regardless of whether that conjecture is technically correct, IMO SELECT * from "tablename" is a poor way to determine if the table exists, since it relies on the table name being specified in the correct case. A better approach would be: SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES WHERE UPPER(TABLE_SCHEMA)='DBO' AND UPPER(TABLE_NAME)='TEST' or SELECT tbl.name AS TABLE_NAME FROM sys.tables tbl INNER JOIN sys.schemas sch ON UPPER(sch.name)='DBO' AND UPPER(tbl.name)='TEST' My recommendation: 1) Use either of the above approaches to determine if the table exists (and you have the required permissions). Check for the existence of the table in a case-insensitive manner. 2) If not, abort with table does not exist error. 3) If so, use the correct table name (including case) to load the resultset. 4) Use that resultset to build the correct column list (including case). IMO this behaviour would be more consistent with how SAS treats datasets and "virtual datasets" using case-insensitive names. Note: we are currently undergoing a SQL Server upgrade, and migrating from the default of collation of Latin1_General_CI_AI to Latin1_General_BIN, in order to match the default collation in SAS. The above behaviour of the ODBC engine will result in our need to refactor (or at least test) hundreds if not thousands of programs to ensure the table names are specified in the correct case.
... View more