Hello everyone!
We have a table in the database with Cyrillic characters.
To work correctly, the format of this field must be 2 times longer than the length of the field in the database. When we use SAS Access to oracle, sas sets the correct format 2-4 times larger. But when we use SAS Access to ms server, the format is equal to the length of the field.
How can I fix this so that when updating metadata, the format of the field becomes 2-4 times the length of the field in the database?
@John_Wick wrote:
Metadata is updated via the SAS Management Console.
The type of field in databases is varchar.
Tiers:
1) Oracle encoding is utf8, SAS session is cp1251.
2) MS Server encoding is utf-8, SAS session is utf-8.
As I understand it, cp1251 is a single byte character set. and UTF-8 is multi-byte. https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.2/nlsref/n0ck4l40qlw2pmn1lbuj2zo1izxi.htm
I think the difference in SAS session encodings is probably causing the difference. In your cp1251 SAS session that talks to Oracle, SAS knows it is going from a single byte character set to multi-byte, so it expands the lengths and formats of variables. In your utf-8 SAS session that talks to MS SQL Server, since both sessions are UTF-8 SAS doesn't need to modify the lengths of variables. You could try testing this by using SAS Access to SQL Server from the SAS session with cp1251 encoding, and see if it then does the variable length expansion you are expecting.
Please post the current code you are using to update both Oracle and SQL server. Also please provide details of the column definitions in the databases. Is it varchar by any chance? Also can you check your SAS encoding setting by running this:
proc options option=encoding;
run;
@John_Wick wrote:
Metadata is updated via the SAS Management Console.
The type of field in databases is varchar.
Tiers:
1) Oracle encoding is utf8, SAS session is cp1251.
2) MS Server encoding is utf-8, SAS session is utf-8.
As I understand it, cp1251 is a single byte character set. and UTF-8 is multi-byte. https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.2/nlsref/n0ck4l40qlw2pmn1lbuj2zo1izxi.htm
I think the difference in SAS session encodings is probably causing the difference. In your cp1251 SAS session that talks to Oracle, SAS knows it is going from a single byte character set to multi-byte, so it expands the lengths and formats of variables. In your utf-8 SAS session that talks to MS SQL Server, since both sessions are UTF-8 SAS doesn't need to modify the lengths of variables. You could try testing this by using SAS Access to SQL Server from the SAS session with cp1251 encoding, and see if it then does the variable length expansion you are expecting.
Depending on how you are updating the database the DBTYPE option might be useful in overriding the default database column lengths although obviously that will only work for new tables.
The SAS Users Group for Administrators (SUGA) is open to all SAS administrators and architects who install, update, manage or maintain a SAS deployment.
SAS technical trainer Erin Winters shows you how to explore assets, create new data discovery agents, schedule data discovery agents, and much more.
Find more tutorials on the SAS Users YouTube channel.