Hi,
Currently we have defined SAS encoding for one of the Level (out of 5) as UTF 8 . All others are in latin - 9.
But in the UTF - 8 session, when I tried to extract the data from different data source from SAS , the data gets truncated in the sas dataset. We have identified that the values which got truncated has special characters.
We have defined the logic for extraction as upcase(trim(left(ColumnA))). But when I remove the upcase in the logic , then I dont see any problem and no data truncation is happening.
I dont understand about the UTF - 8 and upcase dependency on the data truncation ??
Should I need to remove the upcase for these scenario?? Please let me know.
Thanks in advance.
When working with multi-byte char-variables, you have to use the k-functions (ktrim, kupcase etc.), because only those function are capable of handling chars that need more than one byte to be stored correctly.
Yes, this is pure pain.
Try CVP engine.
libname x cvp 'c:\temp' CVPMULTIPLIER=2;
@helannivas88 wrote:
Thanks Ksharp for the info.
But we are using proc sql to connect to the data source as like below
proc sql;
connect to DB2 (datasrc=&Datasrc AUTHDOMAIN="ABC") ;
create table WORK.TEMP_ABC as
select
upcase(trim(kleft(ColA))) as ColA
);
disconnect from DB2;
quit;
We dont store the data in the permanent dataset. Its just a work dataset and then finally append into the database.
Where to use CVPmulitpler option in the proc sql?
THanks
Your SQL code is incomplete. Please post the complete code for diagnosis.
Please find the sas code
proc sql;
connect to DB2 as SRC (datasrc=&Datasrc AUTHDOMAIN="ABC") ;
create table WORK.TEMP as
select
Upcase(trim(left(MSG_DESC))) as MSG_DESC,
MONOTONIC() as ROW_NUM,
input("&SYSDATE9 &SYSTIME",DATETIME25.6) as LOADTM format=DATETIME25.6 informat=DATETIME25.6
from
(
select * from connection to SRC
(
select * from &Schema..TABLENAME
)
);
disconnect from SRC;
quit;
proc append data=WORK.TEMP base=STG.STG_TAB
(
BULKLOAD=YES
BL_METHOD=CLILOAD
);
run;
Please show more details about how you are transferring the data from your remote database into SAS.
I doubt if the UPCASE() function by itself is the real issue, but its might perhaps cause SAS to use a different sequence of operations that could result in truncation.
The reason you might get truncation is that you could have characters that requires only one byte in LATIN-9, but requires two or more bytes in UTF-8. Perhaps by converting to uppercase you are introducing such a character.
You probably just need to make your variable(s) longer to allow room for the extra bytes.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.