BookmarkSubscribeRSS Feed
helannivas88
Obsidian | Level 7

Hi,

 

Currently we have defined SAS encoding for one of the Level (out of 5) as UTF 8 . All others are in latin - 9.

 

But in the UTF - 8 session, when I tried to extract the data from different data source from SAS , the data gets truncated in the sas dataset. We have identified that the values which got truncated has special characters.

 

We have defined the logic for extraction as upcase(trim(left(ColumnA))). But when I remove the upcase in the logic , then I dont see any problem and no data truncation is happening.

 

I dont understand about the UTF - 8 and upcase dependency on the data truncation ?? 

 

Should I need to remove the upcase for these scenario?? Please let me know.

 

Thanks in advance.

8 REPLIES 8
andreas_lds
Jade | Level 19

When working with multi-byte char-variables, you have to use the k-functions (ktrim, kupcase etc.), because only those function are capable of handling chars that need more than one byte to be stored correctly.

Yes, this is pure pain.

helannivas88
Obsidian | Level 7
Db2 is the data source
Ksharp
Super User

Try CVP engine.

 

libname x cvp 'c:\temp' CVPMULTIPLIER=2;
helannivas88
Obsidian | Level 7
Thanks Ksharp for the info.
But we are using proc sql to connect to the data source as like below

proc sql;
connect to DB2 (datasrc=&Datasrc AUTHDOMAIN="ABC") ;
create table WORK.TEMP_ABC as
select
upcase(trim(kleft(ColA))) as ColA
);
disconnect from DB2;
quit;

We dont store the data in the permanent dataset. Its just a work dataset and then finally append into the database.

Where to use CVPmulitpler option in the proc sql?

THanks
Kurt_Bremser
Super User

@helannivas88 wrote:
Thanks Ksharp for the info.
But we are using proc sql to connect to the data source as like below

proc sql;
connect to DB2 (datasrc=&Datasrc AUTHDOMAIN="ABC") ;
create table WORK.TEMP_ABC as
select
upcase(trim(kleft(ColA))) as ColA
);
disconnect from DB2;
quit;

We dont store the data in the permanent dataset. Its just a work dataset and then finally append into the database.

Where to use CVPmulitpler option in the proc sql?

THanks

Your SQL code is incomplete. Please post the complete code for diagnosis.

helannivas88
Obsidian | Level 7

Please find the sas code

 

proc sql;
connect to DB2 as SRC (datasrc=&Datasrc AUTHDOMAIN="ABC") ;
create table WORK.TEMP as
select
Upcase(trim(left(MSG_DESC))) as MSG_DESC,
MONOTONIC() as ROW_NUM,
input("&SYSDATE9 &SYSTIME",DATETIME25.6) as LOADTM format=DATETIME25.6 informat=DATETIME25.6
from
(
select * from connection to SRC
(
select * from &Schema..TABLENAME
)
);
disconnect from SRC;
quit;

 

proc append data=WORK.TEMP base=STG.STG_TAB
(
BULKLOAD=YES
BL_METHOD=CLILOAD
);
run;

Tom
Super User Tom
Super User

Please show more details about how you are transferring the data from your remote database into SAS.

I doubt if the UPCASE() function by itself is the real issue, but its might perhaps cause SAS to use a different sequence of operations that could result in truncation.

The reason you might get truncation is that you could have characters that requires only one byte in LATIN-9, but requires two or more bytes in UTF-8.  Perhaps by converting to uppercase you are introducing such a character.

 

You probably just  need to make your variable(s) longer to allow room for the extra bytes.

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 8 replies
  • 1826 views
  • 1 like
  • 5 in conversation