Richardvan_tHoff
Obsidian | Level 7

I import an Excel file with PROC IMPORT into a SAS table and then load that data into a Teradata table.

I get an error when I load the data into the Teradata table: "Bad character".

 

We checked the Excel file, and in certain rows we found hidden non-printable characters (ZWSP and NBSP).

 

We are trying to filter those characters out with the following code.

DATA work.filtered;
   SET work.unfiltered;
   name_new = PRXCHANGE('s/\x{200B}//',-1, name); /* does not work to filter out ZWSP */
   name_new = PRXCHANGE('s/\&zwsp;//',-1, name);  /* does not work to filter out ZWSP */
RUN;

 

What am I doing wrong, or is there another solution within SAS?

1 ACCEPTED SOLUTION
Amir
PROC Star

Hi,

 

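Consider using the compress() function with 'kw' as the modifiers: 'k' keeps the listed characters instead of removing them, and 'w' adds the printable characters to the list, so only printable characters are kept.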

 

For example:

 

data filtered;
  set unfiltered;
  name_new = compress(name,,'kw');
run;
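
To confirm which bytes are actually being removed, it can help to print the value in hexadecimal before and after. A minimal sketch, assuming a UTF-8 session and the unfiltered table and name variable from the question; the obs=10 limit, the $200 length and the $hex80. width are arbitrary choices for illustration:

```sas
data _null_;
  set unfiltered(obs=10);            /* look at the first few rows only */
  length name_new $200;              /* assumed length; match your own variable */
  name_new = compress(name,,'kw');   /* same call as in the example above */
  /* In UTF-8, NBSP appears as C2A0 and ZWSP as E2808B in the hex dump */
  put name= $hex80. / name_new= $hex80.;
run;
```

Comparing the two hex strings in the log shows whether the unwanted byte sequences are gone and whether anything else was removed along the way.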

 

 

 

Thanks & kind regards,

Amir.

 

Edit: Added code sample.


3 REPLIES
Tom
Super User

I assume your data and SAS session are using UTF-8 encoding, since that is needed to actually represent a zero-width space character.

https://www.fileformat.info/info/unicode/char/200b/index.htm
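
If you are not sure which encoding is in use, it can be checked before choosing between the approaches below. A small sketch, assuming the table is work.unfiltered as in the question:

```sas
/* Writes the session encoding (for example UTF-8 or WLATIN1) to the log */
%put NOTE: Session encoding is %sysfunc(getoption(encoding));

/* The PROC CONTENTS output lists the encoding recorded for the table */
proc contents data=work.unfiltered;
run;
```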

 

Do you want to REMOVE those characters completely?

name_new = kcompress(name,'C2A0E2808B'x); 

Or replace them with spaces?

name_new = ktranslate(name,'  ','C2A0E2808B'x);

Or perhaps remove the ZWSP and replace the NBSP?

name_new = kcompress(ktranslate(name,' ','C2A0'x),'E2808B'x);
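
The three variants can be tried on a constructed value before running them on real data. A rough sketch, assuming a UTF-8 session; the sample text and the variable names removed and replaced are made up for illustration:

```sas
data _null_;
  length name removed replaced $40;
  /* Hypothetical test value: 'Jan' + NBSP + 'de' + ZWSP + 'Vries' as UTF-8 bytes */
  name     = 'Jan' || 'C2A0'x || 'de' || 'E2808B'x || 'Vries';
  /* Remove both characters */
  removed  = kcompress(name, 'C2A0E2808B'x);
  /* Turn NBSP into a normal space, then remove ZWSP */
  replaced = kcompress(ktranslate(name, ' ', 'C2A0'x), 'E2808B'x);
  /* Hex dump of the input, then the two cleaned values */
  put name= $hex40. / removed= / replaced=;
run;
```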

 

If your SAS session is using a single-byte encoding, like WLATIN1, then you will need to use TRANWRD() to replace the characters.

name_new = tranwrd(tranwrd(name,'C2A0'x,' '),'E2808B'x,' ');

Or to remove the ZWSP use TRANSTRN.

name_new = transtrn(tranwrd(name,'C2A0'x,' '),'E2808B'x,trimn(' '));
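
As for the PRXCHANGE attempt in the question: the PRX functions appear to work on bytes rather than Unicode code points, which would explain why \x{200B} never matches. An untested sketch under that assumption, matching the UTF-8 byte sequences instead (table and variable names are taken from the question):

```sas
data filtered;
  set unfiltered;
  /* E2 80 8B is the UTF-8 byte sequence for ZWSP: remove it */
  name_new = prxchange('s/\xE2\x80\x8B//', -1, name);
  /* C2 A0 is the UTF-8 byte sequence for NBSP: replace it with a normal space */
  name_new = prxchange('s/\xC2\xA0/ /', -1, name_new);
run;
```

If this does not behave as expected in your environment, the KCOMPRESS and KTRANSLATE versions above are the safer choice.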
Richardvan_tHoff
Obsidian | Level 7
Both solutions work. Thank you both.
