SAS Programming


I import an Excel file with PROC IMPORT into a SAS table and then load that data into a Teradata table.

I get an error when I load the data into the Teradata table: Bad character.

 

We checked the Excel file and found hidden non-printable characters (ZWSP and NBSP) in certain rows.

 

We are trying to filter those characters out with the following code.

DATA work.filtered;
   SET work.unfiltered;
   name_new = PRXCHANGE('s/\x{200B}//', -1, name); /* does not filter out ZWSP */
   name_new = PRXCHANGE('s/\&zwsp;//', -1, name);  /* does not filter out ZWSP */
RUN;

 

What am I doing wrong, or is there another solution within SAS?

Accepted Solution
Amir
PROC Star

Hi,

 

Consider using the compress() function with the modifiers 'kw': 'k' keeps the listed characters instead of removing them, and 'w' selects printable characters, so only printable characters are kept.

 

For example:

 

data filtered;
  set unfiltered;
  name_new = compress(name,,'kw');
run;

 

 

 

Thanks & kind regards,

Amir.

 

Edit: Added code sample.


Tom
Super User

I assume your data and SAS session are using UTF-8 encoding, since that is needed to actually represent a zero-width space character.

https://www.fileformat.info/info/unicode/char/200b/index.htm
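
That assumption can be checked by looking at the session encoding and at the raw bytes of a suspect value. A minimal sketch, assuming the table and column from the question (work.unfiltered, name); the OBS=5 limit and the $HEX40. width are arbitrary choices for illustration:

```sas
/* Write the session encoding to the log (for example utf-8 or wlatin1). */
%put NOTE: Session encoding is &sysencoding;

/* Dump the first few values as hex to see which bytes are really stored. */
/* In UTF-8 data, NBSP is stored as C2A0 and ZWSP as E2808B.              */
data _null_;
  set work.unfiltered(obs=5);
  put name= / name= $hex40.;
run;
```

$HEX40. prints only the first 20 bytes of each value, so widen it if the hidden characters sit further to the right.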

 

Do you want to REMOVE those characters completely?

name_new = kcompress(name,'C2A0E2808B'x); 

Or replace them with spaces?

name_new = ktranslate(name,'  ','C2A0E2808B'x);

Or perhaps remove the ZWSP and replace the NBSP?

name_new = kcompress(ktranslate(name,' ','C2A0'x),'E2808B'x);
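
To try these three variants without touching real data, a small test can build a value from known bytes. This is only a sketch that assumes a UTF-8 session; the variable names (removed, spaced, mixed) and the sample value are made up for illustration:

```sas
data _null_;
  length name removed spaced mixed $20;

  /* 'A' + NBSP (C2A0) + 'B' + ZWSP (E2808B) + 'C' as raw UTF-8 bytes */
  name = '41C2A042E2808B43'x;

  /* Remove both characters */
  removed = kcompress(name, 'C2A0E2808B'x);
  /* Replace both characters with spaces */
  spaced  = ktranslate(name, '  ', 'C2A0E2808B'x);
  /* Replace NBSP with a space, then remove ZWSP */
  mixed   = kcompress(ktranslate(name, ' ', 'C2A0'x), 'E2808B'x);

  /* Hex output makes the invisible characters visible in the log. */
  put name=    $hex32. /
      removed= $hex32. /
      spaced=  $hex32. /
      mixed=   $hex32.;
run;
```

If the session really is UTF-8, the hex strings of the three results should no longer contain the sequences C2A0 or E2808B.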

 

If your SAS session is using a single-byte encoding, like WLATIN1, then each of these characters is seen as a sequence of several one-byte characters, so you will need to use TRANWRD() to replace the whole byte sequence.

name_new = tranwrd(tranwrd(name,'C2A0'x,' '),'E2808B'x,' ');

Or to remove the ZWSP use TRANSTRN.

name_new = transtrn(tranwrd(name,'C2A0'x,' '),'E2808B'x,trimn(' '));
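
The same byte-level idea may also explain why the regular expressions in the question did not match, although nobody in this thread confirmed it. If the PRX functions compare the pattern byte by byte, \x{200B} can never equal the three bytes E2 80 8B that are actually stored. A hedged sketch that spells out the UTF-8 bytes instead, assuming the data holds UTF-8 byte sequences and that your SAS release treats the pattern as bytes:

```sas
data work.filtered;
  set work.unfiltered;
  /* Replace NBSP (bytes C2 A0) with a normal space. */
  name_new = prxchange('s/\xC2\xA0/ /', -1, name);
  /* Remove ZWSP (bytes E2 80 8B) completely. */
  name_new = prxchange('s/\xE2\x80\x8B//', -1, name_new);
run;
```

Treat this as an experiment rather than a replacement for the KCOMPRESS or TRANWRD versions above, which the original poster confirmed as working.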
Richardvan_tHoff
Obsidian | Level 7
Both Solutions work. Thank you both.
