I import an Excel file into a SAS table with PROC IMPORT and then load that data into a Teradata table.
I get an error when I load the data into the Teradata table: bad character.
We checked the Excel file and found hidden non-printable characters (ZWSP and NBSP) in certain rows.
We are trying to filter those characters out with the following code.
DATA work.filtered;
SET work.unfiltered;
name_new = PRXCHANGE('s/\x{200B}//',-1, name); /* does not work to filter out ZWSP */
name_new = PRXCHANGE('s/\&zwsp;//',-1, name); /* does not work to filter out ZWSP */
RUN;
What am I doing wrong, or is there another solution within SAS?
Accepted Solutions
Hi,
Consider using the compress() function with the 'kw' modifiers: 'w' selects the printable characters and 'k' keeps them instead of removing them, so only printable characters remain.
For example:
data filtered;
set unfiltered;
name_new = compress(name,,'kw');
run;
Thanks & kind regards,
Amir.
Edit: Added code sample.
I assume your data and SAS session are using UTF-8 encoding, so that they can actually represent a zero-width space character.
https://www.fileformat.info/info/unicode/char/200b/index.htm
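Since everything below depends on the session encoding, it may help to confirm it first. A minimal sketch, assuming nothing beyond the automatic macro variable SYSENCODING and the ENCODING system option:

```sas
/* Write the session encoding to the log (for example utf-8 or wlatin1). */
%put NOTE: Session encoding is &sysencoding;

/* Alternative: display the ENCODING system option in the log. */
proc options option=encoding;
run;
```

If the log reports utf-8, the K functions shown next should apply; otherwise the byte-level TRANWRD approach further down is the likelier fit.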
Do you want to REMOVE those characters completely?
name_new = kcompress(name,'C2A0E2808B'x);
Or replace them with spaces?
name_new = ktranslate(name,' ','C2A0E2808B'x);
Or perhaps remove the ZWSP and replace the NBSP?
name_new = kcompress(ktranslate(name,' ','C2A0'x),'E2808B'x);
If your SAS session is using a single byte encoding, like WLATIN1, then you will need to use TRANWRD() to replace the characters.
name_new = tranwrd(tranwrd(name,'C2A0'x,' '),'E2808B'x,' ');
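Or, to remove the ZWSP rather than replace it with a space, use TRANSTRN with TRIMN(' ') as the replacement (TRANWRD substitutes a single blank when the replacement is empty).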
name_new = transtrn(tranwrd(name,'C2A0'x,' '),'E2808B'x,trimn(' '));
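To check the result, the byte-level variant above can be tried on a small test value and inspected in hex. This is only a sketch: the data set names work.demo_in and work.demo_out and the test string are hypothetical, and the hex literals assume the source bytes are the UTF-8 encodings of NBSP ('C2A0'x) and ZWSP ('E2808B'x).

```sas
/* Hypothetical test data: 'AB' + ZWSP + 'CD' + NBSP + 'EF' as raw UTF-8 bytes. */
data work.demo_in;
  length name $40;
  name = 'AB' || 'E2808B'x || 'CD' || 'C2A0'x || 'EF';
run;

data work.demo_out;
  set work.demo_in;
  length name_new $40;

  /* Replace NBSP with a regular space, then delete ZWSP entirely. */
  /* TRIMN(' ') yields a zero-length string, which TRANSTRN accepts as a replacement. */
  name_new = transtrn(tranwrd(name, 'C2A0'x, ' '), 'E2808B'x, trimn(' '));

  /* Write the first 16 bytes of each value to the log in hex for comparison. */
  put name= $hex32.;
  put name_new= $hex32.;
run;
```

In the log, the E2808B and C2A0 byte sequences should appear in name but not in name_new, where the NBSP position should show 20 (a regular space). Running the same PUT statements against the real imported column is a quick way to confirm which bytes are actually present before choosing a fix.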