My string variable contains apostrophes and foreign characters in a few observations. For example:
data have;
input var $30.;
datalines;
Let's go for dinner
pi�ata dance collective
Another 인천
The 4th line
;
run;
I want to remove the apostrophe (') in the first observation, and delete the second and third observations because they contain characters that are neither letters nor digits. Is there a good way to do this in SAS?
You can use the COMPRESS function to remove certain characters.
There are also many functions that identify different types of characters in a string.
I would try a combination of the functions starting with NOT (for example NOTALNUM, NOTALPHA, and NOTDIGIT).
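As a rough sketch of that idea, the step below combines COMPRESS with NOTALNUM. It assumes the data set have and the variable var from the question, and it assumes blanks should be tolerated. Blanks are stripped only for the test because NOTALNUM treats a blank as non-alphanumeric. NOTALNUM's notion of a letter depends on the session's encoding and locale, so accented characters may or may not be flagged in a given environment.

```sas
data want;
  set have;
  /* Remove apostrophes from the stored value */
  var = compress(var, "'");
  /* For the test only, drop blanks, then look for any character   */
  /* that is neither a letter nor a digit. NOTALNUM returns the    */
  /* position of the first such character, or 0 if there is none.  */
  if notalnum(compress(var, ' ')) > 0 then delete;
run;
```

If punctuation should be allowed as well, the test would need to strip it first, for example through the third (modifier) argument of COMPRESS.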
The garbled characters you're seeing are likely the result of a single-byte encoding in your editor or SAS session being applied to multibyte characters (e.g. UTF-8).
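To check which encoding the current SAS session uses, something like the following should work. It is only a quick diagnostic sketch: both statements write the session encoding to the log and change nothing.

```sas
/* Show the value of the ENCODING system option in the log */
proc options option=encoding;
run;

/* The SYSENCODING automatic macro variable holds the same information */
%put &=sysencoding;
```

If the log reports a single-byte encoding such as wlatin1 or latin1, multibyte characters like the Korean text in the sample data cannot be represented faithfully in that session.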
Below should satisfy what you're asking for.
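data want;
set have;
/* Remove apostrophes */
var=compress(var,"'");
/* K inverts the search: delete if any character is not a blank, letter, digit, underscore (F, N) or punctuation mark (P) */
if findc(var,' ','kfnp') then delete;
run;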
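/* Alternative using a Perl regular expression */
data want;
set have;
var=compress(var,"'");
/* Delete if any character is not a letter, digit, or whitespace (case-insensitive match) */
if prxmatch('/[^a-z\d\s]/i',var) then delete;
run;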