Hi,
How would we select all the address fields in a dataset that contain foreign characters (not the regular symbols like @, #, $), such as French or German accented letters?
Can you be more specific: which characters are OK and which are not?
This is for addresses in the US:
Characters like . , - space & are allowed.
French accents are not allowed.
German characters like ö are not allowed.
The easiest way would be to use the TRANSLATE function in a data step. The fun part is getting the correct codes, as your editor font may not match the font you are used to looking at, not to mention potential UNICODE or other encoding issues.
Something that might look like this:
string = translate(string,'AAAAA','ÀÁÂÃÄ');
You really want to look at the documentation for TRANSLATE, as it does positional replacement and the target and source strings need to match carefully. Also, the order of parameters (the replacement characters come before the characters being replaced) seems backwards to most people I've discussed this with.
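To make the argument order concrete, here is a minimal sketch using plain ASCII letters so that encoding does not get in the way. The variable names original and recoded are made up for illustration.

```sas
data _null_;
  length original recoded $ 8;
  original = 'XYZW';
  /* TRANSLATE(source, to, from): each character listed in FROM is     */
  /* replaced by the character at the same position in TO.             */
  /* Here V would become A and W would become B.                       */
  recoded = translate(original, 'AB', 'VW');
  put original= recoded=;
run;
```

V does not occur in the source and W maps to B, so the log should show recoded=XYZB. Swapping the second and third arguments by mistake would leave the string unchanged in this case, which is why the order is worth double-checking.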
If the input data is straight ASCII or EBCDIC, then this code will build a data set holding the value RANK would return for each single character (the variable i) along with the character itself (displayed in whatever font your viewer uses).
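data chars;
  length character $ 1;
  /* i is the position in the collating sequence, i.e. what RANK(character) returns */
  do i = 127 to 255;
    character = byte(i);   /* BYTE returns the single character at position i */
    output;
  end;
run;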
The actual use would be a very short bit of code:
data want;
set have;
addressline1 = translate(addressline1,'<targetstring>','<searchstring>');
run;
The joy of TRANSLATE is that you have already decided what happens when those characters are encountered, so you don't need a message unless you really want one. If you were concerned, you could keep the base variable, write the recoded value to a new variable, and then compare the two to generate messages about likely issues in the base data.
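The compare idea could look something like the sketch below. It is only an illustration under some assumptions: the session uses a single-byte encoding (for example WLATIN1 or LATIN1), the variables address_clean and changed are hypothetical names, and the from/to lists are a short sample rather than a complete set.

```sas
data want;
  set have;
  /* Recode into a new variable so the base value stays untouched.     */
  /* address_clean is not declared beforehand, so it takes the length  */
  /* of addressline1. The 7 TO characters line up one-for-one with     */
  /* the 7 FROM characters.                                            */
  address_clean = translate(addressline1, 'AAAAAOo', 'ÀÁÂÃÄÖö');
  /* changed = 1 when at least one character was replaced */
  changed = (address_clean ne addressline1);
  if changed then put 'Recoded: ' _n_= addressline1= address_clean=;
run;
```

Subsetting on changed = 1 afterwards would also give the list of addresses that contained one of the listed characters, which is close to what the original question asked for.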