SASPhile
Quartz | Level 8

Hi,

  How would we select all the address fields in a dataset that contain foreign symbols (not the regular symbols like @, #, $), for example French or German characters?

 

6 REPLIES 6
LinusH
Tourmaline | Level 20

Can you be more specific? Which characters are OK, and which are not?

Data never sleeps
SASPhile
Quartz | Level 8

This is for addresses in the US:

Characters like . , - space & are allowed.

French accents are not allowed.

German characters like ö are not allowed.

ballardw
Super User

The easiest way would be to build a TRANSLATE function call in a data step. The fun part is getting the correct codes, as your editor font may not match the font you are used to looking at, not to mention potential Unicode or other encoding issues.

Something that might look like this:

string = translate(string,'AAAAA','ÀÁÂÃÄ');

You really want to look at the documentation for TRANSLATE, as it is a positional replacement and the target and source strings need to match carefully. Plus, the order of the parameters seems backwards to most people I've discussed this with.
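As a rough illustration of the positional matching described above, here is a minimal sketch. It assumes a single-byte session encoding (such as latin1 or wlatin1); the sample value is made up for the example.

```sas
/* Sketch only: assumes a single-byte session encoding (e.g. latin1/wlatin1). */
data _null_;
   length string $ 20;
   string = 'ÀBC ÄD';                  /* made-up sample value */
   /* Argument order is TRANSLATE(source, to, from):            */
   /* the 1st character of "from" is replaced by the 1st        */
   /* character of "to", the 2nd by the 2nd, and so on.         */
   string = translate(string, 'AAAAA', 'ÀÁÂÃÄ');
   put string=;                        /* write the result to the log */
run;
```

Keeping the "to" and "from" strings the same length makes it easier to see which character maps to which.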

 

If the input data is straight ASCII or EBCDIC, then this code will build a data set holding the value that RANK returns for each single character (the variable i) alongside the character itself (displayed in the viewer's default font).

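data chars;
   length character $ 1;
   /* i is the code that RANK would return for the character */
   do i = 127 to 255;
      character = byte(i);   /* BYTE returns the single character with code i */
      output;
   end;
run;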
SASPhile
Quartz | Level 8
That's really cool. But how is it matched against the dataset to see whether any of the characters are present?
ballardw
Super User

The use would be in a very stubby bit of code.

data want;
   set have;
   addressline1 = translate(addressline1,'<targetstring>','<searchstring>');
run;

The joy of TRANSLATE is that you have already decided what will be done when those characters are encountered, so you don't need a message unless you really want one. If you were concerned, you could start with a base variable and create a recoded value, then compare the two to generate messages about likely issues in the base data.
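The compare-the-two idea could be sketched roughly as follows. The names recoded and flagged are hypothetical, the character lists are only a small sample to be extended, and a single-byte session encoding is assumed.

```sas
/* Sketch only: RECODED and FLAGGED are hypothetical names;       */
/* HAVE and ADDRESSLINE1 follow the code above.                    */
/* Assumes a single-byte session encoding (e.g. latin1/wlatin1).   */
data want flagged;
   set have;
   /* Keep the base variable untouched and recode into a new one.  */
   /* RECODED takes its length from ADDRESSLINE1.                   */
   recoded = translate(addressline1, 'AAAAAo', 'ÀÁÂÃÄö');
   /* Any difference means at least one listed character was found */
   if recoded ne addressline1 then do;
      put 'Accented character found: ' _n_= addressline1=;
      output flagged;
   end;
   output want;
run;
```

Rows that differ land in flagged for review, while want keeps every row with both the original and the recoded value.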

 

LinusH
Tourmaline | Level 20
Normalisation of US addresses is available within the DataFlux data quality product.
Data never sleeps
