DebbiBJ
Obsidian | Level 7

Hi,

I am trying to merge two SAS data sets. One has names with accent marks; the other has the same names, but unaccented. How can I get SAS to ignore the accents while merging? (I don't care if they're lost.) I've tried wading through the National Language Support documentation, but haven't found anything.

Thanks!

Accepted Solutions
PGStats
Opal | Level 21

As far as I know, there is no general function for dealing with accented characters. You can either create an intermediary dataset with a new match variable and use a DATA step merge, or use SQL and join on an expression.

You don't say what language your accented characters come from. A solution corresponding to the French language would be to match on a variable/expression like:

matchString = translate(lowcase(accentedString),"aaceeeeiiouu","àâçéèêëîïôùû");

PG
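
Editor's sketch of the two options described above, not PG's original code. The data sets `accented` and `plain`, their variables, and the sample names are hypothetical. It also assumes a single-byte session encoding such as Latin1: TRANSLATE and LOWCASE operate on bytes, so in a UTF-8 session a multibyte-aware function such as KTRANSLATE would likely be needed instead.

```sas
/* Hypothetical sample data; names and variables are illustrative only */
data accented;
  length name $ 20;
  input name $ score;
  datalines;
Hélène 10
François 20
Zoë 30
;
run;

data plain;
  length name $ 20;
  input name $ age;
  datalines;
Helene 41
Francois 35
Zoe 28
;
run;

/* Option 1: add an unaccented, lowercase match key to both data sets */
data accented_k;
  set accented;
  length matchString $ 20;
  matchString = translate(lowcase(name), "aaceeeeiiouu", "àâçéèêëîïôùû");
run;

data plain_k;
  set plain(rename=(name=plain_name));
  length matchString $ 20;
  matchString = lowcase(plain_name);
run;

proc sort data=accented_k; by matchString; run;
proc sort data=plain_k;    by matchString; run;

/* Merge on the key and keep only rows present in both data sets */
data merged;
  merge accented_k(in=a) plain_k(in=b);
  by matchString;
  if a and b;
run;

/* Option 2: join on the expression directly, with no intermediary data set */
proc sql;
  create table joined as
  select a.name, a.score, b.age
  from accented as a
       inner join plain as b
  on translate(lowcase(a.name), "aaceeeeiiouu", "àâçéèêëîïôùû")
     = lowcase(b.name);
quit;
```

The character lists only cover the French accents given above; names from other languages (for example ñ or ü) would need the two lists extended in parallel.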


6 REPLIES 6
DebbiBJ
Obsidian | Level 7

The dataset is in English, but because there are first and last names, the accents can come from various languages.  But I think they're mostly generic French/Spanish accents.  Thanks for the suggestion; I'll give it a try.

DebbiBJ
Obsidian | Level 7

That worked great!  (Once I found an extended character map)  Thanks so much!

scmebu
SAS Employee

/*
   Demonstrate that the DATA STEP is sensitive to linguistic
   collating sequences and this can be used to perform a merge
   that is insensitive to case or accents.

   Here, we're merging/joining two data sets, one containing
   monthly revenue with another containing a monthly count of
   customers, to calculate revenue per customer.
*/

data clients;
  length mois $ 10;
  infile datalines delimiter=',';
  input mois compte;
  datalines;
  janvier, 370
  février, 400
  mars, 430
  avril, 415
  mai, 410
  juin, 450
  juillet, 449
  août, 403
  septembre, 339
  novembre, 375
  décembre, 370
;
run;

data revenu;
  length mois $ 10;
  infile datalines delimiter=',';
  input mois ventes;
  datalines;
  JANVIER, 376784
  FEVRIER, 396911
  MARS, 441327
  AVRIL, 419272
  MAI, 408291
  JUIN, 443791
  JUILLET, 442111
  AOUT, 402771
  SEPTEMBRE, 337727
  NOVEMBRE, 381929
  DECEMBRE, 376771
;
run;

/* strength=1 compares base letters only, ignoring case and accents */
proc sort data=clients sortseq=linguistic(strength=1);
  by mois;
run;

proc sort data=revenu sortseq=linguistic(strength=1);
  by mois;
run;

data resultat;
  merge clients revenu;
  by mois;
  revenuparclient = ventes/compte;
run;

proc print data=resultat;
run;

DebbiBJ
Obsidian | Level 7

Thank you!  This alternate approach will also be useful to me in the future.

scmebu
SAS Employee

FYI, a similar approach can be taken with PROC SQL and a join using the SORTKEY function.
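
Editor's sketch of what such a join might look like, not code from the original post. It assumes the `clients` and `revenu` data sets created earlier in this thread. The locale `'fr_FR'` and the strength value `'P'` (primary, intended to mirror `strength=1`) are assumptions based on a reading of the SORTKEY documentation and should be checked against your SAS release.

```sas
/* Join on linguistic sort keys instead of the raw strings.        */
/* Assumption: 'fr_FR' locale; 'P' = primary strength, so that     */
/* differences in case and accents do not affect the comparison.   */
proc sql;
  create table resultat_sql as
  select c.mois,
         c.compte,
         r.ventes,
         r.ventes / c.compte as revenuparclient
  from clients as c
       inner join revenu as r
  on sortkey(c.mois, 'fr_FR', 'P') = sortkey(r.mois, 'fr_FR', 'P');
quit;
```

Unlike the DATA step merge, this does not require either data set to be sorted first, and `mois` in the result keeps the accented spelling from `clients`.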
