DebbiBJ
Obsidian | Level 7

Hi,

I am trying to merge two SAS files. One has names with accent marks; the other has the same names without accents. How can I get SAS to ignore the accents while merging? (I don't care if they're lost.) I've tried wading through the National Language Support documentation, but am not finding anything.

Thanks!

1 ACCEPTED SOLUTION

PGStats
Opal | Level 21

As far as I know, there is no general function for dealing with accented characters. You can either create an intermediate data set with a new match variable and use a DATA step merge, or use SQL and join on an expression.

You don't say what language your accented characters come from. A solution corresponding to the French language would be to match on a variable/expression like:

matchString = translate(lowcase(accentedString),"aaceeeeiiouu","àâçéèêëîïôùû");
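Both routes mentioned above could look roughly like the sketch below. The data set names ACCENTED and PLAIN, the variable NAME, and the length of 50 are hypothetical placeholders, not anything from the original question. The sketch also assumes a single-byte session encoding (for example Latin1); TRANSLATE works byte by byte, so in a UTF-8 session a multibyte-aware function such as KTRANSLATE would probably be needed instead.

```sas
/* Hypothetical inputs: WORK.ACCENTED and WORK.PLAIN, each with a    */
/* character variable NAME. Assumes a single-byte session encoding.  */

/* Route 1: build a match key in each data set, sort, then merge. */
data accented_key;
  set accented;
  length matchString $ 50;   /* assumed length, adjust to the data */
  matchString = translate(lowcase(name), 'aaceeeeiiouu', 'àâçéèêëîïôùû');
run;

data plain_key;
  set plain;
  length matchString $ 50;
  matchString = lowcase(name);
run;

proc sort data=accented_key;
  by matchString;
run;

proc sort data=plain_key;
  by matchString;
run;

data merged;
  /* Rename NAME on each side so both spellings survive the merge. */
  merge accented_key(in=inA rename=(name=accentedName))
        plain_key(in=inB rename=(name=plainName));
  by matchString;
  if inA and inB;            /* keep only names found in both */
run;

/* Route 2: skip the intermediate data sets and join on the expression. */
proc sql;
  create table joined as
  select a.name as accentedName,
         b.name as plainName
  from accented as a
       inner join plain as b
       on translate(lowcase(a.name), 'aaceeeeiiouu', 'àâçéèêëîïôùû')
          = lowcase(b.name);
quit;
```

Route 1 leaves a reusable key variable in the data; Route 2 is shorter but recomputes the expression for the join. If names from other languages are involved, the two character lists would have to be extended in parallel so that they stay the same length.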

PG

DebbiBJ
Obsidian | Level 7

The dataset is in English, but because there are first and last names, the accents can come from various languages.  But I think they're mostly generic French/Spanish accents.  Thanks for the suggestion; I'll give it a try.

DebbiBJ
Obsidian | Level 7

That worked great!  (Once I found an extended character map)  Thanks so much!

scmebu
SAS Employee

/*
   Demonstrate that the DATA STEP is sensitive to linguistic
   collating sequences and this can be used to perform a merge
   that is insensitive to case or accents.

   Here, we're merging/joining two data sets, one containing
   monthly revenue with another containing a monthly count of
   customers, to calculate revenue per customer.
*/

data clients;
  length mois $ 10;
  infile datalines delimiter=',';
  input mois compte;
  datalines;
janvier, 370
février, 400
mars, 430
avril, 415
mai, 410
juin, 450
juillet, 449
août, 403
septembre, 339
novembre, 375
décembre, 370
;
run;

data revenu;
  length mois $ 10;
  infile datalines delimiter=',';
  input mois ventes;
  datalines;
JANVIER, 376784
FEVRIER, 396911
MARS, 441327
AVRIL, 419272
MAI, 408291
JUIN, 443791
JUILLET, 442111
AOUT, 402771
SEPTEMBRE, 337727
NOVEMBRE, 381929
DECEMBRE, 376771
;
run;

proc sort data=clients sortseq=linguistic(strength=1);
  by mois;
run;

proc sort data=revenu sortseq=linguistic(strength=1);
  by mois;
run;

data resultat;
  merge clients revenu;
  by mois;
  revenuparclient = ventes/compte;
run;

proc print;
run;

DebbiBJ
Obsidian | Level 7

Thank you!  This alternate approach will also be useful to me in the future.

scmebu
SAS Employee

FYI, a similar approach can be taken with PROC SQL and a join using the SORTKEY function.
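
A minimal sketch of that idea, reusing the CLIENTS and REVENU data sets from the earlier post. The output table name RESULTAT_SQL, the 'fr_FR' locale, and the use of STRIP are assumptions made for illustration, and the SORTKEY argument order (string, locale, strength) with 'P' for primary strength is how I understand the NLS reference, so please check it against the documentation for your release.

```sas
/* Join on linguistic sort keys instead of the raw strings.          */
/* At primary strength, case and accent differences are ignored, so  */
/* 'février' and 'FEVRIER' should produce the same key.              */
proc sql;
  create table resultat_sql as   /* hypothetical output name */
  select c.mois,
         c.compte,
         r.ventes,
         r.ventes / c.compte as revenuparclient
  from clients as c
       inner join revenu as r
       /* STRIP removes padding blanks so they cannot affect the key */
       on sortkey(strip(c.mois), 'fr_FR', 'P')
          = sortkey(strip(r.mois), 'fr_FR', 'P');
quit;
```

Unlike the DATA step merge, this needs no prior PROC SORT, at the cost of computing a sort key for every row on both sides of the join.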
