parmis
Fluorite | Level 6

Hello,

I have a dataset similar to the following that contains a text variable (a single word or phrase). The strings are in either English or French.

Is there a way to flag the English words?

data list;
input name $20.;
datalines;
Côté
Boucher
Fournier
Cats
how to register
morning
Thibeault
Martin
Vaudron
Girard
Hello
;
run;

Thank you!

4 REPLIES
art297
Opal | Level 21

This may not be possible with just words out of context, but you could try incorporating Python. Take a look at: https://www.probytes.net/blog/python-language-detection/

Art, CEO, AnalystFinder.com

Ksharp
Super User
data list;
input name $20.;
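/* COMPRESS with 'ka' keeps only alphabetic characters; flag=1 if any of them is outside a-z (e.g. an accented letter). */
/* Assumes the session encoding treats accented letters as alphabetic; flag=0 does not prove the word is English. */
flag=prxmatch('/[^a-z]/i',compress(name,,'ka'))>0;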
datalines;
Côté
Boucher
Fournier
Cats
how to register
morning
Thibeault
Martin
Vaudron
Girard
Hello
;
run;
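
For anyone who wants to see what that flag does without a SAS session, here is a rough Python sketch of the same idea. It is not Ksharp's code: the function name flag_non_ascii_letters is made up for illustration, and str.isalpha() is only an approximation of COMPRESS with the 'ka' modifiers.

```python
import re
import unicodedata

def flag_non_ascii_letters(name):
    """Return 1 if name contains a letter outside a-z/A-Z, else 0."""
    # Compose accents into single characters so they count as letters.
    name = unicodedata.normalize("NFC", name)
    # Keep only alphabetic characters (rough stand-in for compress(name,,'ka')).
    letters = "".join(ch for ch in name if ch.isalpha())
    # Any remaining character outside a-z (case-insensitive) is an accented
    # or otherwise non-English letter.
    return 1 if re.search(r"[^a-z]", letters, re.IGNORECASE) else 0

names = ["Côté", "Boucher", "Fournier", "Cats", "how to register", "morning",
         "Thibeault", "Martin", "Vaudron", "Girard", "Hello"]
flagged = [n for n in names if flag_non_ascii_letters(n)]
print(flagged)  # ['Côté']
```

On the sample data only Côté is flagged, so this catches accented French words but leaves unaccented ones (Boucher, Girard) looking the same as English.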

ballardw
Super User

My French is pretty rusty, but I do remember that a moderate number of nouns are the same in both French and English.

So without articles (the / a, or le / la / les / un / une) or a similar clue, those are going to be very problematic.

Some adjectives, grand for example, are going to be worse.

I would hesitate to assign any name to a specific language, as French and English have been interacting for so long that names go both ways (and spellings get butchered).
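
To make the point about articles concrete, here is a minimal Python sketch that looks only for the articles listed above. The function name guess_by_article and the 'en' / 'fr' / 'unknown' labels are made up for illustration, and the word lists are just the handful mentioned here, not a real lexicon.

```python
# Articles taken from the reply above; a real lexicon would be far larger.
ENGLISH_ARTICLES = {"the", "a"}
FRENCH_ARTICLES = {"le", "la", "les", "un", "une"}

def guess_by_article(phrase):
    """Return 'en', 'fr', or 'unknown' based only on articles in the phrase."""
    words = phrase.lower().split()
    has_en = any(w in ENGLISH_ARTICLES for w in words)
    has_fr = any(w in FRENCH_ARTICLES for w in words)
    if has_en and not has_fr:
        return "en"
    if has_fr and not has_en:
        return "fr"
    return "unknown"  # no article, or conflicting clues

for phrase in ["the table", "la table", "table", "grand"]:
    print(phrase, "->", guess_by_article(phrase))
```

With an article the guess is easy; a bare word such as "table" or "grand" comes back as unknown, which is the situation for most rows in the sample data.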

Sundaresh1
SAS Super FREQ

Hi @parmis ,

I know this answer comes two years late :), but I thought you might still get some benefit from it, knowledge at the least. In January of this year, SAS released a language identification action as part of its Viya platform. Here are details on how it works:

https://go.documentation.sas.com/doc/en/sasstudiocdc/v_009/pgmsascdc/casanpg/cas-textmanagement-iden...

regards,

Sundaresh 
