jtheunis
Fluorite | Level 6

Hi,

 

In order to test the Gender Analysis function, I uploaded an export of my Facebook friends list to the CAS server. In Data Studio, I split the names into first name and last name and used the first name to determine the gender of my friends. I did not expect a 100% correct result (it's a mix of Belgian, French and other nationalities), but I am somewhat puzzled that all first names come back as 'U'. I switched the Locale to different values but always get the same result...

 

Update:

Using the full name, I get acceptable results. Can someone please explain to me why a full name is needed in order to determine the gender?

 

Using SAS Data Studio 2.2 on Viya 3.4.

 

Code snippet:

    "Gender"n = dqgender ("FirstName"n, "Name", "ENGBR");

 

Data Studio:

 

[Screenshot: gender_analysis.PNG]

 

Result in VA:

 

[Screenshot: list_gender.PNG]

 

Any idea?

 

Thanks

Accepted Solution
ShayneGrant
SAS Employee

The dqGender() function takes the input value and first attempts to parse it using the associated parse definition, in this case the parse definition named 'Name'. The parse definition needs a full name in order to determine what the tokens are (given name, last name, etc.). If you provide only a single word, the parse definition cannot properly determine which token the word represents, and the gender definition in turn cannot accurately determine the gender from a single token value. That is why it normally returns unknown ('U') for single-word inputs.

 

If you would like to provide just the first name to the gender definition, you can use the dqGenderParsed() function. This function takes a preparsed input and thus skips the parse step, which should give you the results you are looking for. Before calling it, you first need to put the input into what is called a delimited string. Here is a sample of how you would invoke dqGenderParsed(), assuming you have a variable named 'first_name' that contains the first names you want to run the gender analysis on:

 

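/* Assumes a character variable first_name exists in the DATA step. */
length delm_string $ 200;
length result $ 2;

/* Build a delimited (preparsed) string holding first_name as the 'Given Name' token. */
delm_string=dqParseTokenPut('', first_name, 'Given Name', 'Name');
/* Run gender analysis on the preparsed string; no parse is performed here. */
result=dqGenderParsed(delm_string, 'Name');
put result=;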
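
To see the difference described above side by side, here is a minimal sketch that runs the three variants on one record. It assumes the Quality Knowledge Base is already available to the session and that the 'ENGBR' locale from the original question is loaded; the sample values 'Marie Dupont' and 'Marie' and the variable names g_full, g_single and g_parsed are made up for illustration. It also assumes the 'Name' parse definition in that locale has a token called 'Given Name', as in the snippet above.

```sas
data _null_;
    length delm_string $ 200;
    length g_full g_single g_parsed $ 2;

    full_name  = 'Marie Dupont';  /* hypothetical sample value */
    first_name = 'Marie';         /* hypothetical sample value */

    /* 1. Full name: dqGender parses the input, then analyzes the tokens. */
    g_full = dqGender(full_name, 'Name', 'ENGBR');

    /* 2. Single word: the parse has little to work with, so this may come back as 'U'. */
    g_single = dqGender(first_name, 'Name', 'ENGBR');

    /* 3. Preparsed: assign the word to the 'Given Name' token and skip the parse. */
    delm_string = dqParseTokenPut('', first_name, 'Given Name', 'Name', 'ENGBR');
    g_parsed = dqGenderParsed(delm_string, 'Name', 'ENGBR');

    put g_full= g_single= g_parsed=;
run;
```

Passing the locale explicitly as the last argument is optional; if it is left out, as in the snippet above, the functions fall back to the session's default locale.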


VenuKadari
SAS Employee

You can incorporate what ShayneGrant posted by using the “Code” transform in SAS Data Studio. Copy and paste the following code. Your input table is the table that contains the FirstName variable.

 

/* BEGIN data step with the output table data */

data {{_dp_outputTable}} (caslib={{_dp_outputCaslib}} promote="no");

/* Set the input set */

set {{_dp_inputTable}} (caslib={{_dp_inputCaslib}} );

 

    drop delm_string;

    length result $ 300;

    length delm_string $ 300;

    delm_string=dqParseTokenPut('', FirstName, 'Given Name', 'Name');

    result=dqGenderParsed(delm_string, 'Name');

run;
