jtheunis
Fluorite | Level 6

Hi,

 

To test the Gender Analysis function, I uploaded an export of my Facebook friends list to the CAS server. In Data Studio, I split the names into first name and last name and used the first name to determine the gender of my friends. I did not expect a 100% correct result (it's a mix of Belgian, French & other nationalities), but I am somewhat puzzled that every first name comes back as 'U'. I switched the Locale to different values but always get the same result...

 

Update:

Using the full name, I get acceptable results. Can someone please explain to me why a full name is needed in order to determine the gender?

 

Using SAS Data Studio 2.2 on Viya 3.4.

 

Code snippet:

    "Gender"n = dqgender ("FirstName"n, "Name", "ENGBR");

 

Data Studio:

 

[Screenshot: gender_analysis.PNG]

 

Result in VA:

 

[Screenshot: list_gender.PNG]

 

Any idea?

 

Thanks

Accepted Solution

ShayneGrant
SAS Employee

The dqGender() function takes the input value and first attempts to parse it using the associated parse definition, in this case the parse definition named 'Name'. That parse definition needs a full name in order to work out which token each word represents (given name, last name, etc.). If you provide only a single word, it cannot reliably determine which token the word is, and the gender definition in turn cannot accurately determine the gender from a single token value. That is why single-word inputs normally return unknown ('U').
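
One way to see this for yourself is to run the parse step on its own and look at the tokens it produces. The following is only a sketch: it assumes a QKB with the ENGBR locale is available to the session (in SAS 9 that typically means loading it with %DQLOAD first), that the 'Name' parse definition exposes a 'Family Name' token alongside 'Given Name', and the sample names are made up.

```sas
/* Sketch: inspect how the 'Name' parse definition tokenizes its input. */
/* Assumes the ENGBR locale is loaded; 'Family Name' is an assumed token name. */
data _null_;
    infile datalines truncover;
    length name $ 60 parsed $ 200 given family $ 40;
    input name $60.;

    /* Parse the raw value into a delimited string of tokens */
    parsed = dqParse(name, 'Name', 'ENGBR');

    /* Pull individual tokens back out of the delimited string */
    given  = dqParseTokenGet(parsed, 'Given Name',  'Name', 'ENGBR');
    family = dqParseTokenGet(parsed, 'Family Name', 'Name', 'ENGBR');

    put name= given= family=;
    datalines;
Marie
Marie Dupont
;
run;
```

Comparing the log lines for the one-word and two-word inputs shows which token, if any, the lone first name was assigned to.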

 

If you would like to provide just the first name to the gender definition, you can use the dqGenderParsed() function. This function takes pre-parsed input and therefore skips the parse step, which should give you the results you are looking for. The input first has to be put into what is called a delimited string. Here is a sample of how you would invoke dqGenderParsed(), assuming you have a variable named 'first_name' that contains the first names you want to analyze:

 

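/* Delimited string that holds the pre-parsed tokens */
length delm_string $ 200;
/* Gender code returned by the gender definition */
length result $ 2;

/* Store first_name as the 'Given Name' token of the 'Name' parse definition */
delm_string=dqParseTokenPut('', first_name, 'Given Name', 'Name');
/* Run the gender analysis on the pre-parsed string, skipping the parse step */
result=dqGenderParsed(delm_string, 'Name');
put result=;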
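
To compare the approaches side by side, something like the following could be run as a complete DATA step. It is a sketch under assumptions, not a tested recipe: the table name work.gender_check, the sample names, and the explicit 'ENGBR' locale argument (taken from the original question) are all illustrative, and the locale must be available to the session.

```sas
/* Sketch: compare dqGender() on full and first names with dqGenderParsed(). */
/* Sample rows and the output table name are hypothetical. */
data work.gender_check;
    infile datalines dsd truncover;
    length first_name last_name $ 40 full_name $ 81
           delm_string $ 200
           g_full g_first_raw g_first_parsed $ 2;
    input first_name last_name;

    full_name = catx(' ', first_name, last_name);

    /* Full name: parsed by the 'Name' definition, then analyzed */
    g_full = dqGender(full_name, 'Name', 'ENGBR');

    /* First name only, still going through the parse step */
    g_first_raw = dqGender(first_name, 'Name', 'ENGBR');

    /* First name placed directly into the 'Given Name' token */
    delm_string = dqParseTokenPut('', first_name, 'Given Name', 'Name', 'ENGBR');
    g_first_parsed = dqGenderParsed(delm_string, 'Name', 'ENGBR');

    drop delm_string;
    datalines;
Marie,Dupont
Jan,Peeters
;
run;

proc print data=work.gender_check;
run;
```

The three result columns make it easy to check whether the pre-parsed route gives the same answer as the full name for your own data.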


VenuKadari
SAS Employee

Incorporate what ShayneGrant posted by using the “Code” transform in SAS Data Studio. Cut and paste the following code. Your input table is the table that contains the FirstName variable.

 

/* BEGIN data step with the output table data */
data {{_dp_outputTable}} (caslib={{_dp_outputCaslib}} promote="no");
    /* Set the input set */
    set {{_dp_inputTable}} (caslib={{_dp_inputCaslib}} );

    drop delm_string;
    length result $ 300;
    length delm_string $ 300;
    delm_string=dqParseTokenPut('', FirstName, 'Given Name', 'Name');
    result=dqGenderParsed(delm_string, 'Name');
run;
