ansepans
Calcite | Level 5

I need to create a gender variable from a Danish social security number. I am using SAS University edition for mac.

 

I want my data to look like this:

 

social security number      gender
xxxxxx-xxx7                 1
xxxxxx-xxx4                 0
xxxxxx-xxx3                 1
xxxxxx-xxx2                 0
xxxxxx-xxx1                 1
xxxxxx-xxx8                 0

 

When the last digit is an even number, the person is female; when it is an odd number, the person is male.

 

I have used PROC FORMAT to try to create a new variable, but that's all I have been able to do.

 

proc format
value gender
0="Female"
1="Male";
run;

Thanks!

Accepted Solutions
PeterClemmensen
Tourmaline | Level 20

Like this?

 

data have;
input cpr $20.;
datalines;
280279-1667
280279-1668
280279-1667
;

data want;
	set have;
	/* The last character of the CPR number decides gender: even = female (0), odd = male (1) */
	if substr(cpr, length(cpr), 1) in ("0","2","4","6","8") then gender=0;
	else gender=1;
run;


4 REPLIES
PeterClemmensen
Tourmaline | Level 20

You can apply the format from your PROC FORMAT step afterwards, but beware that you are missing a semicolon after the PROC FORMAT statement. It should be:

 

proc format;
value gender
0="Female"
1="Male";
run;
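
As a minimal sketch of what applying the format afterwards could look like: the example below rebuilds a small `want` data set with the same logic as the accepted solution, then attaches the `gender.` format in PROC PRINT. The two sample CPR numbers and the choice of PROC PRINT are assumptions for illustration only.

```sas
/* Define the display labels for the 0/1 codes. */
proc format;
  value gender
    0="Female"
    1="Male";
run;

/* Derive the numeric gender code from the last character of the CPR number. */
data want;
  input cpr $11.;
  if substr(cpr, length(cpr), 1) in ("0","2","4","6","8") then gender=0;
  else gender=1;
datalines;
280279-1667
280279-1668
;

/* The format changes only how gender is displayed; the stored values stay 0 and 1. */
proc print data=want;
  format gender gender.;
run;
```

With this input, the first row (last digit 7) should print as Male and the second (last digit 8) as Female.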
mkeintz
PROC Star

PROC FORMAT does NOT create a new variable. It only controls how a variable's values are displayed.

 

In your case, you need to create a new variable to which your new format can be applied. Assuming that the Danish social security number is a character variable, you can just retrieve the 11th character with the CHAR function, convert it to a numeric value, and then find the remainder after division by 2 (0 for an even digit, 1 for an odd digit):

 

proc format;
  value gender
    0="Female"
    1="Male";
run;

data t;
  input ssn $11.;
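  g=input(char(ssn,11),1.);   /* 11th character, converted to a number */
  g=mod(g,2);                 /* remainder after division by 2: 0=even, 1=odd */
  format g gender.;           /* display 0/1 as Female/Male */
  put (_all_) (=);            /* write all variables to the log */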
datalines;
xxxxxx-xxx7
xxxxxx-xxx4
xxxxxx-xxx3
xxxxxx-xxx2
xxxxxx-xxx1
xxxxxx-xxx8
run;
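
If the stored values are not always exactly 11 characters long (for example, numbers entered without the hyphen), a slightly more defensive variant of the same idea could take the last non-blank character instead of position 11 and leave the result missing when that character is not a digit. This is only a sketch: the data set name `t2`, the helper variable `lastchar`, and the sample values are made up for illustration.

```sas
data t2;
  input ssn $11.;
  /* Last non-blank character, wherever it falls in the string. */
  lastchar = char(ssn, length(ssn));
  /* ANYDIGIT returns 0 when the character is not a digit. */
  if anydigit(lastchar) then g = mod(input(lastchar, 1.), 2);
  else g = .;   /* no usable digit: leave the gender code missing */
  drop lastchar;
datalines;
280279-1667
2802791668
xxxxxx-xxxx
;
```

Here the first value should give g=1, the second g=0, and the third a missing value, without an invalid-data note in the log.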
