jo_
Calcite | Level 5

Hello, 

I would like to calculate the percentage of men and women with a salary below each decile (<p10, <p20...). So I have a binary variable gender, a variable for salaries (from 1 to 20'000).

 

I tried to separate my database into 10 equal parts with proc rank.

 

 

proc rank data=database groups=10 descending out=ranked;
var wages;
ranks decile;
run;

 

 

Then I sorted it by gender and did a proc freq for the first decile (the idea is to repeat this for each decile).

 

proc sort data=ranked out=ranked_sort;
   by sex;
run;

proc freq data=ranked_sort;
   where decile = 9;   * p10 ;
   table wages*sex;
run;

I get the total percentage for men and for women, but the output also lists every individual wage value (1-3000) in the decile. I would like to have only the total for the decile.

 

 

Does anyone have an idea how to do this?

 

Thanks in advance, regards, 

Jo

 

ACCEPTED SOLUTION
Rick_SAS
SAS Super FREQ

If there are no tied values, then, by definition, you will get 10% of employees below the first decile, 20% below the second decile, and so forth. So I guess you are trying to see whether males and females differ in their proportions? Another option is to look at the empirical distribution curves separately for males and females. Comparing the two curves tells you whether the distribution of salaries differs between genders:

 

data Have;
set sashelp.heart;
keep Sex Cholesterol;
run;

proc univariate data=Have;
class Sex;
var Cholesterol;
cdfplot Cholesterol;
run;

But if you want to use the PROC RANK and PROC FREQ idea, see if this helps:


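/* The Have data set above contains Cholesterol, not Wages;
   use your own data set and Wages for the salary data.
   With DESCENDING, decile=0 holds the largest values and decile=9 the smallest. */
proc rank data=Have groups=10 descending out=ranked;
   var Cholesterol;
   ranks decile;
run;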
 
proc freq data= ranked;
table sex*decile / list out=ListOut;	
run;

proc means data=ListOut Sum;
class Sex;
var Count;
run;

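/* Per-sex totals reported by PROC MEANS above, used as denominators below:
Female Sum=2873
Male   Sum=2336
*/
data Want;
   set ListOut;
   /* Prop = share of each sex that falls in the given decile group */
   if Sex='Female' then Prop = Count / 2873;
   else Prop = Count / 2336;
run;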

proc print data=Want;
run;
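
The hard-coded totals in the last step can be avoided. As a minimal sketch, assuming the same sashelp.heart stand-in data and Cholesterol in place of wages, the OUTPCT option of the TABLES statement writes row and column percentages to the OUT= data set, so PCT_ROW gives the percentage of each sex in each decile group directly. The data set name PctOut is made up for this example, and observations with a missing Cholesterol value are excluded here, which is an assumption that changes the denominators slightly compared with the totals above.

```sas
/* Rank Cholesterol into 10 groups; with DESCENDING, group 0 holds the largest values */
proc rank data=sashelp.heart groups=10 descending out=ranked;
   var Cholesterol;
   ranks decile;
run;

/* OUTPCT adds PCT_ROW and PCT_COL to the OUT= data set.
   Rows are Sex, so PCT_ROW is the percentage within each sex. */
proc freq data=ranked noprint;
   where not missing(decile);   /* assumption: drop unranked (missing) values */
   tables Sex*decile / out=PctOut outpct;
run;

proc print data=PctOut noobs;
   var Sex decile Count Pct_Row;
run;
```

For the salary data, the same steps would use the wage variable in the VAR statement of PROC RANK.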


4 REPLIES
PaigeMiller
Diamond | Level 26

I would like to calculate the percentage of men and women with a salary below each decile (<p10, <p20...). So I have a binary variable gender, a variable for salaries (from 1 to 20'000).

 

Isn't the percent less than p10 equal to 10 percent? Isn't the percent less than p20 equal to 20 percent?

--
Paige Miller
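
The overall share below each decile is fixed by construction, but the share within each sex is not. A minimal sketch of the cumulative version, assuming sashelp.heart and Cholesterol as stand-ins for the salary data: the ranking is ascending here (no DESCENDING), so group 0 holds the smallest values, and the OUTCUM option adds the cumulative percentage CUM_PCT to the OUT= data set for each BY group. The data set names ranked_asc, ranked_by_sex and CumBySex are made up for this example.

```sas
/* Ascending ranks: decile=0 is the lowest tenth, decile=9 the highest */
proc rank data=sashelp.heart groups=10 out=ranked_asc;
   var Cholesterol;
   ranks decile;
run;

/* BY-group processing requires the data to be sorted by Sex */
proc sort data=ranked_asc out=ranked_by_sex;
   by Sex;
run;

/* OUTCUM adds CUM_FREQ and CUM_PCT to the OUT= data set for one-way tables */
proc freq data=ranked_by_sex noprint;
   by Sex;
   where not missing(decile);   /* assumption: drop unranked (missing) values */
   tables decile / outcum out=CumBySex;
run;

/* Cum_Pct for decile=d is the percentage of that sex in groups 0 through d */
proc print data=CumBySex noobs;
   var Sex decile Count Cum_Pct;
run;
```

If the two sexes had the same distribution, Cum_Pct would be close to 10, 20, 30 and so on for both; a gap between the two columns shows where they diverge.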
jo_
Calcite | Level 5
Thanks for the comment. Of course you are right, but what I meant is that I am trying to see whether the proportions of men and women differ in each decile.
For p10: X% of men and Y% of women (=100%).
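
A split that sums to 100% within each decile corresponds to the column percentages of a Sex by decile table. As a minimal sketch, assuming sashelp.heart and Cholesterol as stand-ins for the salary data, the NOFREQ, NOPERCENT and NOROW options suppress everything in the crosstabulation except the column percentages:

```sas
/* Rank into 10 groups; with DESCENDING, decile=9 is the lowest tenth */
proc rank data=sashelp.heart groups=10 descending out=ranked;
   var Cholesterol;
   ranks decile;
run;

/* Each column (decile) shows the percentage of females and males,
   and the two percentages in a column add up to 100 */
proc freq data=ranked;
   where not missing(decile);   /* assumption: drop unranked (missing) values */
   tables Sex*decile / nofreq nopercent norow;
run;
```

Because the table is built on the decile variable rather than on the raw values, there is one column per decile and no per-wage detail.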
jo_
Calcite | Level 5
Thank you Rick_SAS for the solution.

As you supposed, I am trying to look at whether males/females differ in their proportions. The second part of your answer with PROC RANK and PROC FREQ is exactly what I need.

Thank you for your help.
Regards,
Jo
