Calcite | Level 5

percentage by gender of wages below each decile

Hello,

I would like to calculate the percentage of men and women with a salary below each decile (<p10, <p20, ...). I have a binary gender variable and a salary variable (ranging from 1 to 20'000).

I tried to separate my database into 10 equal parts with proc rank.

```
proc rank data=database groups=10 descending out=ranked;
var wages;
ranks decile;
run;
```

Then I sorted it by gender and ran PROC FREQ for the first decile (the idea is to repeat this for each decile).

```
proc sort data=ranked out=ranked_sort;
by sex;
run;

proc freq data=ranked_sort;
where decile = 9;	* p10 ;
table wages*sex;
run;
```

I get a total percentage for men and for women, but the table also lists every individual wage value (1-3000) in the decile. I would like to have only the totals for the decile.

Does anyone have an idea how to do this?

Jo

1 ACCEPTED SOLUTION

SAS Super FREQ

Re: percentage by gender of wages below each decile

If there are no tied values, then, by definition, 10% of employees fall below the first decile, 20% below the second decile, and so forth. So I guess you are trying to see whether males and females differ in their proportions? If so, another option is to look at the empirical distribution curves separately for males and females. If the two curves differ, the distribution of salaries differs between genders:

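```
/* example data: Cholesterol plays the role of the wage variable */
data Have;
set sashelp.heart;
keep Sex Cholesterol;
run;

/* one empirical CDF plot per level of Sex */
proc univariate data=Have;
class Sex;
var Cholesterol;
cdfplot Cholesterol;
run;
```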
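
If you also want a formal comparison to go with the plot, here is a minimal sketch using PROC NPAR1WAY. This is an addition to the approach above, not part of it. It assumes the same example data set Have, where Cholesterol stands in for your wage variable. The EDF option requests the two-sample Kolmogorov-Smirnov test, and PLOTS=EDFPLOT needs ODS Graphics to be enabled.

```sas
/* Sketch only: Have and Cholesterol come from the example above and
   stand in for your own data set and wage variable. */
ods graphics on;

proc npar1way data=Have edf plots=edfplot;
class Sex;          /* the two groups to compare */
var Cholesterol;    /* the variable whose distributions are compared */
run;
```

A small p-value for the Kolmogorov-Smirnov statistic suggests that the two distributions differ, but it does not say at which deciles they differ; the plot or the decile table is still needed for that.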

But if you want to use the PROC RANK and PROC FREQ idea, see if this helps:

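```
/* Cholesterol stands in for the wage variable in the example data */
proc rank data=Have groups=10 descending out=ranked;
var Cholesterol;
ranks decile;
run;

/* count of each sex in each decile group */
proc freq data=ranked;
table sex*decile / list out=ListOut;
run;

/* total count for each sex */
proc means data=ListOut Sum;
class Sex;
var Count;
run;

/*
Female Sum=2873
Male   Sum=2336
*/
/* proportion of each sex that falls in each decile group */
data Want;
set ListOut;
if Sex='Female' then Prop = Count / 2873;
else Prop = Count / 2336;
run;

proc print data=Want;
run;
```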
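
A possible variation, sketched under some assumptions: the OUTPCT option of the TABLES statement adds the row and column percentages (PCT_ROW, PCT_COL) to the OUT= data set, so the totals do not have to be typed in by hand. The data set names ranked_asc, PctOut, and WantCum and the variable CumPct are made up for illustration. The sketch assumes the example data set Have from above (Cholesterol in place of wages), drops missing values, and ranks in ascending order so that decile 0 is the lowest group.

```sas
/* Sketch only: names other than Have, Sex, and Cholesterol are illustrative. */
proc rank data=Have(where=(Cholesterol is not missing))
          groups=10 out=ranked_asc;     /* ascending: decile 0 = lowest group */
var Cholesterol;
ranks decile;
run;

proc freq data=ranked_asc noprint;
table Sex*decile / out=PctOut outpct;   /* adds PCT_ROW and PCT_COL */
run;

proc sort data=PctOut;
by Sex decile;
run;

data WantCum;
set PctOut;
by Sex;
if first.Sex then CumPct = 0;   /* restart the running total for each sex */
CumPct + PCT_ROW;               /* % of this sex in this or a lower decile group */
keep Sex decile Count PCT_ROW PCT_COL CumPct;
run;

proc print data=WantCum;
run;
```

Here PCT_ROW is the percentage of each sex that falls in a decile group, PCT_COL is the share of each sex within a decile group, and CumPct accumulates PCT_ROW from the lowest group upward. With tied values the groups will not contain exactly 10% of the observations.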
4 REPLIES
Diamond | Level 26

Re: percentage by gender of wages below each decile

I would like to calculate the percentage of men and women with a salary below each decile (<p10, <p20, ...). I have a binary gender variable and a salary variable (ranging from 1 to 20'000).

Isn't the percent less than p10 equal to 10 percent? Isn't the percent less than p20 equal to 20 percent?

--
Paige Miller
Calcite | Level 5

Re: percentage by gender of wages below each decile

Thanks for the comment. Of course you are right, but I meant that I am trying to see whether the proportions of men and women differ in each decile.
For p10: X% of men and Y% of women (=100%).
Calcite | Level 5

Re: percentage by gender of wages below each decile

Thank you Rick_SAS for the solution.

As you supposed, I am trying to look at whether males/females differ in their proportions. The second part of your answer with PROC RANK and PROC FREQ is exactly what I need.

Thank you for your help.
Regards,
Jo