Contributor
Posts: 22

# Keeping Year variable while finding percentile of wages by sex (proc univariate)

Hey all,

I ran into a problem and can't figure a way out. Your help is much appreciated.

Here is what I have,

Year    sex    Wage
2007    1      14
2007    2      12
2007    1      12
2007    2      13

I have data like that for many years (in separate files). My plan is to find the 10th, 50th, and 90th percentiles of wages by sex for each of those years, and then combine the results by year to plot a graph showing how things have changed over the last 10 years.

Here is the program I have, and I can find the percentile by sex using PROC Univariate. But I haven't been able to find a way to keep the Year in the output file so that I can combine it later.

proc univariate data=hourly.out12hourly noprint;
class a_sex;
var log_hourly_annual;
output out=sex.percsex12 p10=p10str median=p50str p90=p90str;
run;

But I can't find a way to keep the year in the output file. All help is appreciated. Thanks.

Super User
Posts: 23,663

## Re: Keeping Year variable while finding percentile of wages by sex (proc univariate)

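Can you add year to the class statement? I'm not sure it works with univariate (as far as I know its class statement accepts at most two variables, which is enough here). If it doesn't, try proc means instead: the rest of the code stays the same, except that univariate changes to means. Note that the types statement belongs to proc means only, so it is commented out in the univariate version below.

proc univariate data=hourly.out12hourly noprint;
class year a_sex;
/* types year*a_sex; */ /* proc means only: restore this line if you switch to proc means */
var log_hourly_annual;
output out=sex.percsex12 p10=p10str median=p50str p90=p90str;
run;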
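
For reference, here is a minimal sketch of the proc means fallback described above. It assumes the same dataset and variable names as the question and that the input dataset contains a year variable. Besides the percentiles, proc means also writes the automatic variables _TYPE_ and _FREQ_ to the output dataset.

```sas
/* Sketch only: proc means variant of the same step.              */
/* Assumes hourly.out12hourly contains year, a_sex and the wage   */
/* variable log_hourly_annual, as in the original question.       */
proc means data=hourly.out12hourly noprint;
  class year a_sex;
  types year*a_sex;            /* keep only the year-by-sex combinations */
  var log_hourly_annual;
  output out=sex.percsex12 p10=p10str median=p50str p90=p90str;
run;
```

Because year is a class variable, it is carried into sex.percsex12, so the yearly output files can be stacked later with a set statement.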

Posts: 5,521

## Re: Keeping Year variable while finding percentile of wages by sex (proc univariate)

Why not do all years at once:

/* Stack the yearly datasets (set concatenates them; it does not merge) */
data hourly;
set hourly.out10hourly hourly.out11hourly hourly.out12hourly;
run;

proc sort data=hourly; by sex year; run;

/* Get the percentiles for each sex and year */
proc univariate data=hourly noprint;
by sex year;
var log_hourly_annual;
output out=perSexYear p10=p10str median=p50str p90=p90str;
run;

/* Plot each percentile over time, one line per sex */
proc sgplot data=perSexYear;
series x=year y=p10str / group=sex;
series x=year y=p50str / group=sex;
series x=year y=p90str / group=sex;
run;

PG

PROC Star
Posts: 1,307

## Re: Keeping Year variable while finding percentile of wages by sex (proc univariate)

Another option:

proc means data=out12hourly noprint nway nonobs;
var log_hourly_annual;
class year a_sex;
output out=medians p10()= median()= p90()= / autoname;
run;

Tom
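
With the autoname option, the output variables are named after the analysis variable plus the statistic, so here they should come out as log_hourly_annual_P10, log_hourly_annual_Median and log_hourly_annual_P90. The sketch below runs the same proc means step on a stacked multi-year dataset and plots the result. The dataset name all_years is hypothetical, and it is assumed to hold year, a_sex and log_hourly_annual for every year.

```sas
/* Sketch only: all_years is a hypothetical dataset holding every year. */
proc means data=all_years noprint nway;
  class year a_sex;
  var log_hourly_annual;
  output out=medians p10= median= p90= / autoname;
run;

/* Variable names below are the ones autoname is expected to generate. */
proc sgplot data=medians;
  series x=year y=log_hourly_annual_P10    / group=a_sex;
  series x=year y=log_hourly_annual_Median / group=a_sex;
  series x=year y=log_hourly_annual_P90    / group=a_sex;
run;
```

The nway option keeps only the rows for the full year-by-sex combination, so no types statement is needed.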
