04-03-2013 08:23 PM
I ran into a problem and can't figure a way out. Your help is much appreciated.
Here is what I have:
Year sex Wage
2007 1 14
2007 2 12
2007 1 12
2007 2 13
I have data like that for many years (one file per year). My plan is to find the 10th, 50th, and 90th percentiles of wages by sex for each of those years, and then combine them by year to plot a graph showing how things have changed over the last 10 years.
Here is the program I have. I can find the percentiles by sex using PROC UNIVARIATE, but I haven't been able to find a way to keep the year in the output file so that I can combine the files later.
proc univariate data=hourly.out12hourly noprint;
class a_sex;
output out=sex.percsex12 p10=p10str median=p50str p90=p90str;
run;
But I can't find a way to keep the year in the output file. All help is appreciated. Thanks.
04-03-2013 09:14 PM
Can you add year to the CLASS statement? I'm not sure that works with PROC UNIVARIATE; if it doesn't, try PROC MEANS instead. The rest of the code stays the same, with only univariate changed to means.
proc univariate data=hourly.out12hourly noprint;
class year a_sex;
var wage; /* analysis variable; name assumed from the sample table */
output out=sex.percsex12 p10=p10str median=p50str p90=p90str;
run;
04-03-2013 10:00 PM
Why not do all years at once:
/* Merge the datasets */
data hourly;
set hourly.out10hourly hourly.out11hourly hourly.out12hourly;
run;
proc sort data=hourly; by sex year; run;
/* Get the percentiles */
proc univariate data=hourly;
by sex year;
var wage; /* analysis variable; name assumed from the sample table */
output out=perSexYear p10=p10str median=p50str p90=p90str;
run;
/* Plot each percentile over time, one line per sex */
proc sgplot data=perSexYear;
series x=year y=p10str / group=sex;
series x=year y=p50str / group=sex;
series x=year y=p90str / group=sex;
run;
04-04-2013 11:26 AM
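proc means data=out12hourly noprint nway nonobs;
class year a_sex;
var wage; /* analysis variable; name assumed from the sample table */
/* autoname builds the output names from the variable and statistic */
output out=medians p10= median= p90= / autoname;
run;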