In this inline view, I'm grouping by Department, but I want to use a calculation (co_avg_salary) which I do not want grouped. avg_salary will be grouped by Department and then divided by the co_avg_salary (the average salary for the whole company). How do I co_avg_salary without grouping it?
proc sql;
select Department, Employees, avg_salary, pct_salary, pay_level
from (select Department,(count(Employee_ID)) as Employees,
mean(salary) as avg_salary 'Average Salary' format=dollar9.,
(mean(salary)/calculated Employees) as co_avg_salary,
(calculated avg_salary/calculated co_avg_salary) as pct_salary 'Dept vs. Overall' format=percent8.1,
case
when (calculated avg_salary > calculated co_avg_salary)
then 'Above Avg.'
else 'Avg. or Below'
end as pay_level 'Pay Level'
from costs.payroll
where Department ne 'Executive' and Employees > 7
group by Department) as job;
quit;
Thanks,
Mike
Why not join to a subquery instead? Not sure what join (join, cross join) will work, but the following is a start I believe (untested).
proc sql;
create table want as
select Department,(count(Employee_ID)) as Employees,
mean(salary) as avg_salary 'Average Salary' format=dollar9.,
(mean(salary)/calculated Employees) as co_avg_salary,
(calculated avg_salary/ b.co_avg_salary) as pct_salary 'Dept vs. Overall' format=percent8.1,
case
when (calculated avg_salary > b.co_avg_salary)
then 'Above Avg.'
else 'Avg. or Below'
end as pay_level 'Pay Level'
from costs.payroll
cross join
(select mean(salary) as co_avg_salary where department ne 'Executive' and employees > 7) as b
where Department ne 'Executive' and Employees > 7
group by Department as job;
quit;
Why not join to a subquery instead? Not sure what join (join, cross join) will work, but the following is a start I believe (untested).
proc sql;
create table want as
select Department,(count(Employee_ID)) as Employees,
mean(salary) as avg_salary 'Average Salary' format=dollar9.,
(mean(salary)/calculated Employees) as co_avg_salary,
(calculated avg_salary/ b.co_avg_salary) as pct_salary 'Dept vs. Overall' format=percent8.1,
case
when (calculated avg_salary > b.co_avg_salary)
then 'Above Avg.'
else 'Avg. or Below'
end as pay_level 'Pay Level'
from costs.payroll
cross join
(select mean(salary) as co_avg_salary where department ne 'Executive' and employees > 7) as b
where Department ne 'Executive' and Employees > 7
group by Department as job;
quit;
Thanks!
Mike,
two thoughts, off the top of my head -
1.) two queries one to get number of employees and average salary for entire company, attributing each to a (macro) variable, and the other to calculate the individual department numbers.
2.) As Reeza mentions, use a sub-query to calculate the aforementioned company figures, then join to the aggregates for each department:
something like
proc sql;
select department, avg(salary) as calculated dept_avg_salary, dept_avg_salary/company_avg_salary as pct_of_overall format=percent8.1
from employees inner join (select avg(salary) as company_avg_salary from employees)...
quit;
Reorganize your query as:
select
Department,
Employees,
avg_salary,
avg_salary*sum(employees)/sum(totSalary) as pct_salary 'Dept vs. Overall' format=percent8.1,
case
when (calculated pct_salary > 1) then 'Above Avg.'
else 'Avg. or Below'
end as pay_level 'Pay Level'
from
(select
Department,
count(Employee_ID) as Employees,
mean(salary) as avg_salary 'Average Salary' format=dollar9.,
sum(salary) as totSalary
from costs.payroll
group by Department)
where Department ne 'Executive' and Employees > 7;
quit;
where I assumed that the overall mean salary is the mean employee salary and not the mean salary of the departments, but that it excludes executives and employees from departments with less than 8 employees.
PG
If you don't want to exclude small departments from overall mean salary calculation then the query can be simpler:
select
Department,
count(Employee_ID) as Employees,
mean(salary) as avg_salary 'Average Salary' format=dollar9.,
calculated avg_salary/
(select mean(salary) from costs.payroll where Department ne 'Executive')
as pct_salary 'Dept vs. Overall' format=percent8.1,
case
when (calculated pct_salary > 1) then 'Above Avg.'
else 'Avg. or Below'
end as pay_level 'Pay Level'
from costs.payroll
where Department ne 'Executive'
group by Department
having count(Employee_ID) > 7;
quit;
PG
That should be very easy in SAS as you can just replace the reference to the source dataset with an in-line query to add in the average.
Replace
from costs.payroll
with
from (select *,mean(salary) as co_avg_salary from costs.payroll)
Other flavors of SQL might complain, but SAS will happily calculate the average and attach the same value to every observation. You will see a note about needing to re-merge.
Tom,
From a computational/datastep-processing perspective does your in-line query calculate and re-calculate the co_avg_salary?
Regards,
Chad
I assume that it will read all of the data once to find the mean and then need to read it again to begin processing the rest of the query.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.