BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
AMMAN
Obsidian | Level 7

This is my first time using PROC SQL.  My task is to identify the average, min, and max salary by department.  I have successfully calculated the stats, but cannot seem to figure out how to present each department uniquely.   I have no errors on my log.

proc sql.JPG

proc sql;
create table salinfo as
select 
Dept_ID,
Dept_Name,
mean (Salary) as AverageSalary format=comma12.,
min(Salary) as MinSalary format=comma12.,
max(Salary) as MaxSalary format=comma12.
from work.combine;
quit;

 

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

You have a misunderstanding about how a SQL query works. 

 

If you select a variable that is not summarized, ie a group variable it has to be in the GROUP clause. 

 

In most RDBMS it will generate an error. 

In SAS, it will return all rows with the average duplicated for each row. 

 

If you want it grouped, you need to make sure any variables that are not aggregated are in the GROUP clause. 

 

proc sql;
create table salinfo as
select 
Dept_ID,
Dept_Name, /*both need to be in group otherwise you get each row*/
mean (Salary) as AverageSalary format=comma12.,
min(Salary) as MinSalary format=comma12.,
max(Salary) as MaxSalary format=comma12.
from work.combine;
quit;

Summary Tasks also work well. 

 


@AMMAN wrote:

This is my first time using PROC SQL.  My task is to identify the average, min, and max salary by department.  I have successfully calculated the stats, but cannot seem to figure out how to present each department uniquely.   I have no errors on my log.

proc sql.JPG

proc sql;
create table salinfo as
select 
Dept_ID,
Dept_Name,
mean (Salary) as AverageSalary format=comma12.,
min(Salary) as MinSalary format=comma12.,
max(Salary) as MaxSalary format=comma12.
from work.combine;
quit;

 


 

View solution in original post

8 REPLIES 8
collinelliot
Barite | Level 11

GROUP BY is what you need to add.

AMMAN
Obsidian | Level 7

I get the same result.  It orders them by department name, but lists all the rows. 

collinelliot
Barite | Level 11

You ran the code below and got the same results?

 

proc sql;
create table salinfo as
select 
Dept_ID,
Dept_Name,
mean (Salary) as AverageSalary format=comma12.,
min(Salary) as MinSalary format=comma12.,
max(Salary) as MaxSalary format=comma12.
from work.combine
group by Dept_ID,
               Dept_Name;
quit;
Reeza
Super User

You have a misunderstanding about how a SQL query works. 

 

If you select a variable that is not summarized, ie a group variable it has to be in the GROUP clause. 

 

In most RDBMS it will generate an error. 

In SAS, it will return all rows with the average duplicated for each row. 

 

If you want it grouped, you need to make sure any variables that are not aggregated are in the GROUP clause. 

 

proc sql;
create table salinfo as
select 
Dept_ID,
Dept_Name, /*both need to be in group otherwise you get each row*/
mean (Salary) as AverageSalary format=comma12.,
min(Salary) as MinSalary format=comma12.,
max(Salary) as MaxSalary format=comma12.
from work.combine;
quit;

Summary Tasks also work well. 

 


@AMMAN wrote:

This is my first time using PROC SQL.  My task is to identify the average, min, and max salary by department.  I have successfully calculated the stats, but cannot seem to figure out how to present each department uniquely.   I have no errors on my log.

proc sql.JPG

proc sql;
create table salinfo as
select 
Dept_ID,
Dept_Name,
mean (Salary) as AverageSalary format=comma12.,
min(Salary) as MinSalary format=comma12.,
max(Salary) as MaxSalary format=comma12.
from work.combine;
quit;

 


 

collinelliot
Barite | Level 11

That's a much better and more detailed explanation...

 

It isn't 100% true, though. In this case, since department ID and department have a one-to-one relationship, you could just have one in the group by but add a "distinct" to the select clause. 

 

proc sql;
create table salinfo as
select distinct 
Dept_ID,
Dept_Name,
mean (Salary) as AverageSalary format=comma12.,
min(Salary) as MinSalary format=comma12.,
max(Salary) as MaxSalary format=comma12.
from work.combine
group by dept_id;
quit;
Reeza
Super User
The one to one statement is an assumption, though it's likely correct 🙂


I think GROUP BY is faster than Distinct.

I was coming to say that I thought you should have gotten the answer actually, your answer actually shows how to do it correctly, mine does not!
collinelliot
Barite | Level 11

Assumption based on the data he provided, but I agree it's better to not assume the one-to-one. I think I'll be okay without getting credit. I'll cry a bit, but I'll recover.

AMMAN
Obsidian | Level 7

Thank you both very much for the input.   The assumption is true, in this case, there is a 1-to-1 relationship.  However, I know how to do it in cases where it is not.

 

I didn't realize that indicating a post as the resolution gave credit.  I am happy to revise it.

 

AMMAN

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 8 replies
  • 1764 views
  • 6 likes
  • 3 in conversation