Hi, when renaming a variable in a proc sql, is the correct technique to reference the old or new variable name (see the two variables that have been renamed in the select statement) in the group statement (see sc.book1 and sc.scoreband_new that are currently the variables names in the group statement below)?
Should it be:
1) group by sc.scorecard_num, book, scoreband
2) group by sc.scorecard_num, sc.book1, sc.scoreband_new
proc sql;
create table testing as
select sc.scorecard_num
,sc.book1 as book
,sc.scoreband_new as scoreband
,count(*) as Volume
from arr_banded sc
left join (select scorecard_num, count(*) as total
from arr_banded group by scorecard_num) as tot
on sc.scorecard_num = tot.scorecard_num
group by sc.scorecard_num, sc.book1, sc.scoreband_new
order by sc.scorecard_num;
quit;
With the SAS SQL flavor both syntax options are valid. Other SQL flavors don't allow to reference an alias name (renamed column) in a group by clause.
As long as the values of the column name and the alias name are the same the result will be the same - but be aware of the difference it can make if that's not the case.
proc sql;
select
-round(age,10) as age2
,avg(height) as avg_height
from sashelp.class
group by age2
order by age2
;
quit;
proc sql;
select
-round(age,10) as age2
,avg(height) as avg_height
from sashelp.class
group by age
order by age
;
quit;
proc sql;
select
-round(age,10) as age2
,avg(height) as avg_height
from sashelp.class
group by age2
order by age
;
quit;
With other SQL flavors that don't allow to use an alias name in a group by clause you would likely need to formulate your SQL along the line of below.
proc sql;
select
age2,
avg(height) as avg_height
from
(
select
-round(age,10) as age2
,height
from sashelp.class
)
group by age2
order by age2
;
quit;
With the SAS SQL flavor both syntax options are valid. Other SQL flavors don't allow to reference an alias name (renamed column) in a group by clause.
As long as the values of the column name and the alias name are the same the result will be the same - but be aware of the difference it can make if that's not the case.
proc sql;
select
-round(age,10) as age2
,avg(height) as avg_height
from sashelp.class
group by age2
order by age2
;
quit;
proc sql;
select
-round(age,10) as age2
,avg(height) as avg_height
from sashelp.class
group by age
order by age
;
quit;
proc sql;
select
-round(age,10) as age2
,avg(height) as avg_height
from sashelp.class
group by age2
order by age
;
quit;
With other SQL flavors that don't allow to use an alias name in a group by clause you would likely need to formulate your SQL along the line of below.
proc sql;
select
age2,
avg(height) as avg_height
from
(
select
-round(age,10) as age2
,height
from sashelp.class
)
group by age2
order by age2
;
quit;
@Justin9 To add to above: If interfacing with a database you would need the ORDER BY clause in the SQL to ensure that you receive the data sorted on the SAS side. When GROUP BY executes on the SAS side that's not required because GROUP BY will already return the data sorted.
proc sql _method;
create table test as
select
age as age2
,avg(height) as avg_height
from sashelp.class
group by age
order by age
;
quit;
proc contents data=test;
run;quit;
IF you still want to use an ORDER BY clause for clarity then use the aliased names so SAS doesn't sort the data twice.
proc sql _method;
create table test as
select
age as age2
,avg(height) as avg_height
from sashelp.class
group by age2
order by age2
;
quit;
proc sql _method;
create table test as
select
age as age2
,avg(height) as avg_height
from sashelp.class
group by age
order by age
;
quit;
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.