Consider the common SQL task of counting the count(*) over a select group.
data have;
length s $1 t $3; format id 1. ;
do id = 1 to 4;
do rep = 1 to id;
s = cats (id);
t = 't_' || cats (id);
output;
end;
end;
run;
proc sql; select id, s, count(*) as freq format=4. from have group by id, s;
id s freq ----------- 1 1 1 2 2 2 3 3 3 4 4 4
Now add the twist that one of the selects is mapping a non-select column t as select column s.
So, there is now some ambiguation of what s is. Is it the original want.s or the as s.
* s in the group is the non-select s and causes automatic remerging;
proc sql; select 'way 1' as way, id, t as s, count(*) as freq format=4. from have group way, id, s;
way id s freq ------------------------ way 1 1 1 other 1 way 1 2 2 other 2 way 1 2 2 other 2 way 1 3 3 other 3 way 1 3 3 other 3 way 1 3 3 other 3 way 1 4 4 other 4 way 1 4 4 other 4 way 1 4 4 other 4 way 1 4 4 other 4
Grouping by column numbers disambiguates the role of t as s
proc sql; select 'way 2' as way, id, t as s, count(*) as freq format=4. from have group 1, 2, 3;
way id s freq -------------------- way 2 1 t_1 1 way 2 2 t_2 2 way 2 3 t_3 3 way 2 4 t_4 4
Selecting t as new column s_new also disambiguates
proc sql; select 'way 3' as way, id, t as s_new, count(*) as freq format=4. from have group way, id, s_new;
way id s_new freq ---------------------- way 3 1 t_1 1 way 3 2 t_2 2 way 3 3 t_3 3 way 3 4 t_4 4
Trying to remain in the context of column names, calculated s causes an ERROR
proc sql; select 'way 4' as way, id, t as s, count(*) as freq format=4. from have group way, id, calculated s;
ERROR: The following columns were not found as CALCULATED references in the immediate query: s.
My code will have to go the 1,2,3 way.
Regardless, is there a syntax modification to have the as s column in the group clause using the column name s (in other words something like calculated s that does not cause an ERROR)
You left the BY keyword out. And you don't need to include a constant variable like your new WAY in the BY group. But those are not the issue I think you are asking about.
Basically SAS is saying that just changing the name is not really making a new variable.
So you have two choices. Remove the CALCULATED keyword.
select 'way 3' as way
, id
, t as s_new
, count(*) as freq
from have
group by id, s_new
;
If you are worried that there is possible ambiguity in the new name, then use the old name in the GROUP BY clause.
select 'way 3' as way
, id
, t as s_new
, count(*) as freq format=4.
from have
group by id, have.t
;
Or add some none destructive operation, like TRIM() for character variables or SUM(var,.) for numeric variables, so that it will treat it as calculated.
select 'way 3' as way
, id
, trim(t) as s_new
, count(*) as freq format=4.
from have
group by id, calculated s_new
;
And if all you wanted was the change the variable name you could just use the RENAME= dataset option on the source dataset.
select 'way 3' as way
, id
, s_new
, count(*) as freq format=4.
from have(rename=(t=s_new))
group by id, s_new
;
You left the BY keyword out. And you don't need to include a constant variable like your new WAY in the BY group. But those are not the issue I think you are asking about.
Basically SAS is saying that just changing the name is not really making a new variable.
So you have two choices. Remove the CALCULATED keyword.
select 'way 3' as way
, id
, t as s_new
, count(*) as freq
from have
group by id, s_new
;
If you are worried that there is possible ambiguity in the new name, then use the old name in the GROUP BY clause.
select 'way 3' as way
, id
, t as s_new
, count(*) as freq format=4.
from have
group by id, have.t
;
Or add some none destructive operation, like TRIM() for character variables or SUM(var,.) for numeric variables, so that it will treat it as calculated.
select 'way 3' as way
, id
, trim(t) as s_new
, count(*) as freq format=4.
from have
group by id, calculated s_new
;
And if all you wanted was the change the variable name you could just use the RENAME= dataset option on the source dataset.
select 'way 3' as way
, id
, s_new
, count(*) as freq format=4.
from have(rename=(t=s_new))
group by id, s_new
;
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.