Consider the common SQL task of counting the count(*) over a select group.
data have;
length s $1 t $3; format id 1. ;
do id = 1 to 4;
do rep = 1 to id;
s = cats (id);
t = 't_' || cats (id);
output;
end;
end;
run;
proc sql; select id, s, count(*) as freq format=4. from have group by id, s;
id s freq ----------- 1 1 1 2 2 2 3 3 3 4 4 4
Now add the twist that one of the selects is mapping a non-select column t as select column s.
So, there is now some ambiguation of what s is. Is it the original want.s or the as s.
* s in the group is the non-select s and causes automatic remerging;
proc sql; select 'way 1' as way, id, t as s, count(*) as freq format=4. from have group way, id, s;
way id s freq ------------------------ way 1 1 1 other 1 way 1 2 2 other 2 way 1 2 2 other 2 way 1 3 3 other 3 way 1 3 3 other 3 way 1 3 3 other 3 way 1 4 4 other 4 way 1 4 4 other 4 way 1 4 4 other 4 way 1 4 4 other 4
Grouping by column numbers disambiguates the role of t as s
proc sql; select 'way 2' as way, id, t as s, count(*) as freq format=4. from have group 1, 2, 3;
way id s freq -------------------- way 2 1 t_1 1 way 2 2 t_2 2 way 2 3 t_3 3 way 2 4 t_4 4
Selecting t as new column s_new also disambiguates
proc sql; select 'way 3' as way, id, t as s_new, count(*) as freq format=4. from have group way, id, s_new;
way id s_new freq ---------------------- way 3 1 t_1 1 way 3 2 t_2 2 way 3 3 t_3 3 way 3 4 t_4 4
Trying to remain in the context of column names, calculated s causes an ERROR
proc sql; select 'way 4' as way, id, t as s, count(*) as freq format=4. from have group way, id, calculated s;
ERROR: The following columns were not found as CALCULATED references in the immediate query: s.
My code will have to go the 1,2,3 way.
Regardless, is there a syntax modification to have the as s column in the group clause using the column name s (in other words something like calculated s that does not cause an ERROR)
You left the BY keyword out. And you don't need to include a constant variable like your new WAY in the BY group. But those are not the issue I think you are asking about.
Basically SAS is saying that just changing the name is not really making a new variable.
So you have two choices. Remove the CALCULATED keyword.
select 'way 3' as way
, id
, t as s_new
, count(*) as freq
from have
group by id, s_new
;
If you are worried that there is possible ambiguity in the new name, then use the old name in the GROUP BY clause.
select 'way 3' as way
, id
, t as s_new
, count(*) as freq format=4.
from have
group by id, have.t
;
Or add some none destructive operation, like TRIM() for character variables or SUM(var,.) for numeric variables, so that it will treat it as calculated.
select 'way 3' as way
, id
, trim(t) as s_new
, count(*) as freq format=4.
from have
group by id, calculated s_new
;
And if all you wanted was the change the variable name you could just use the RENAME= dataset option on the source dataset.
select 'way 3' as way
, id
, s_new
, count(*) as freq format=4.
from have(rename=(t=s_new))
group by id, s_new
;
You left the BY keyword out. And you don't need to include a constant variable like your new WAY in the BY group. But those are not the issue I think you are asking about.
Basically SAS is saying that just changing the name is not really making a new variable.
So you have two choices. Remove the CALCULATED keyword.
select 'way 3' as way
, id
, t as s_new
, count(*) as freq
from have
group by id, s_new
;
If you are worried that there is possible ambiguity in the new name, then use the old name in the GROUP BY clause.
select 'way 3' as way
, id
, t as s_new
, count(*) as freq format=4.
from have
group by id, have.t
;
Or add some none destructive operation, like TRIM() for character variables or SUM(var,.) for numeric variables, so that it will treat it as calculated.
select 'way 3' as way
, id
, trim(t) as s_new
, count(*) as freq format=4.
from have
group by id, calculated s_new
;
And if all you wanted was the change the variable name you could just use the RENAME= dataset option on the source dataset.
select 'way 3' as way
, id
, s_new
, count(*) as freq format=4.
from have(rename=(t=s_new))
group by id, s_new
;
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.