BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
RichardAD
Quartz | Level 8

Consider the common SQL task of counting the count(*) over a select group.

data have;
  length s $1 t $3; format id 1. ;
  do id = 1 to 4;
    do rep = 1 to id;
      s = cats (id);
      t = 't_' || cats (id);
      output;
    end;
  end;
run;
proc sql; select id, s, count(*) as freq format=4. from have group by id, s;
id  s  freq
-----------
 1  1     1
 2  2     2
 3  3     3
 4  4     4

Now add the twist that one of the selects is mapping a non-select column t as select column s.

So, there is now some ambiguation of what s is.  Is it the original want.s or the as s.

* s in the group is the non-select s and causes automatic remerging;
proc sql; select 'way 1' as way, id, t as s, count(*) as freq format=4. from have group way, id, s;
way    id  s        freq
------------------------
way 1   1  1 other     1
way 1   2  2 other     2
way 1   2  2 other     2
way 1   3  3 other     3
way 1   3  3 other     3
way 1   3  3 other     3
way 1   4  4 other     4
way 1   4  4 other     4
way 1   4  4 other     4
way 1   4  4 other     4

Grouping by column numbers disambiguates the role of t as s

proc sql; select 'way 2' as way, id, t as s, count(*) as freq format=4. from have 
  group 1, 2, 3;
way    id  s    freq
--------------------
way 2   1  t_1     1
way 2   2  t_2     2
way 2   3  t_3     3
way 2   4  t_4     4

Selecting t as new column s_new also disambiguates

proc sql; select 'way 3' as way, id, t as s_new, count(*) as freq format=4. from have
  group way, id, s_new;
way    id  s_new  freq
----------------------
way 3   1  t_1       1
way 3   2  t_2       2
way 3   3  t_3       3
way 3   4  t_4       4

Trying to remain in the context of column names, calculated s causes an ERROR

proc sql; select 'way 4' as way, id, t as s, count(*) as freq format=4. from have
  group way, id, calculated s;
ERROR: The following columns were not found as CALCULATED references in the immediate query: s.

My code will have to go the 1,2,3 way.

 

Regardless, is there a syntax modification to have the as s column in the group clause using the column name s (in other words something like calculated s that does not cause an ERROR)

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

You left the BY keyword out.   And you don't need to include a constant variable like your new WAY in the BY group.  But those are not the issue I think you are asking about.

 

Basically SAS is saying that just changing the name is not really making a new variable.

 

So you have two choices.  Remove the CALCULATED keyword.

select 'way 3' as way
     , id
     , t as s_new
     , count(*) as freq 
from have
group by id, s_new
;

If you are worried that there is possible ambiguity in the new name, then use the old name in the GROUP BY clause.

select 'way 3' as way
     , id
     , t as s_new
     , count(*) as freq format=4. 
from have
group by id, have.t
;

Or add some none destructive operation, like TRIM() for character variables or SUM(var,.) for numeric variables, so that it will treat it as calculated.

select 'way 3' as way
     , id
     , trim(t) as s_new
     , count(*) as freq format=4. 
from have
group by id, calculated s_new
;

And if all you wanted was the change the variable name you could just use the RENAME= dataset option on the source dataset.

select 'way 3' as way
     , id
     , s_new
     , count(*) as freq format=4. 
from have(rename=(t=s_new))
group by id, s_new
;

 

View solution in original post

1 REPLY 1
Tom
Super User Tom
Super User

You left the BY keyword out.   And you don't need to include a constant variable like your new WAY in the BY group.  But those are not the issue I think you are asking about.

 

Basically SAS is saying that just changing the name is not really making a new variable.

 

So you have two choices.  Remove the CALCULATED keyword.

select 'way 3' as way
     , id
     , t as s_new
     , count(*) as freq 
from have
group by id, s_new
;

If you are worried that there is possible ambiguity in the new name, then use the old name in the GROUP BY clause.

select 'way 3' as way
     , id
     , t as s_new
     , count(*) as freq format=4. 
from have
group by id, have.t
;

Or add some none destructive operation, like TRIM() for character variables or SUM(var,.) for numeric variables, so that it will treat it as calculated.

select 'way 3' as way
     , id
     , trim(t) as s_new
     , count(*) as freq format=4. 
from have
group by id, calculated s_new
;

And if all you wanted was the change the variable name you could just use the RENAME= dataset option on the source dataset.

select 'way 3' as way
     , id
     , s_new
     , count(*) as freq format=4. 
from have(rename=(t=s_new))
group by id, s_new
;

 

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 1 reply
  • 615 views
  • 1 like
  • 2 in conversation