For the following code below, what should it be "group by 1,2; " or "group by 1,2,3; " (i.e. is "ifc(b.check is null, 'No', 'Yes') as check length=3" considered as a variable to group by, or not)? Thanks!
Code:
data ac.dataset2;
set ac.test;
retain check 'Yes';
run;
proc sql;
create table test_check as
select a.account_id,
b.default_date,
ifc(b.check is null, 'No', 'Yes') as check length=3
from ac.dataset1 as a
left join ac.dataset2 as b
on a.account_id=b.account_id
group by 1,2; /*Is it "group by 1,2; " or "group by 1,2,3; "? */
quit;
If you do not have a summary function in your SQL query, "group by" can't do anything that makes sense, so SAS (or SQL as such) will automatically convert it to an "order by" clause, and give you the respective NOTE in the log (Maxim 2: Read the Log!).
So, for clarity's sake, you should be a good coder and use the order by on your own.
If you include the third variable in this is up to you. Do you always want 'No' before 'Yes' or not?
The reference for the so-called summary functions in SQL can be found here: https://documentation.sas.com/?cdcId=pgmsascdc&cdcVersion=9.4_3.4&docsetId=sqlproc&docsetTarget=n123...
Some illustrations:
proc sql;
create table test as
select sex, sum(weight)
from sashelp.class
group by sex;
quit;
The "archetypal" use of group by with summary functions. Group by a category variable, and do analysis on numerical values.
Now, if I omit the summary function:
proc sql;
create table test as
select sex, weight
from sashelp.class
group by sex;
quit;
I get this WARNING in the log:
27 proc sql; 28 create table test as 29 select sex, weight 30 from sashelp.class 31 group by sex; WARNING: A GROUP BY clause has been transformed into an ORDER BY clause because neither the SELECT clause nor the optional HAVING clause of the associated table-expression referenced a summary function.
and if I forget to add a further variable in the group by after adding it to the select:
proc sql;
create table test as
select sex, age, avg(weight)
from sashelp.class
group by sex;
quit;
I get this in the log:
27 proc sql; 28 create table test as 29 select sex, age, avg(weight) 30 from sashelp.class 31 group by sex; NOTE: The query requires remerging summary statistics back with the original data.
and this result:
Sex Age ----------------------- F 14 90.11111 F 12 90.11111 F 11 90.11111 F 13 90.11111 F 14 90.11111 F 15 90.11111 F 12 90.11111 F 13 90.11111 F 15 90.11111 M 16 108.95 M 12 108.95 M 14 108.95 M 12 108.95 M 15 108.95 M 14 108.95 M 13 108.95 M 11 108.95 M 15 108.95 M 12 108.95
But if I do it right:
proc sql;
create table test as
select sex, age, avg(weight)
from sashelp.class
group by sex, age;
quit;
I get this:
Sex Age ----------------------- F 11 50.5 F 12 80.75 F 13 91 F 14 96.25 F 15 112.25 M 11 85 M 12 103.5 M 13 84 M 14 107.5 M 15 122.5 M 16 150
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.