BookmarkSubscribeRSS Feed
jeremy4
Quartz | Level 8

For the following code below, what should it be "group by 1,2; " or "group by 1,2,3; " (i.e. is "ifc(b.check is null, 'No', 'Yes') as check length=3" considered as a variable to group by, or not)? Thanks!

 

Code:

data ac.dataset2;
   set ac.test;
   retain check 'Yes';
run;

 

proc sql;
   create table test_check as
   select a.account_id,
             b.default_date,
             ifc(b.check is null, 'No', 'Yes') as check length=3

    from ac.dataset1 as a
    left join ac.dataset2 as b

   on a.account_id=b.account_id

   group by 1,2;                                     /*Is it "group by 1,2; " or "group by 1,2,3; "? */

quit;

3 REPLIES 3
Kurt_Bremser
Super User

If you do not have a summary function in your SQL query, "group by" can't do anything that makes sense, so SAS (or SQL as such) will automatically convert it to an "order by" clause, and give you the respective NOTE in the log (Maxim 2: Read the Log!).

So, for clarity's sake, you should be a good coder and use the order by on your own.

If you include the third variable in this is up to you. Do you always want 'No' before 'Yes' or not?

jeremy4
Quartz | Level 8
Thanks for your help, can you give me examples of what you mean by "summary function" so that you can use "group by", as opposed to "order by" in my example?

My example has all of the records having 'Yes' for the check variable in the ac.dataset2 dataset, so that when it is left joined in the proc sql statement, there will be a comparison where account_ID's that match will receive 'Yes' for the check variable, and the answer will be 'No' when the account ID's do not match. The question I had is whether this "check" variable from the line would need to included in the "order by" statement, so do I use "order by 1,2; " or "order by 1,2,3; "?

data ac.dataset2;
set ac.test;
retain check 'Yes';
run;



proc sql;
create table test_check as
select a.account_id,
b.default_date,
ifc(b.check is null, 'No', 'Yes') as check length=3
from ac.dataset1 as a
left join ac.dataset2 as b
on a.account_id=b.account_id
order by 1,2;
quit;
Kurt_Bremser
Super User

The reference for the so-called summary functions in SQL can be found here: https://documentation.sas.com/?cdcId=pgmsascdc&cdcVersion=9.4_3.4&docsetId=sqlproc&docsetTarget=n123...

Some illustrations:

proc sql;
create table test as
select sex, sum(weight)
from sashelp.class
group by sex;
quit;

The "archetypal" use of group by with summary functions. Group by a category variable, and do analysis on numerical values.

Now, if I omit the summary function:

proc sql;
create table test as
select sex, weight
from sashelp.class
group by sex;
quit;

I get this WARNING in the log:

27         proc sql;
28         create table test as
29         select sex, weight
30         from sashelp.class
31         group by sex;
WARNING: A GROUP BY clause has been transformed into an ORDER BY clause because neither the SELECT clause nor the optional HAVING 
         clause of the associated table-expression referenced a summary function.

and if I forget to add a further variable in the group by after adding it to the select:

proc sql;
create table test as
select sex, age, avg(weight)
from sashelp.class
group by sex;
quit;

I get this in the log:

27         proc sql;
28         create table test as
29         select sex, age, avg(weight)
30         from sashelp.class
31         group by sex;
NOTE: The query requires remerging summary statistics back with the original data.

and this result:

Sex       Age          
-----------------------
F          14  90.11111
F          12  90.11111
F          11  90.11111
F          13  90.11111
F          14  90.11111
F          15  90.11111
F          12  90.11111
F          13  90.11111
F          15  90.11111
M          16    108.95
M          12    108.95
M          14    108.95
M          12    108.95
M          15    108.95
M          14    108.95
M          13    108.95
M          11    108.95
M          15    108.95
M          12    108.95

But if I do it right:

proc sql;
create table test as
select sex, age, avg(weight)
from sashelp.class
group by sex, age;
quit;

I get this:

Sex       Age          
-----------------------
F          11      50.5
F          12     80.75
F          13        91
F          14     96.25
F          15    112.25
M          11        85
M          12     103.5
M          13        84
M          14     107.5
M          15     122.5
M          16       150

 

 

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 1068 views
  • 0 likes
  • 2 in conversation