Hello
A far as I know in the sql original langauge
"Any column in SELECT clause that are not used by aggregate function (SUM, MIN, etc) needs to be listed in the GROUP BY
clause."
For example:
PROC SQL;
SELECT ACCT ,
count(*) as nr
FROM ACCTDAT
GROUP BY ACCT
HAVING NUMACCT GE 2;
quit;
I saw another code where the list of varaibles in SELECT are different then the list of varaibles in group BY
PROC SQL;
SELECT ACCT, FNAME, LNAME, STATE, PHONE, STATUS,
count(*) as nr
FROM ACCTDAT
GROUP BY ACCT
HAVING NUMACCT GE 2;
quit;
May anyone explain the difference?
How is it called? (Is it called re-merge??)
MAy anyone send link to read about it?
Note:
Here is the data set:
ACCT FNAME LNAME STATE PHONE STATUS
00001 KYLE RANGER FL 941-555-1234 TEST
12345 AMBER ERICKSON MN 507-012-3456 TEST
33333 ALICE ANDERSON FL 941-555-1236 CNTL
44455 HOWARD SMITH GA 706-555-0001 TEST
55666 GEORGE WILLIAMS GA 706-555-0002 CNTL
66777 NANCY BECK GA 706-555-0003 CNTL
67890 JOAN HALLEY WI 608-001-2345 TEST
76543 HEATHER SAMPSON FL 941-555-1235 TEST
77788 MARIA ALVAREZ TX 817-003-4567 TEST
88899 TINA HOLT OR 503-004-5678 CNTL
98765 KEVIN TRACEN TX 817-098-7654 TEST
99900 WALTER DENNIS MN 612-009-2345 TEST
You are perfectly right, when not all variables which are not the subject of a summary function are listed in the GROUP BY, SAS will perform a remerge and write a NOTE about that to the log. This is different from most (if not all) other SQL flavors. Other SQL flavors will need a sub-select to create the same result.
Since the variable you GROUP BY (ACCT) is unique in your example data, the remerge will be noted, but have no effect.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.