I just spent a heck of a time debugging a SAS program today, only to discover the root cause to be the difference between SELECT DISTINCT and GROUP BY inside a proc sql procedure.
From my SQL experience, these two statements are equivalent, but I am relatively new to SAS, so I did not see this coming.
Can anyone explain why? I want my sanity back. Maybe if there's a good explanation of why SAS behaves like this, I'll take back all those nasty things I uttered. :smileydevil:
data have;
input id tt$;
cards;
1 ABADSFASDFG
2 BADSOFKJPAO
3 CAFJAPOFMAV
4 AABSADSBADS
5 AADSFAWEVAA
;
proc sql;
create table want as
select distinct(substr(tt,1,1)) as first_letter
from have;
create table why as
select substr(tt,1,1) as first_letter
from have
GROUP BY 1;
quit;
Silly me didnt read my log. it appears as though SAS treats Substr as a "summary function", like a SUM or a COUNT.
Silly me didnt read my log. it appears as though SAS treats Substr as a "summary function", like a SUM or a COUNT.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.