I just spent a heck of a time debugging a SAS program today, only to discover the root cause to be the difference between SELECT DISTINCT and GROUP BY inside a proc sql procedure.
From my SQL experience, these two statements are equivalent, but I am relatively new to SAS, so I did not see this coming.
Can anyone explain why? I want my sanity back. Maybe if there's a good explanation of why SAS behaves like this, I'll take back all those nasty things I uttered. :smileydevil:
data have;
input id tt$;
cards;
1 ABADSFASDFG
2 BADSOFKJPAO
3 CAFJAPOFMAV
4 AABSADSBADS
5 AADSFAWEVAA
;
proc sql;
create table want as
select distinct(substr(tt,1,1)) as first_letter
from have;
create table why as
select substr(tt,1,1) as first_letter
from have
GROUP BY 1;
quit;
Silly me didnt read my log. it appears as though SAS treats Substr as a "summary function", like a SUM or a COUNT.
Silly me didnt read my log. it appears as though SAS treats Substr as a "summary function", like a SUM or a COUNT.
Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.
Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.