As was shown, MONOTONIC() is unreliable when used in conjunction with a HAVING clause.
By splitting the SQL into two steps, it works, but just look at this:
data Test;
do I=1 to 1e7;
output;
output;
end;
run;
data Test_first;
set Test;
by I;
if first.I;
run;
proc sql;
create table Test_monotonic as
select I, monotonic() as M
from test
;
create table test_m2 as
select i, m
from test_monotonic
group by i
having m = min(m)
order by i, m
;
quit;
Log:
73 data Test; 74 do I=1 to 1e7; 75 output; 76 output; 77 end; 78 run; NOTE: The data set WORK.TEST has 20000000 observations and 1 variables. NOTE: Verwendet wurde: DATA statement - (Gesamtverarbeitungszeit): real time 0.78 seconds cpu time 0.65 seconds 79 80 data Test_first; 81 set Test; 82 by I; 83 if first.I; 84 run; NOTE: There were 20000000 observations read from the data set WORK.TEST. NOTE: The data set WORK.TEST_FIRST has 10000000 observations and 1 variables. NOTE: Verwendet wurde: DATA statement - (Gesamtverarbeitungszeit): real time 2.05 seconds cpu time 1.92 seconds 85 86 proc sql; 87 create table Test_monotonic as 88 select I, monotonic() as M 89 from test 90 ; NOTE: Table WORK.TEST_MONOTONIC created, with 20000000 rows and 2 columns. 91 create table test_m2 as 92 select i, m 93 from test_monotonic 94 group by i 95 having m = min(m) 96 order by i, m 97 ; NOTE: The query requires remerging summary statistics back with the original data. NOTE: Table WORK.TEST_M2 created, with 10000000 rows and 2 columns. 98 quit; NOTE: Verwendet wurde: PROZEDUR SQL - (Gesamtverarbeitungszeit): real time 27.45 seconds cpu time 22.22 seconds
As I added to my other post, SQL quickly deteriorates to performance worse by order of magnitudes.
Forcing SQL to do something that it is not designed to do is not a good idea.
Now, if you do something similar in explicit passthrough to a database, the results might be different, as dedicated DB systems have much more powerful optimization.
Hi Kurt,
It is such a thorough exploration. Your in-depth knowledge that helps us learn and grow.
Much appreciated!!
You probably should remove the chosen answer to your question @Suzy_Cat as this might mislead future readers.
Not too sure tbh. Maybe you have to choose another solution if that's the only way?
Thank you Chris...
Have updated the solution tag.
Learned - thanks. I'm glad you didn't take my suggestion.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.