As was shown, MONOTONIC() is unreliable when used in conjunction with a HAVING clause.
By splitting the SQL into two steps, it works, but just look at this:
data Test;
do I=1 to 1e7;
output;
output;
end;
run;
data Test_first;
set Test;
by I;
if first.I;
run;
proc sql;
create table Test_monotonic as 
  select I, monotonic() as M
  from test 
;
create table test_m2 as
  select i, m
  from test_monotonic
  group by i
  having m = min(m)
  order by i, m
;
quit;Log:
 73         data Test;
 74         do I=1 to 1e7;
 75         output;
 76         output;
 77         end;
 78         run;
 
 NOTE: The data set WORK.TEST has 20000000 observations and 1 variables.
 NOTE:  Verwendet wurde: DATA statement - (Gesamtverarbeitungszeit):
       real time           0.78 seconds
       cpu time            0.65 seconds
       
 
 79         
 80         data Test_first;
 81         set Test;
 82         by I;
 83         if first.I;
 84         run;
 
 NOTE: There were 20000000 observations read from the data set WORK.TEST.
 NOTE: The data set WORK.TEST_FIRST has 10000000 observations and 1 variables.
 NOTE:  Verwendet wurde: DATA statement - (Gesamtverarbeitungszeit):
       real time           2.05 seconds
       cpu time            1.92 seconds
       
 
 85         
 86         proc sql;
 87         create table Test_monotonic as
 88           select I, monotonic() as M
 89           from test
 90         ;
 NOTE: Table WORK.TEST_MONOTONIC created, with 20000000 rows and 2 columns.
 
 91         create table test_m2 as
 92           select i, m
 93           from test_monotonic
 94           group by i
 95           having m = min(m)
 96           order by i, m
 97         ;
 NOTE: The query requires remerging summary statistics back with the original data.
 NOTE: Table WORK.TEST_M2 created, with 10000000 rows and 2 columns.
 
 98         quit;
 NOTE:  Verwendet wurde: PROZEDUR SQL - (Gesamtverarbeitungszeit):
       real time           27.45 seconds
       cpu time            22.22 seconds
As I added to my other post, SQL quickly deteriorates to performance worse by order of magnitudes.
Forcing SQL to do something that it is not designed to do is not a good idea.
Now, if you do something similar in explicit passthrough to a database, the results might be different, as dedicated DB systems have much more powerful optimization.
Hi Kurt,
It is such a thorough exploration. Your in-depth knowledge that helps us learn and grow.
Much appreciated!!
You probably should remove the chosen answer to your question @Suzy_Cat as this might mislead future readers.
Not too sure tbh. Maybe you have to choose another solution if that's the only way?
Thank you Chris...
Have updated the solution tag.
Learned - thanks. I'm glad you didn't take my suggestion.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.
