As was shown, MONOTONIC() is unreliable when used in conjunction with a HAVING clause.
By splitting the SQL into two steps, it works, but just look at this:
data Test;
do I=1 to 1e7;
output;
output;
end;
run;
data Test_first;
set Test;
by I;
if first.I;
run;
proc sql;
create table Test_monotonic as
select I, monotonic() as M
from test
;
create table test_m2 as
select i, m
from test_monotonic
group by i
having m = min(m)
order by i, m
;
quit;
Log:
73 data Test;
74 do I=1 to 1e7;
75 output;
76 output;
77 end;
78 run;
NOTE: The data set WORK.TEST has 20000000 observations and 1 variables.
NOTE: Verwendet wurde: DATA statement - (Gesamtverarbeitungszeit):
real time 0.78 seconds
cpu time 0.65 seconds
79
80 data Test_first;
81 set Test;
82 by I;
83 if first.I;
84 run;
NOTE: There were 20000000 observations read from the data set WORK.TEST.
NOTE: The data set WORK.TEST_FIRST has 10000000 observations and 1 variables.
NOTE: Verwendet wurde: DATA statement - (Gesamtverarbeitungszeit):
real time 2.05 seconds
cpu time 1.92 seconds
85
86 proc sql;
87 create table Test_monotonic as
88 select I, monotonic() as M
89 from test
90 ;
NOTE: Table WORK.TEST_MONOTONIC created, with 20000000 rows and 2 columns.
91 create table test_m2 as
92 select i, m
93 from test_monotonic
94 group by i
95 having m = min(m)
96 order by i, m
97 ;
NOTE: The query requires remerging summary statistics back with the original data.
NOTE: Table WORK.TEST_M2 created, with 10000000 rows and 2 columns.
98 quit;
NOTE: Verwendet wurde: PROZEDUR SQL - (Gesamtverarbeitungszeit):
real time 27.45 seconds
cpu time 22.22 seconds
As I added to my other post, SQL quickly deteriorates to performance worse by order of magnitudes.
Forcing SQL to do something that it is not designed to do is not a good idea.
Now, if you do something similar in explicit passthrough to a database, the results might be different, as dedicated DB systems have much more powerful optimization.
Hi Kurt,
It is such a thorough exploration. Your in-depth knowledge that helps us learn and grow.
Much appreciated!!
You probably should remove the chosen answer to your question @Suzy_Cat as this might mislead future readers.
Not too sure tbh. Maybe you have to choose another solution if that's the only way?
Thank you Chris...
Have updated the solution tag.
Learned - thanks. I'm glad you didn't take my suggestion.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.