BookmarkSubscribeRSS Feed

Here's a quote from https://communities.sas.com/t5/SAS-Programming/Finding-a-name-in-a-text-field/m-p/539662#M148737:

 

BTW: You do realize that function monotonic() is really useful but also not documented and though not supported - which means not production worthy. 

 

Which begs a question(s) I've had for some time...

 

1) How do I create an incrementing row number in PROC SQL production code?

 

2) Monotonic() is undocumented, has been undocumented for a lot of years, most folks in the SAS community know about it even though undocumented, and probably many folks are using it in production code.  

 

3) Clearly there is a need for this functionality.  SQL Server, Oracle, Postgres, and MySQL (the only ones I checked) all have the ROW_NUMBER() windowing function for this functionality.  So how about SAS R&D renames monotonic() to row_number() (or better yet alias monotonic() to row_number() or vice versa, so existing "production" code doesn't break), run it through QA, and officially support this needed functionality?

 

4) ROW_NUMBER() in the above databases is more flexible than monotonic() since it operates over the windowing (grouping) columns (analogous to first. processing in the data step).  It would be nice if the final production solution also supported such functionality.

 

10 Comments
novinosrin
Tourmaline | Level 20

+1

 

Which means I have someone who also share something to similar to this

 

https://communities.sas.com/t5/Have-Your-Say/Isn-t-it-time-for-SAS-to-implement-ROW-NUMBER-OVER-PART...

 

 

ScottBass
Rhodochrosite | Level 12

Sorry didn't see that one before posting here (I did search 🙂 )

novinosrin
Tourmaline | Level 20

@ScottBass  Sir, Sounds funny when you say sorry lol Come on when did you turn to sound so formal haha 

 

Same banter as our linkedin except that can't be so open as big brother(moderator) is watching haha 

 

Have a good day Sir

Kurt_Bremser
Super User

Given the fact that the data step seems to be relegated to "second-tier" in SAS learning (see the recent discussion in https://communities.sas.com/t5/Programming-1-and-2/Programming-1-2-content-removed-INFILE-and-added-...), such basic functionality HAS to be implemented in SQL.

 

If SAS doesn't want the average programmer to use the data step as their #1 go-to tool any longer, then at least give them this tool.

RW9
Diamond | Level 26
Diamond | Level 26

This would however raise several questions.  SAS currently provides three options: ANSI SQL in proc sql, ANSI SQL with SAS functions in proc sql, and ANSI SQL passthrough.  ROW_NUMBER or the likes of that are not ANSI SQL, and hence would be implemented on a platform specific basis.  Therefore, you would need the function to conform internally to SAS, and Externally to each database they provide access to.  Now with the first two options this is possible, however the third when the code is passed through, then this becomes an increasingly difficult ask as there is no common platform for such functions.  First two DB's I checked: Access, and SQLServer have the same syntax in parts for row_number(), however once you look further than the surface you can see that SQLServer supports order by and partition by, where Access does not implement partition by.  

Now I have no problem adding new functionality, however I can see adding third party software functions like this merely causing more complications, such as different code for different databases.

I agree with @Kurt_Bremser in that the general move away from Base SAS is quite a sad and strange move for SAS.  If I was coding SQL a lot, creating production code I would not be doing this in SAS but using something specifically designed for the task.  I would also really be evaluating my whole process and trying to compartmentalize applications and code bases.

ScottBass
Rhodochrosite | Level 12

To be clear:  I know and love the data step.  But as with many things in SAS, "there is more than one way to do it" (borrowed from Perl).

 

If I am programming a data step, there is no need for this.  If I am programming SQL, I currently use the monotonic() function, because that's all I've got, and I need this functionality. 

 

(Random musing:  I wonder why R&D created monotonic() anyway?  And how did it become known, if not shared by SAS at some point?)

 

Holding the ANSI SQL standard against PROC SQL may be a bit "rich".  AFAIK, there are a number of areas where it is a superset of ANSI (although I don't profess to be knowledgeable on ANSI SQL standard).  One that comes to mind is the remerging of summary statistics onto the target table; I can't do this in SQL Server, which requires a sub-query or common table expression.

 

In short(ish), I think it's time SAS implements monotonic() or row_number() in a supported fashion.

 

And yeah, it's sad if SAS is de-emphasizing the data step.  Time will tell if that's a good marketing move or not (although in reality it's just the programming course - it's not like SAS is deprecating the data step!!!)

 

Now with the first two options this is possible, however the third when the code is passed through, then this becomes an increasingly difficult ask as there is no common platform for such functions.

 

I don't see it this way.  If the code is implicit passthrough, then SAS's ROW_NUMBER() would not be pushed through to the database, ala put, input, or other SAS functions.  It would be applied to the returned result set.  And if the code is explicit passthrough, then the code needs to be compliant with the target database.  If it implements ROW_NUMBER, cool.  If not, boom.

 

Finally, I found this:  

https://stackoverflow.com/questions/202245/pure-sql-technique-for-auto-numbering-rows-in-result-set

Tom
Super User
Super User

If you don't want to use MONOTONIC() you can roll your own function using macro code and the RESOLVE() function.

proc sql;
%let i=0;
select monotonic() as n1
     , input(resolve('%let i=%eval(&i+1);&i'),32.) as n2
     , *
from sashelp.class
;
quit;
lc_isp
Obsidian | Level 7

@ScottBass wrote:

Finally, I found this:  

https://stackoverflow.com/questions/202245/pure-sql-technique-for-auto-numbering-rows-in-result-set


I'm no one as SQL expert, and since very few time (1 week) into SAS, nonetheless it's a little funny to see how doing a ton of subqueries [1] seems "acceptable solution", and "just having a row_number (which DBMS sure knows) into a variable or as SQL addition" seems not.

 

[1] https://www.sqlteam.com/articles/returning-a-row-number-in-a-query

 

P.S.

I know that sub-querying into (cache) memory is a lot faster than doing it into main storage, nonetheless "just exposing" a data (row_number) DBMS system already knows seems still better than adding a ton of (in-memory) subqueries.

Tom
Super User
Super User

nonetheless "just exposing" a data (row_number) DBMS system already knows seems still better than adding a ton of (in-memory) subqueries.

The issue is way more complicated than that. 

  • Even if you could know the position of the data in a physical dataset what about logically delete observations in the dataset?
  • The whole concept of the SQL language is designed to treat data as sets.
  • What about views
  • What about sub queries
  • What about parallel processing
  • ....
lc_isp
Obsidian | Level 7

    Hi Tom,

 

 as I told, I'm extremely green to the whole SAS (concepts, DBMS, programming), so I can't address your points.

 Only thing I can see, generally speaking, is that people have problems and, to solve them (as I did too), is using an undocumented function which may potentially lead to even more problems (e.g. using it in production environment, where the function itself may change unpredictably in the future).

 

 That said, I'm sure SAS labs are not playing with customers: if they didn't release a viable and stable function to do that it's probably 'cause solving the points you evidenced is not that simple.

 

 At this point, the undocumented function probably saves both: we customers have a needed tool, but they have no responsability about it's use.

 

P.S.

sorry for the late answer, I wasn't at office.