Traditional SAS Programming versus SQL
- Article History
- RSS Feed
- Mark as New
- Mark as Read
- Bookmark
- Subscribe
- Printer Friendly Page
- Report Inappropriate Content
SAS versus SQL
It's a question as old as SAS itself. Especially when you think about the fact that SQL was first created by IBM back in 1975 and SAS was incorporated in 1976. Coincidence? Yes. But, it's not a coincidence that both languages have prevailed until this day. SAS, in my humble opinion, is one of the most flexible tools ever created and SQL is a “standard” that has weathered the test of time. |
SQL - Structured Query Language
|
SAS - Statistical Analysis System
|
Traditional Programming
The attached paper, that was presented at a CMS SAS Users Group (CMSSUG) meeting, describes the use of the DATA step to group and summarize information. Note how the data must first be sorted. This is not the case when using SQL. The FIRST (dot) notation is used in the DATA step to set the value of the first record in the BY group to zero. Accumulation of values occurs and the LAST record of the group is OUTPUT. SQL may be easier to follow as the COUNT and SUM functions, accompanied by the GROUP BY, perform the same function. The SUMMARY procedure may be a better choice if multiple variables and statistics are desired.
|
|
Ordered Analytical Functions
These functions enable On-Line Analytical Processing (OLAP) requests containing ANSI SQL: 2003 window functions, which greatly enhance the functionality of the Structured Query Language. This article discusses how SAS programming techniques can be imitated within a database environment using SAS/ACCESS pass-though queries to simulate By-Group processing with (dot) notation. As mentioned above, SAS does not yet support window functions for SAS data sets.
The three examples below use SAS/ACCESS to summarize Medicare claims. Both examples to the left use "Implicit" pass-through and utilize more traditional programming. In these examples, we are summarizing all of the records and placing the total onto each record in order to calculate a percent of total. Example 4 at the top left uses two DATA steps and Example 5 at the bottom left uses a derived table within an ANSI SQL:1999 style query. Example 6 on the right uses "Explicit" pass-through containing a window function. The function is initiated by the OVER clause. Notice how only one pass of the data is required.
|
|
A full explanation of these functions and many other examples can be found at the 2011 SAS Global Forum paper listed below.
Teradata® for the SAS® Programmer: Ordered Analytical Functions, Hash Objects, ANSI SQL: 2003
http://support.sas.com/resources/papers/proceedings11/019-2011.pdf
We also have a video on this topic, check it out:
- Mark as Read
- Mark as New
- Bookmark
- Permalink
- Report Inappropriate Content
Good afternoon,
where i can find all the datasets thata are used in the examples above (e.g. table1, Mdcr_Clm) ?
Best regards,
Massimo
- Mark as Read
- Mark as New
- Bookmark
- Permalink
- Report Inappropriate Content
I have a bad feeling about example 4. The first obs of mdcr_clm would be overwritten with data of the last observation.
Edit: <grin> someone added the necessary keep statement. That was quick 🙂
- Mark as Read
- Mark as New
- Bookmark
- Permalink
- Report Inappropriate Content
That does look to be true. Thanks for pointing that out Kurt!
- Mark as Read
- Mark as New
- Bookmark
- Permalink
- Report Inappropriate Content
- Mark as Read
- Mark as New
- Bookmark
- Permalink
- Report Inappropriate Content
Because of SAS SQL "remerging", I think example 5 could also be written as:
SELECT Bene_Id, Thru_Dt, Clm_Type, Paid_Amt, SUM(Paid_Amt) AS Total_Paid FROM Mdcr_Clm
Of course many SQL folks don't like remerging.