Could someone tell me what the major difference between the two is? If I want to make my program run faster, which would be the better way to go? What factors should I consider in choosing between the two?
Thanks!
A SAS view (data step or SQL) is a program that is executed when it is called. This is typically not beneficial from a performance point of view, but it addresses a lot of other issues, such as security, hiding business logic, ease of access etc.
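To make that concrete, here is a minimal sketch of both kinds of view. It assumes the SASHELP.CARS sample table is available; the view names cars_v and cars_sqlv are just made up for illustration.

```sas
/* DATA step view: the step is compiled and stored, but not run, here */
data work.cars_v / view=work.cars_v;
    set sashelp.cars;
    where origin = 'Europe';
    keep make model msrp;
run;

/* PROC SQL view: the query is stored and executed on each reference */
proc sql;
    create view work.cars_sqlv as
        select make, model, msrp
        from sashelp.cars
        where origin = 'Europe';
quit;

/* Reading either view is what actually runs its stored program */
proc means data=work.cars_v mean;
    var msrp;
run;
```

Neither definition reads SASHELP.CARS when it is created; the source is only read when the PROC MEANS step runs.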
A compiled program can also be used to hide programming logic. In the early days of computing, when CPU cycles were expensive, it could also be a way of improving the efficiency of an application. These days, with large data sets, advanced analysis etc., I/O tends to be the bottleneck, and perhaps execution time, not the time spent on compiling.
So, don't choose precompiled data steps for performance reasons (unless your situation is a very rare one).
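For reference, a stored compiled data step looks roughly like this. It is only a sketch: SASHELP.CARS is assumed as input, and the names europe and europe_pgm are hypothetical.

```sas
/* Compile the step and store it; no data is read at this point */
data work.europe / pgm=work.europe_pgm;
    set sashelp.cars;
    if origin = 'Europe';   /* subsetting IF keeps only European cars */
run;

/* Execute the stored program; this is where WORK.EUROPE gets created */
data pgm=work.europe_pgm;
run;
```

The second step skips compilation, but it still reads every row of the input, which is why the saving is usually negligible next to the I/O.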
Views can be considered when the resources needed to create and store redundant data (as opposed to using a view) are greater than the overhead of accessing the view.
In addition to the guidance above, recognise that a view reads its input sources every time it runs. So a view is great where it is used as often as, or less often than, the source data changes. However, if your view reduces the data to fewer columns and fewer rows (for efficiency), these reductions occur each time the view is used. That repeated "reduction" process, re-reading all the rows of the source, might eliminate the I/O savings the view implies over an intermediate permanent dataset.
So I think the choice to use a view depends on understanding how it is used and how often.
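A rough way to see the trade-off is to build the same subset both ways and read each one more than once. This is only a sketch, assuming SASHELP.CARS stands in for a much larger source; big_v and big_t are made-up names.

```sas
options fullstimer;   /* write detailed resource usage to the log */

/* Option A: a view; the source is re-read and re-filtered on every use */
proc sql;
    create view work.big_v as
        select make, msrp
        from sashelp.cars
        where msrp > 40000;
quit;

/* Option B: an intermediate table; the reduction is paid for once */
data work.big_t;
    set sashelp.cars(keep=make msrp);
    where msrp > 40000;
run;

/* Two uses of the view means two passes over the source */
proc means data=work.big_v mean; var msrp; run;
proc freq  data=work.big_v; tables make; run;

/* Two uses of the table means two passes over the reduced data only */
proc means data=work.big_t mean; var msrp; run;
proc freq  data=work.big_t; tables make; run;
```

Comparing the log notes for the two pairs of steps on your own data should show whether the view or the intermediate dataset is cheaper for your usage pattern.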
Another consideration about views:
When the information needed to subset the rows of the source only becomes available when the view is read (rather than when it is compiled), you might expect to read the view with a WHERE. If that is your situation, rather than build a data step view, try to create and use an SQL view instead, because that has a better chance of optimising the reads and joins of the underlying source tables when the view runs.
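Something like the following is what I have in mind. It is a sketch only, assuming SASHELP.CARS as the source and a hypothetical view name cars_all_v; whether the subsetting is actually pushed down to the source depends on the query and the engine.

```sas
/* The view itself does no subsetting */
proc sql;
    create view work.cars_all_v as
        select make, model, origin, msrp
        from sashelp.cars;
quit;

/* The subsetting condition is only known when the view is read */
proc print data=work.cars_all_v;
    where origin = 'Asia';
run;
```

With an SQL view, SAS has the opportunity to combine the WHERE with the stored query, which is the better chance of optimisation mentioned above.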
Of course there are many other issues that might demand a data step over SQL and vice versa. Where there is no such restriction, I would recommend using an SQL view in preference to a data step view (imho).
Thanks, guys!