Help using Base SAS procedures

Data step options in PROC SQL

Accepted Solution Solved
Reply
Contributor
Posts: 33
Accepted Solution

Data step options in PROC SQL

[ Edited ]

I noticed that I can use data step options inside a proc SQL for example 

 

create table countries2 as
select * from sql.countries(drop=UNDate);

Now in the case of the WHERE option I'm wondering what's the difference between this

 

 

 

PROC SQL;
	CREATE TABLE sql_where AS
		SELECT *
		FROM essentia.olympic
		WHERE year = 2012;
QUIT;

 and this

 

PROC SQL;
	CREATE TABLE sql_where (WHERE= (year=2012)) AS
		SELECT *
		FROM essentia.olympic;
QUIT;

the resulting sets are the same so I'm thinking the difference is in efficency. Which form is better?

 


Accepted Solutions
Solution
‎10-13-2016 06:05 AM
Super User
Super User
Posts: 7,942

Re: Data step options in PROC SQL

Functionally they may be the same, however I would recommend not using dataset options in SQL.  There are various reasons, the two are separate languages, so using one in the other would confuse those people familiar with SQL.  Also, the code no longer becomes portable - i.e. you cant copy th code out to a pure SQL processor.  Also, if you use in conjunction with an actual database you may find it takes longer to process or compile.  I can tell you straight off there is never a need to use dataset options in SQL.  So this question is really mute.  The only time I have seen it used usefully is when people use select *, which in itself is not recommended.  You should know what data you want to pull from the table{s}, select * is both sloppy lazy programming, and could cause issues with your code.

So simply put, there is no benefit to using these, its not portable, and sloppy coding should be avoided anyways.

View solution in original post


All Replies
Solution
‎10-13-2016 06:05 AM
Super User
Super User
Posts: 7,942

Re: Data step options in PROC SQL

Functionally they may be the same, however I would recommend not using dataset options in SQL.  There are various reasons, the two are separate languages, so using one in the other would confuse those people familiar with SQL.  Also, the code no longer becomes portable - i.e. you cant copy th code out to a pure SQL processor.  Also, if you use in conjunction with an actual database you may find it takes longer to process or compile.  I can tell you straight off there is never a need to use dataset options in SQL.  So this question is really mute.  The only time I have seen it used usefully is when people use select *, which in itself is not recommended.  You should know what data you want to pull from the table{s}, select * is both sloppy lazy programming, and could cause issues with your code.

So simply put, there is no benefit to using these, its not portable, and sloppy coding should be avoided anyways.

Contributor
Posts: 33

Re: Data step options in PROC SQL

Thanks for the answer, I never used myself (didn't even know it was possible) but then I ended up maintaining some code that used this WHERE as a data option and asked myself if it was better than the other way. It is not apparently, good to know. Thanks again.

Super Contributor
Posts: 266

Re: Data step options in PROC SQL

 

I need @RW9 input whether my understanding is correct that I have written down below. 

 

PROC SQL;
	CREATE TABLE sql_where AS
		SELECT *
		FROM essentia.olympic
		WHERE year = 2012;
QUIT;

Above step would restrict rows in PDV. Only those row would come in PDV that has year=2012. 

 

 

 

 

 

PROC SQL;
	CREATE TABLE sql_where (WHERE= (year=2012)) AS
		SELECT *
		FROM essentia.olympic;
QUIT;

The second approach would read every row in PDV irrespective of year =2012 and later on it will restrict while writing output dataset. Therefore this approach would definately take much longer time

 

Super User
Super User
Posts: 7,942

Re: Data step options in PROC SQL

That sounds about right to me, its a bit like putting dataset options on the set statment rather than on the data statement:

E.g.

data want;

  set have (keep= a b c);

  if d=1 ... this will fail as d is not read into PDV for processing

 

-

 

data want (keep=a b c);

  set have;

  if d=1 ... this will not fail as all variables are read into the PDV and only a b c are written out to the dataset.

 

Do note, I haven't tested or anything, just my best guess Smiley Happy

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 4 replies
  • 462 views
  • 2 likes
  • 3 in conversation