BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Crysis85
Obsidian | Level 7

I noticed that I can use data step options inside a proc SQL for example 

 

create table countries2 as
select * from sql.countries(drop=UNDate);

Now in the case of the WHERE option I'm wondering what's the difference between this

 

 

 

PROC SQL;
	CREATE TABLE sql_where AS
		SELECT *
		FROM essentia.olympic
		WHERE year = 2012;
QUIT;

 and this

 

PROC SQL;
	CREATE TABLE sql_where (WHERE= (year=2012)) AS
		SELECT *
		FROM essentia.olympic;
QUIT;

the resulting sets are the same so I'm thinking the difference is in efficency. Which form is better?

 

1 ACCEPTED SOLUTION

Accepted Solutions
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Functionally they may be the same, however I would recommend not using dataset options in SQL.  There are various reasons, the two are separate languages, so using one in the other would confuse those people familiar with SQL.  Also, the code no longer becomes portable - i.e. you cant copy th code out to a pure SQL processor.  Also, if you use in conjunction with an actual database you may find it takes longer to process or compile.  I can tell you straight off there is never a need to use dataset options in SQL.  So this question is really mute.  The only time I have seen it used usefully is when people use select *, which in itself is not recommended.  You should know what data you want to pull from the table{s}, select * is both sloppy lazy programming, and could cause issues with your code.

So simply put, there is no benefit to using these, its not portable, and sloppy coding should be avoided anyways.

View solution in original post

4 REPLIES 4
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Functionally they may be the same, however I would recommend not using dataset options in SQL.  There are various reasons, the two are separate languages, so using one in the other would confuse those people familiar with SQL.  Also, the code no longer becomes portable - i.e. you cant copy th code out to a pure SQL processor.  Also, if you use in conjunction with an actual database you may find it takes longer to process or compile.  I can tell you straight off there is never a need to use dataset options in SQL.  So this question is really mute.  The only time I have seen it used usefully is when people use select *, which in itself is not recommended.  You should know what data you want to pull from the table{s}, select * is both sloppy lazy programming, and could cause issues with your code.

So simply put, there is no benefit to using these, its not portable, and sloppy coding should be avoided anyways.

Crysis85
Obsidian | Level 7

Thanks for the answer, I never used myself (didn't even know it was possible) but then I ended up maintaining some code that used this WHERE as a data option and asked myself if it was better than the other way. It is not apparently, good to know. Thanks again.

RahulG
Barite | Level 11

 

I need @RW9 input whether my understanding is correct that I have written down below. 

 

PROC SQL;
	CREATE TABLE sql_where AS
		SELECT *
		FROM essentia.olympic
		WHERE year = 2012;
QUIT;

Above step would restrict rows in PDV. Only those row would come in PDV that has year=2012. 

 

 

 

 

 

PROC SQL;
	CREATE TABLE sql_where (WHERE= (year=2012)) AS
		SELECT *
		FROM essentia.olympic;
QUIT;

The second approach would read every row in PDV irrespective of year =2012 and later on it will restrict while writing output dataset. Therefore this approach would definately take much longer time

 

RW9
Diamond | Level 26 RW9
Diamond | Level 26

That sounds about right to me, its a bit like putting dataset options on the set statment rather than on the data statement:

E.g.

data want;

  set have (keep= a b c);

  if d=1 ... this will fail as d is not read into PDV for processing

 

-

 

data want (keep=a b c);

  set have;

  if d=1 ... this will not fail as all variables are read into the PDV and only a b c are written out to the dataset.

 

Do note, I haven't tested or anything, just my best guess Smiley Happy

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 1931 views
  • 2 likes
  • 3 in conversation