BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
FK1
Lapis Lazuli | Level 10 FK1
Lapis Lazuli | Level 10

Hello Everyone,

 

does using the "Keep"-Option make sense at all, when quering a Database-Table? Let's assume, I have a Table named "Table1" which has 100 attributes/variables. However, I am only interested in the first variable/attribute ("var1"). The query I, therefore, use is this:

 

proc sql;
select var1, count(*) as no_obs format = comma20.
from DB2schema.Table1 (keep= var1)
group by var1;
quit;

When using extended Messaging Options

options dbidirectexec ;
options sastrace=',,,ds' sastraceloc=saslog nostsuffix;
options sql_ip_trace = ALL  msglevel=i;

I get among other information this relevant sentence:

 

SAS_SQL: Cannot handle dataset options.
SAS_SQL: Unable to convert the query to a DBMS specific SQL statement due to an error.

This, to me, suggests, that it is pointless to use the "keep"-option, since the Database cannot handle it, right?

 

Cheers,

FK

1 ACCEPTED SOLUTION

Accepted Solutions
s_lassen
Meteorite | Level 14

No, in this case the KEEP= option does no good. When you set the data in a data step, you can save some memory and processing time by using the option on the input data, but SQL statements only read the variables that you tell it to. It does no harm either, though.

 

The KEEP= option may be used with SELECT * in SQL, so that you use the option to limit the variables you want, but I would advise against it in most cases, as explicit variable lists in SQL makes the code easier to read.

 

The reason you get the messages is because the dataset option is not applicable to the DB2 database, so when you use the option SAS cannot convert the SAS SQL call to native DB2 SQL. It is quite possible that you will get better performance without the KEEP= option, as SAS will then be able to let DB2 handle the whole query.

View solution in original post

3 REPLIES 3
s_lassen
Meteorite | Level 14

No, in this case the KEEP= option does no good. When you set the data in a data step, you can save some memory and processing time by using the option on the input data, but SQL statements only read the variables that you tell it to. It does no harm either, though.

 

The KEEP= option may be used with SELECT * in SQL, so that you use the option to limit the variables you want, but I would advise against it in most cases, as explicit variable lists in SQL makes the code easier to read.

 

The reason you get the messages is because the dataset option is not applicable to the DB2 database, so when you use the option SAS cannot convert the SAS SQL call to native DB2 SQL. It is quite possible that you will get better performance without the KEEP= option, as SAS will then be able to let DB2 handle the whole query.

FK1
Lapis Lazuli | Level 10 FK1
Lapis Lazuli | Level 10

Thank you, @s_lassen , for your reply! It is the nuances, that make the difference!

Tom
Super User Tom
Super User

Are you sure the message is accurate for the query you posted?   Perhaps the message only applied to any checking that SAS did while compiling the statement to see if VAR1 actually existed on dataset TABLE1.

 

What SQL do you see in the log that SAS actually pushed into the database?  Did it push the group by or pull all of the data and group the data on the SAS side?

 

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 876 views
  • 0 likes
  • 3 in conversation