BookmarkSubscribeRSS Feed
Kaushik2
Fluorite | Level 6

Hi Team,

 

Lets say I have source table of 10 million records.


My sql query / data step using defined libname statement has retrieved the result set with 1 million records.

 

I have imposed inobs=1, outobs=1 (proc sql) / obs=1 (data step) and so, my output dataset created has only 1 record.

 

When we enable trace log on a data step with obs=5 in set statement , but still see that internal select query seems created with no filters.

 

Data tgt.arrg_5 ;


ODBC_1: Prepared: on connection 0
SELECT * FROM MYDB.DWH_ARRG

set source.DWH_ARRG(obs=5);
run;

ODBC_1: Prepared: on connection 0
SELECT * FROM MYDB.DWH_ARRG

set source.DWH_ARRG(obs=5);
run;


ODBC_2: Executed: on connection 0
Prepared statement ODBC_1


My query:

Is the result set pulled into SAS environment ? Is it stored somewhere ?
I do not want entire result set to be pulled into SAS environment when I impose obs=1 / inobs=1
Kindly help to explain the processing at the background.

12 REPLIES 12
Kaushik2
Fluorite | Level 6

Below is the detail after enabling trace log :

 

1 options sastrace=',,,ds' sastraceloc=saslog nostsuffix;
2
3
4 libname tgt "/abc/xyz";
NOTE: Libref TGT was successfully assigned as follows:
Engine: V9
Physical Name: /abc/xyz
ODBC: AUTOCOMMIT is YES for connection 0
5 libname bq ODBC DSN="xxx" schema="TD";
NOTE: Libref BQ was successfully assigned as follows:
Engine: ODBC
Physical Name: GoogleBQ
6 Data tgt.arrg_5 ;


ODBC_1: Prepared: on connection 0
SELECT * FROM TD.DWH_ARRG

7 set bq.DWH_ARRG(obs=5);
8 run;
ODBC_2: Executed: on connection 0
Prepared statement ODBC_1

NOTE: There were 5 observations read from the data set BQ.DWH_ARRG.

VDD
Ammonite | Level 13 VDD
Ammonite | Level 13

don't play us stupid this can not have happened in SAS.

@Kaushik2 wrote as you can see @Kaushik2 line 6 and line 7 of the log have information included that don't happen is a data step.

6 Data tgt.arrg_5 ;


ODBC_1: Prepared: on connection 0
SELECT * FROM TD.DWH_ARRG

7 set bq.DWH_ARRG(obs=5);
Kurt_Bremser
Super User

The obs=5 (a pure SAS dataset option) is not handed over to the view that's behind the ODBC, so it reads everything and transmits it to SAS, where only 5 observations are taken.

My .02 $

Timmy2383
Lapis Lazuli | Level 10
Why would someone "play us stupid"? Seems like an unnecessary accusation considering they are looking for help.
@Kaushik2 provided the log output which, with the SASTRACE options they have turned on, shows the SQL that is passed to the DBMS by the SAS/ACCESS engine.

I believe what they are wanting to know is, why is the SQL being passed to the DBMS not showing something equivalent to "LIMIT 5" or "TOP 5", whatever the DBMS equivalent is for OBS=5
Reeza
Super User
I agree that this person isn't likely 'playing stupid', but the code also doesn't make sense nor is the question clear.

1. You cannot mix SQL and data step statements the way you are. Is this your actual code? What are you expecting to occur?

2. What are you trying to do?

3. Or what are you trying to understand?
Tom
Super User Tom
Super User

Can you tell from the performance of the query whether SAS pulled over all of the records and then subset to just the first 5?

 

It is possible that the traces that SAS shows do not include commands that would limit the number of records returned from a query.

Kaushik2
Fluorite | Level 6

log.JPG

@Kurt_Bremser @Tom @Reeza @Timmy2383 @VDD 

 

  • Trying to create a subset from external database source (like DB2, GoogleBQ)

 

  • My data step is as below. No mixing done. We see SQL lines  because the tracelog is enabled. Please refer screenshot for details.

 

                    Data tgt.arrg_5 ; set bq.DWH_ARRG(obs=5); run; 

 

 

  • Yes, my query is why SQL displayed in the trace log does not show "LIMIT 5" or "TOP 5", whatever the DBMS equivalent is for OBS=5. My screenshot contains similar trace log both for OBS=5 and without it. How to explain to Non-SAS people that why trace log does not show the difference.

 

 

  • Also, how to explain to Non-SAS people on the below statement as the trace log does not depict it. We are dealing with source tables with huge number of records.

"It is possible that the traces that SAS shows do not include commands that would limit the number of records returned from a    query."

 

  • Need help to prove that the intermediate result set (say 50 records) from the submitted query on DB source table( say having 100 records) does not contain “huge” number of records being pulled into SAS Environment and that only 5 observations are being “output” in SAS dataset.

 

Kurt_Bremser
Super User

As a pure technical option, you could monitor network traffic between the SAS and DBMS servers to see how much data is passed.

And it should be possible to log the queries on the DBMS side.

And I would submit this whole issue to SAS technical support; you might have found something that is not really a bug, but suboptimal at least.

 

PS is DWH_ARRG a table or a view in the DBMS?

Kaushik2
Fluorite | Level 6

@Kurt_Bremser    DWH_ARRG is a table in DBMS.

Kurt_Bremser
Super User

If you can verify that the DBMS reads the whole table (and from your previous post I guess we can assume that), then this is (IMO) a problem for SAS Technical Support. It's not a big thing (as the end result will be correct), but it's a waste of time and resources with large DBMS tables.

JackHamilton
Lapis Lazuli | Level 10
It doesn't have to pass a restriction to the database; all it has to do is to stop when it has enough records. How many records it brings back and then throws away is likely to depend on buffer sizes and other options.
My experience with Oracle is that it does not read the entire data set when you have specified OBS=.
Something I would like to know is how to discover how many records were transferred through SAS/Access. If you are reading a partitioned table and specify SASTRACE=',,t,' you will get a display in the log showing the number of records processed by each thread. But not if it's a non-partitioned table.


suga badge.PNGThe SAS Users Group for Administrators (SUGA) is open to all SAS administrators and architects who install, update, manage or maintain a SAS deployment. 

Join SUGA 

Get Started with SAS Information Catalog in SAS Viya

SAS technical trainer Erin Winters shows you how to explore assets, create new data discovery agents, schedule data discovery agents, and much more.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 12 replies
  • 1904 views
  • 2 likes
  • 7 in conversation