Hi @Kurt_Bremser, Do you know how to circumvent this error?
ERROR: The POINT= data set option is not valid for the data set MYORA.AUD_BIWESTIM, the data set must be randomly not sequentially accessible for POINT= processing.
Its your solution applied to an oracle load.
https://communities.sas.com/t5/forums/replypage/board-id/programming/message-id/217421
LIBNAME myora ORACLE PATH=XXXXX.vwfsag.de USER=SAS PASSWORD='XXXXXXXX';
data MKT.aki(promote=no);
set myora.AUD_BIWESTIM(obs=100000);
run;
data MKT.aki(promote=no);
do pt = nobs - 10000 to nobs;
set myora.AUD_BIWESTIM nobs=nobs point=pt;
output;
end;
stop;
run;
Where does the extra semi-colon come from. It's not in the code as posted.
And because I "stumbled" over the code while doing something else here the logic to derive a DB schema name from a SAS library definition (code for SAS libref myora as per your sample code).
proc sql noprint;
select sysvalue into :ora_schema trimmed
from dictionary.libnames
where
libname="%upcase(myora)"
and sysname='Schema/Owner'
;
quit;
proc sql;
connect using myora;
create table want as
select * from connection to myora
(
SELECT *
FROM &ora_schema..fact_ope_form
limit 10000;
)
;
disconnect from myora;
quit;
IF you've got already a SAS library defined that points to your Oracle schema then I'd be using the connect using syntax as it avoids passing explicit connection info in code - especially credentials. If there isn't such a pre-assigned SAS library then ideally use an authentication domain instead of explicit credentials.
If developing explicit pass-through SQL I normally use a two step approach:
1. First develop the DB specific pass-through SQL syntax using a DB client (i.e. SQL Developer).
2. Once DB SQL fully working "wrap" SAS around it.
This error normally indicates that the input data set has one or more of these options specified when it was created, as either a data set option or a system option:
COMPRESS=YES, POINTOBS=NO or REUSE=YES.
You cannot use the POINT= on a compressed data set.
@russt_sas wrote:
This error normally indicates that the input data set has one or more of these options specified when it was created, as either a data set option or a system option:
COMPRESS=YES, POINTOBS=NO or REUSE=YES.
You cannot use the POINT= on a compressed data set.
Since the question seems to be about access ORACLE tables this comment does not really apply.
But I wanted to make sure to let you know that POINT= works fine with compressed SAS datasets (at least for the last 20 years or so).
1992 data class(compress=yes); 1993 set sashelp.class; 1994 run; NOTE: There were 19 observations read from the data set SASHELP.CLASS. NOTE: The data set WORK.CLASS has 19 observations and 5 variables. NOTE: Compressing data set WORK.CLASS increased size by 100.00 percent. Compressed is 2 pages; un-compressed would require 1 pages. NOTE: DATA statement used (Total process time): real time 0.00 seconds cpu time 0.01 seconds 1995 1996 data want; 1997 do point=1,5,3,7 ; 1998 set class point=point; 1999 put point= name= ; 2000 output; 2001 end; 2002 stop; 2003 run; point=1 Name=Alfred point=5 Name=Henry point=3 Name=Barbara point=7 Name=Jane NOTE: The data set WORK.WANT has 4 observations and 5 variables. NOTE: DATA statement used (Total process time): real time 0.00 seconds cpu time 0.00 seconds
See if you can do the selection in Oracle with Oracle means.
Otherwise, since you can't directly address a single observation (POINT=), you will have to pull the whole table to SAS and do the subset there:
data want;
set oracle.have nobs=nobs;
if _n_ > nobs - 10000;
run;
Thanks @Kurt_Bremser , but no success.
84 data MKT.miki(promote=no) ; 85 set myora.FACT_OPE_FORM nobs=nobs; 86 if _n_ > nobs - 10000; 87 _fecformo=datepart(fecformo); 88 run; NOTE: There were 1630486 observations read from the data set MYORA.FACT_OPE_FORM. NOTE: The data set MKT.MIKI has 0 observations and 191 variables. NOTE: DATA statement used (Total process time): real time 40.79 seconds cpu time 39.81 seconds
@acordes - Please be aware that reading rows from an Oracle table won't necessarily be in the order they were added. You have to define what order you want your result set in. See my other post.
It seems that while the NOBS= option is available, it returns an incorrect (too large) number. Retrieve the correct number from the Oracle equivalent of DICTIONARY.TABLES (use explicit pass-through).
You may also have an issue where the table size is reduced (records deleted) on the Oracle side while the data step runs.
What's your definition of "last obs from Oracle dataset"? To be able to read the most recently added rows from the Oracle table you will need row update date as a column in the table. If there isn't one you will be stuck. Another possible solution might be available if the table contains a sequential ID or row count.
What best works for the moment relies on certain knowledge of the table putting me in the shoes to apply an order criteria.
But that's not always the case. I use the loading for data exploration and pre-processing.
Some tables have 20 million rows and I don't know their context.
Reading all these rows to viya takes a long time.
So I suppose they are ordered already in some regard, like first-in/last-out meaning that the last observations are the most recent ones.
It's about portfolio of contracts and I'm much more interested in analyzing the last 5% of the whole table or let it be 100.000 rows.
1 %studio_hide_wrapper; 82 data MKT.miki(promote=no) ; 83 set myora.FACT_OPE_FORM (obs=10000); 84 by descending fecformo; 85 _fecformo=datepart(fecformo); 86 run; NOTE: There were 10000 observations read from the data set MYORA.FACT_OPE_FORM. NOTE: The data set MKT.MIKI has 10000 observations and 191 variables. NOTE: DATA statement used (Total process time): real time 13.73 seconds cpu time 0.39 seconds 87 88 proc summary data=mkt.miki print n; 89 class _fecformo; 90 format _fecformo year.; 91 var fk_obj; 92 run; NOTE: The CAS aggregation.aggregate action will be used to perform the initial summarization. NOTE: The PROCEDURE SUMMARY printed page 2. NOTE: PROCEDURE SUMMARY used (Total process time): real time 0.06 seconds cpu time 0.03 seconds 93 94 %studio_hide_wrapper; 105 106
You could in that case use a WHERE clause on FECFORMO to get the last few weeks or months of data.
Hello @acordes
The message is clear that oracle table is not available for random access.
I suggest that you try what @Kurt_Bremser has suggested with a minor change.
This is because it appears that your intention is to read observations after the 10000th observation.
data want;
set oracle.have nobs=nobs;
if _n_ > 10000;
run;
@Sajid01 @Kurt_Bremser I share with you the time it takes doing different approaches.
The by descending performs much faster.
However the pre-requisite is that you know a sorting criteria that makes sense.
In some case this is not given, therefore for me a valid choice is to sort by descending _all_ assuming that the oracle table has more recent observations at the end.
The summary screenshot has more weight on recent years of contract signing date and somehow (in this case) mimics a random selection. That's not a very scientific approach, but for my case at hand it's acceptable as I'm using all the oracle tables for cas actions in the context of sample data, cardinality, robust univariate statistics, outliers, distinct count, ...
I've have tried as well using fedsql for accessing the oracle table, but it takes longer.
Thanks to all for your support.
As you already found out Oracle tables are not sequential. Data is stored arbitrary and there is no concept of first row or last row in a table. If you don't request a result set to be returned in a specific order (order by ...) then the row order returned can vary between executions for the exactly same code.
The bottleneck is the data transfer between Oracle and SAS. If you've got an Oracle table with 10 million rows but only need 10000 on the SAS side then you should reduce the data volumes on the Oracle side and only transfer what you need to the SAS side.
If the only thing you care about is the number of rows you get on the SAS side then using explicit pass-through SQL with the LIMIT clause will likely outperform any other approach.
proc sql;
connect using myora;
create table want as
select * from connection to myora
(
SELECT *
FROM <schema name>.fact_ope_form
limit 10000;
)
;
disconnect from myora;
quit;
If you don't know the Oracle schema name then one way to get it easily is to execute below which will print the library definition with the schema name to the SAS log.
libname myora list;
You could of course also query the SAS dictionary tables to retrieve the schema name dynamically.
....and in case this needs to execute in CAS then look-up how explicit pass-through SQL works for CAS. You could certainly not use Proc SQL. Proc FedSQL should work or there are eventually also some CAS actions sets for this.
What makes me think you're using SAS Viya CAS is your usage of a by statement without a Proc Sort.
If that's what you're doing then add the following options right before your data step.
OPTIONS SASTRACE=',,,d' SASTRACELOC=SASLOG NOSTSUFFIX;
These options will print into the SAS log the SQL that SAS generates and sends to Oracle for execution. If you're lucky then CAS uses the data set OBS= option and translates it into a Oracle SQL limit clause. If that's the case then just use the data step you've got without a by statement.
Yes, I'm using CAS.
I a different project I use the proc sql to insert new rows into an existing oracle table. So I made it work.
OPTIONS SASTRACE=',,,d' SASTRACELOC=SASLOG NOSTSUFFIX;
proc sql;
connect to oracle as dbcon
(user=SAS password='xxxxxxx' buffsize=100
PATH=xxxxx.vwfsag.de);
create table PUBLIC.FACT_OPE_FORM as select *
from connection to dbcon
(select *
from SAS.FACT_OPE_FORM
limit 10000;
)
;
disconnect from myora;
quit;
Running your code gives an error:
ORACLE_2: Prepared: on connection 1 select FECFORMO from SAS.FACT_OPE_FORM; limit 10000; ERROR: ORACLE prepare error: ORA-00933: SQL command not properly ended. SQL statement: select FECFORMO from SAS.FACT_OPE_FORM; limit 10000;. NOTE: PROC SQL set option NOEXEC and will continue to check the syntax of statements. 94 disconnect from myora; NOTE: Statement not executed due to NOEXEC option. 95 quit;
@acordes wrote:
Yes, I'm using CAS.
I a different project I use the proc sql to insert new rows into an existing oracle table. So I made it work.
OPTIONS SASTRACE=',,,d' SASTRACELOC=SASLOG NOSTSUFFIX; proc sql; connect to oracle as dbcon (user=SAS password='xxxxxxx' buffsize=100 PATH=xxxxx.vwfsag.de); create table PUBLIC.FACT_OPE_FORM as select * from connection to dbcon (select * from SAS.FACT_OPE_FORM limit 10000; ) ; disconnect from myora; quit;
Running your code gives an error:
ORACLE_2: Prepared: on connection 1 select FECFORMO from SAS.FACT_OPE_FORM; limit 10000; ERROR: ORACLE prepare error: ORA-00933: SQL command not properly ended. SQL statement: select FECFORMO from SAS.FACT_OPE_FORM; limit 10000;. NOTE: PROC SQL set option NOEXEC and will continue to check the syntax of statements. 94 disconnect from myora; NOTE: Statement not executed due to NOEXEC option. 95 quit;
Did you add the extra semi-colon? Or did VIYA insert it?
Shouldn't you use PROC FEDSQL with VIYA instead of PROC SQL?
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.