BookmarkSubscribeRSS Feed

Hi,

 

the idea is to allow for the SET statement to use observation's number(curobs) in the where clause and where=() dataset option. So the following query on the data set's engine level would be possible:

 

data out1;
  set sashelp.class curobs=co;
  where age>12 and co<10;
run;

data out2;
  set sashelp.class(curobs=co where=(age>12 and co<10));
run;

with expected output equivalent to:

data out3;
  set sashelp.class curobs=co;
  if age>12 and co<10;
  put (_ALL_) (=);
run;

The key idea is that it would be possible on data engine level, so before PDV.

 

All the best

Bart

 

 

7 Comments
ballardw
Super User

You might provide a more complicated example as what you are showing as a desired output can be accomplished with the OBS=9 option such as

 

data work.out1;
  set sashelp.class (obs=9) ;
  where age>12 ;
run;
yabwon
Onyx | Level 15

Hi @ballardw,

 

Than you for your comment 🙂

 

Thing is that according to the SAS doc ( https://documentation.sas.com/?docsetId=lrcon&docsetTarget=n1iok7aawea9zzn1aso4ulosip7l.htm&docsetVe...

and SAS behaviour the WHERE clause has precedence over OBS=.

 

When used with a WHERE expression, the values specified for OBS= and FIRSTOBS= are not the physical observation number in the data set, but a logical number in the subset. For example, obs=3 does not mean the third observation number in the data set. Instead, it means the third observation in the subset of data selected by the WHERE expression.

 

For example the code:

data test;
  do i = 1 to 10;
    output;
  end;
run;

data test2;
  set test (obs=3);
  where i > 5;
run;

will result with 6,7,8 in test2 since it first ececute teh WHERE and then apply OBS= "restriction". And the idea of "curobs in where" would allow to "restrict" observations while WHERE is executed, so the following code:

data out2;
  set test(curobs=co where=(i>5 and co<3));
run;

would return zero obs in out2.

 

All the best

Bart

 

Tom
Super User
Super User

What do you mean by current obs?  How is that to be calculated? What about datsets where some observations have already been deleted?  

yabwon
Onyx | Level 15

Hi @Tom ,

 

Thanks for your comment 🙂

 

I would like to have this new behaviour to be in line with CUROBS=

 

I would expect that the feature will honour deleted observations as in this code:

 

data test;
  do co = 1 to 10;
    x = ranuni(123);
    output;
  end;
run;

data test;
  modify test;
  if co = 5 then remove;
run;

data _null_;
  set test nobs=nobs curobs=curobs;
  put _ALL_;
run;

 the last datastep honours deleted observations.

 

 

I would expect that following code (executed on above TEST dataset):

data test2;
  set test nobs=nobs curobs=co;
  where co in (4,5,6);
  put _ALL_;
run;

would result:

nobs=10 co=4 x=0.9060333813 _ERROR_=0 _N_=1
nobs=10 co=6 x=0.2211140195 _ERROR_=0 _N_=2

All the best

Bart

 

 

 

ChrisNZ
Tourmaline | Level 20

See a discussion in

https://communities.sas.com/t5/SASware-Ballot-Ideas/Remove-the-warning-for-the-PROC-SQL-outobs-optio...

about selecting observation numbers before/after a WHERE clause.

It seems to me that you are asking for

firstobs and obs (these exist, but there should be a lastobs alias)  

and

startobs and endobs (these exist for SPDE)

data set options.

Which is entirely reasonable.

yabwon
Onyx | Level 15

Hi @ChrisNZ ,

 

Thanks for pointing out the sql discussion.

 

startobs and endobs in the BASE engine would be one of solutions which do the job, like in SPDE below:

libname growth SPDE "%sysfunc(pathname(work))";

data growth.teens;
   input Name $ Sex $ Age Height Weight;
   list;
datalines;
Alfred M 14 69.0 112.5
Carol F 14 62.8 102.5
James M 13 57.3 83.0
Janet F 15 62.5 112.5
Judy F 14 64.3 90.0
Philip M 16 72.0 150.0
Zeke M 14 71.1 105.1
Alice F 14 65.1 91.0
William M 15 66.5 112.0
Mike M 16 67.0 105.1
;

data _test_;
  if 0 then set growth.teens(keep=age);
  condition = "age = 14";
  obsnum = "5";
  declare hash H(dataset:"growth.teens(startobs=" || 
                                           obsnum || 
                                       " endobs=" || 
                                           obsnum || 
                                       " where=(" || condition || "))");
  H.defineKey("age");
  H.defineDone();

  n = H.NUM_ITEMS;
  put n=;
run;

but it is not he only way. Also an approach in which curobs= would be working in cooperation with where/where= on a data engine level would be nice one i.e. to be able to use curobs=co in the where/where= condition like in one of examples above.

 

All the best

Bart

 

 

ChrisNZ
Tourmaline | Level 20

So many ways to swing a cat... 🙂

We'll see how many of these proposals will be considered.