BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
CurtisMackWSIPP
Lapis Lazuli | Level 10

Hi all,

Right now I think this could be a bug.  However, it could be me not understanding type conversion in the FedSQL system or limitation of the RAND function.  Please consider this code.  I have condensed this down from something more complicated, so there is no obvious use case for this example:

 

proc ds2;
   package mytools / overwrite =yes;

    method mymethod(Double Value,Double rand) returns double;
      declare double returnVal;
      returnVal = Value;
      if (rand > .5) then returnVal = 0;
      return (returnVal);
    end;
  run;
run;
quit;

data have(drop=y);
  do a = 1 to 50; 
    r = rand('UNIFORM'); 
    x = (a / 5000);
    output; 
  end;
  x = 0;
  do y = 1 to 5; 
    r = rand('UNIFORM'); 
    output; 
  end;
run;

proc fedsql;
  drop table want_rand force;
  drop table want_fixed force;

  create table want_rand as
  select * from
  (select x,r,mytools.mymethod(x,rand('UNIFORM')) as t
     from have) as w
   where t > 0;

  create table want_fixed as
  select * from
  (select x,r,mytools.mymethod(x,r) as t
     from have) as w
   where t > 0;
quit;

data want_rand;
  set want_rand;
  format t 10.7;
run;

What I can't explain is the rows where t = 0 in the want_rand table.  The query is supposed to only select values where it is > 0.  The want_fixed table does not have these t = 0 rows even though the only difference between these queries is passing the value of the variable "r" which is predefined as opposed to passing a randomly generated number directly.  The method should be setting about 50% of the values to 0.  Both of the queries returns about the same amount of rows, even though the want_rand table has several 0 values for t.  I interpret this to mean that the query is subsetting correctly, but the conversion from FedSql output to a SAS dataset is rounding them to 0 after the fact.  However there seems to be no correlation to the size of x and those with t=0, so rounding due to type conversion doesn't explain it very well.  Notice that the method does not use the value of "rand" to do any calculation, it is only used as a condition.  So how would the different values passed change the datatype or formatting of the output variable at all?  The final data set id just to assign a more detailed numeric format to "t" and rule out this being a display format issue.

 

I tried moving the call to RAND inside the DS2 method and got the same result as when passing it as a parameter.  So could this have something to do with calling RAND in the FedSQL environment?  The same exact logic inside a FCMP function called in PROC SQL works as expected.

 

So, like I said, I'm about to submit this to tech support.  I am hoping one of you will educate me as to why this is expected behavior.

 

Thanks!

1 ACCEPTED SOLUTION

Accepted Solutions
CurtisMackWSIPP
Lapis Lazuli | Level 10

Tech support has found that this bug is limited in scope and there is a work around (generating the random value outside of the FedSQL query).  They have therefore decided "the fix to be included in a future release cycle".

View solution in original post

9 REPLIES 9
ballardw
Super User

Try using a format that shows more decimals.

 

When you use a format that specifies the number of decimals such as

 format t 10.7;

then any value that has zeroes to 7 places gets rounded and if the 8th decimal place is not 5 or greater you see 0 as a result.

Run this an look in the log for the result. Different format, different appearance.

data example;
    x= 0.000000009;
    put x=10.7;
    put x=10.8;
    put x=11.9;
    put x=best16.;
run;

Moral of the story: if you a concerned about a specific level of precision then either control the values with explicit rounding or use a format long enough to display the digits of interest. Also keep in mind the precision stored in the variables. As soon as you move to the data step you are back to 8 bit precision.

CurtisMackWSIPP
Lapis Lazuli | Level 10

Where would I put that format in FedSQL?  

CurtisMackWSIPP
Lapis Lazuli | Level 10

And the method works fine if a value is passed from a variable instead of directly from the RAND function.  If this were a truncation issue, it would be the opposite.  If I put the RAND function directly into the method the same issue arises. 

CurtisMackWSIPP
Lapis Lazuli | Level 10

To further rule out this being a formatting, rounding, or type conversion issue, consider the following code where the value is converted to a string using the 10.7 format before it is returned.  The query still returns rows where the value of "t" = '0' even though the query says "where t != '0'" when the value of RAND is passes but not when the fixed value is passed.

 

proc ds2;
   package mytools / overwrite =yes;

    method mymethod(Double Value,Double rnd) returns char(32);
      declare double returnVal r z;
      declare char(32) returnStr;
      returnVal = Value;
      if (rnd > .5) then do;
        returnVal = 0;
      end;
      returnStr = put(returnVal,10.7);
      return (returnVal);
    end;
  run;
run;
quit;

data have(drop=y);
  do a = 1 to 50; 
    r = rand('UNIFORM'); 
    x = (a / 5000);
    output; 
  end;
  x = 0;
  do y = 1 to 5; 
    r = rand('UNIFORM'); 
    output; 
  end;
run;

proc fedsql;
  drop table want_rand force;
  drop table want_fixed force;

  create table want_rand as
  select * from
  (select x,r,mytools.mymethod(x,rand('UNIFORM')) as t
     from have) as w
   where t != '0';

  create table want_fixed as
  select * from
  (select x,r,mytools.mymethod(x,r) as t
     from have) as w
   where t != '0';
quit;

Tom
Super User Tom
Super User

Since your function is not returning the string variable its value will just disappear when the function call ends.

      returnStr = put(returnVal,10.7);
      return (returnStr);
CurtisMackWSIPP
Lapis Lazuli | Level 10

Good catch, I should have seen that.  It does not address my original problem however.  Here is a new version of the code that uses whole numbers so formatting and rounding are not as likely to be the culprits.  The WANT_RAND table still has the unexplained values of 0.

 

proc ds2;
   package mytools / overwrite =yes;

    method mymethod(Double Value,Double rand) returns double;
      declare double returnVal;
      returnVal = Value;
      if (rand > .5) then returnVal = 0;
      return (returnVal);
    end;
  run;
run;
quit;

data have(drop=y);
  do a = 1 to 50; 
    r = rand('UNIFORM'); 
    x = a ;
    output; 
  end;
  x = 0;
  do y = 1 to 5; 
    r = rand('UNIFORM'); 
    output; 
  end;
run;

proc fedsql;
  drop table want_rand force;
  drop table want_fixed force;

  create table want_rand as
  select * from
  (select x,r,mytools.mymethod(x,rand('UNIFORM')) as t
     from have) as w
   where t > 0;

  create table want_fixed as
  select * from
  (select x,r,mytools.mymethod(x,r) as t
     from have) as w
   where t > 0;
quit;
CurtisMackWSIPP
Lapis Lazuli | Level 10

I'm following up on my own thread for documentation purposes.

 

I submited this to SAS tech support.  After some back and forth, they have come to the conclusion that this is likely a bug.

 

This is very concerning given that no errors are generated and the results are only partially incorrect.  I could have easily missed this behavior if I hadn’t been subsetting for rows where the value was > 0.  If I hadn’t been, I may not have noticed that too many values were being set to 0. 

 

In testing this they told me about the importance of including the IPTRACE option on any FedSQL calls.  This function gives more information on how the query performs.  I will report errors that aren't being propagated to the log.  When using this option, any call to a DS2 method from a FedSQL query returns the error "ERROR: [42000] Syntax error or access violation".  This error happened even in the very simple method below.

 

proc ds2;
   package mytools / overwrite =yes;
    method mymethod(Double Value) returns double;
      return (Value);
    end;
  run;
run;
quit;

 

CurtisMackWSIPP
Lapis Lazuli | Level 10

An update from tech support.

 

... there is a bug how PROC FedSQL is handling the DS2 code which is causing you to see output that is incorrect (T=0).   R&D is looking into it ...

CurtisMackWSIPP
Lapis Lazuli | Level 10

Tech support has found that this bug is limited in scope and there is a work around (generating the random value outside of the FedSQL query).  They have therefore decided "the fix to be included in a future release cycle".

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 839 views
  • 0 likes
  • 3 in conversation