Solved: use Array in a proc fcmp function

rhapsody · Posted 02-06-2019 07:38 AM

I have the table Bhat which has the variable theta according to

theta1  theta2  theta3
0.2     0.5     0
0.3     0.6     0.023
0.5     0.1     0.05
0.7     0.2     0.01

I want to use a function which knows it's an Array and takes the dimensions of the Array automatically into account. Below is what I've done so far but it doesn't work. The function I've done is called "add". Any ideas?

proc fcmp outlib=work.myfuncs.dates;
   function add(theta);
      trans1 = theta1;
      do i=1 to dim(theta)-1;
         trans(i+1) = theta(i+1) - theta(i);
      end;
      return(trans);
   endsub;
run; quit;

/* set search path for custom functions */
options cmplib=work.myfuncs;

/* test with data set of values */
data test;
   set Bhat;
   /*array theta(*) theta1-theta2;*/
   result=add(theta);
run;

KachiM · Posted 02-14-2019 03:49 AM

Hi Rhapsody

My mistake: I did not use %sysfunc(). It must be:

%let rc = %sysfunc(array2array(have, want));

This is equivalent to your Y = f(X). Here I used rc = f(X,Y).
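
The definition of array2array is not shown in this thread. As a rough sketch only, assuming it follows the same read_array/write_array pattern as the dynamic_add function elsewhere in the thread, a function of that shape might look like the following. The library work.sketch, the argument names inds and outds, and the adjacent-column sum are all assumptions made for illustration.

```sas
/* Sketch under assumptions: the real array2array is not shown in the thread. */
/* work.sketch, inds and outds are hypothetical names.                        */
proc fcmp outlib = work.sketch.arr;
   function array2array(inds $, outds $);
      array x[1] / nosymbols;            /* resized by read_array            */
      array y[1] / nosymbols;            /* resized by dynamic_array below   */

      rc = read_array(inds, x);          /* load the input data set into x   */
      if rc ne 0 then return(rc);        /* give up if the read failed       */

      rows = dim1(x);
      cols = dim2(x);
      call dynamic_array(y, rows, cols); /* make y the same shape as x       */

      do i = 1 to rows;
         y[i,1] = x[i,1];                /* first column is copied           */
         do j = 2 to cols;
            y[i,j] = x[i,j] + x[i,j-1];  /* sum of adjacent columns          */
         end;
      end;

      rc = write_array(outds, y);        /* save y as the output data set    */
      return(rc);
   endsub;
run;

options cmplib = work.sketch;

/* Character arguments are passed to %sysfunc without quotes */
%let rc = %sysfunc(array2array(have, want));
```

The return value only reports success or failure; the "output array" Y is the data set named in the second argument.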


RW9 · Posted 02-06-2019 08:04 AM

Why would you want to use fcmp for that? To be honest, I have never seen a single scenario where fcmp is useful, it only ever adds obfuscation both to the code, and to any code you create (macro can at least be highlighted as macro for instance). I would highly advise to do this in a datastep and avoid catalogs/compiled obfuscated and hidden code completely.

rhapsody · Posted 02-06-2019 08:24 AM

I'm using SAS base 9.2

RW9 · Posted 02-06-2019 08:37 AM

Not sure how that is relevant?

ErikLund_Jensen · Posted 02-06-2019 02:22 PM

Hi @RW9

I think you are a little unjust to fcmp. It is very useful in a SAS data warehouse where all development is done in DI Studio.

Best practice in DI Studio is to use standard transformations and avoid user-written code as much as possible, which makes data steps the last resort. As a result, a typical DI Studio job contains many proc sql extracts and joins, where recoding is limited to SAS functions and case constructs stored deep inside the metadata server. But because fcmp functions work like built-in functions, they can also be used in proc sql, where they do things that would otherwise be impossible, such as transforming UTM coordinates to longitude/latitude.
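
As a minimal sketch of that point (the function c2f, the library work.demo and the table readings are hypothetical names, not taken from any real warehouse), a function compiled with PROC FCMP can be called from PROC SQL once CMPLIB points at its library:

```sas
/* Hypothetical example: c2f, work.demo and readings are illustrative only */
proc fcmp outlib = work.demo.conv;
   function c2f(celsius);
      /* Convert degrees Celsius to degrees Fahrenheit */
      return (celsius * 9 / 5 + 32);
   endsub;
run;

/* Tell SAS where to look for compiled functions */
options cmplib = work.demo;

data readings;
   input station $ celsius;
   datalines;
A 0
B 100
;
run;

/* The user-written function is called like a built-in one */
proc sql;
   select station, c2f(celsius) as fahrenheit
   from readings;
quit;
```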

And with 20 developers working in different departments and about 10000 extracts and joins in the daily batch, a centrally maintained and documented fcmp catalog gives much less obfuscation than all developers trying to solve problems with overly complicated case constructs and scores of unnecessary steps. So in this scenario the use of a few fcmp functions is a big help.

RW9 · Posted 02-07-2019 04:34 AM

Never used DIS, so can't comment on that. What I can say is two things:

1) "a centrally maintained and documented" - whilst I have often seen central macro libraries and I agree that is definitely a plus point, documentation is nothing more than a ghost of a whisper in my experience. Even the macro and variable naming is non-informative. At one of the big conferences we should ask for a show of hands who has documentation, coding standards, common object interface agreements in place...

2) I would have thought the recent migration from 32-64bit would have shown the real dangers surrounding proprietary file formats. To my knowledge there is still no upgrade path for a catalog (formats/macros, compiled functions etc.) from 32bit to 64bit - and as outsourced work is considered third party owned, we only get catalogs. Hence anything before the switch is now unusable. Hence I can never recommend catalogs for any purpose, only ever use plain open text which is openable in pretty much anything and cross portable. Sure if you have the source code you can go back and recompile, if you have the source code...

The reason I dislike fcmp more than compiled macros is that they don't even jump out as being non-standard SAS for example:

data want;
   set have;
   a=catx("x",d,e);
   b=cats("x",d,e);
   c=catr("x",d,e);
   d=catq("x",d,e);
run;

One of those is a user fcmp function. How do you easily pick it out? And note this is without the usual all upper case, everything on one line with no indentation, and surrounded by masses of macro code.

</rant>

ErikLund_Jensen · Posted 02-08-2019 05:25 AM

Hi @RW9

Thanks for your comments.

It never occurred to me that fcmp functions could be difficult to pick out, because I know them by heart. We have named our fcmp functions after the type of conversion they do, such as utm2lat or ssn2bd, and the digit in the middle of the name makes them recognizable. But that is sheer luck and works only if you know the rules, so you certainly have a point there.

As to the catalog problem, we don't use catalogs for anything else than fcmp and formats in our production environment, and catalogs are read-only to users. The only way to get a catalog entry (format or fcmp) promoted to production is in the form of a program, that creates the entry, and we have jobs to run the programs and recreate the full content if necessary.

KachiM · Posted 02-06-2019 09:05 AM

Show your output Data Set as a result of your calculation as I am unable to visualize it.

regards

DATAsp

rhapsody · Posted 02-06-2019 09:11 AM

trans1  trans2  trans3
0.2     0.7     0.5
0.3     0.9     0.623
0.5     0.6     0.15
0.7     0.9     0.21

KachiM · Posted 02-06-2019 11:42 AM

rhapsody

The simplest solution is to use a Data Step to get what you want.

You seem interested in understanding FCMP functionality, so let me
pass on what little I know about it. An FCMP function or
subroutine consists of Data Step statements. A few matrix functions
belong exclusively to the FCMP environment. One big advantage is that we can hold
a Data Set in memory as a two-dimensional array (matrix). Data
staying in memory reduces I/O cost. Further down I give a function
that reads a Data Set, holds it in memory, and manipulates its cells.
It then writes out the resulting array as a Data Set.

For SAS programmers: as long as you keep the source code of the function in a safe place,
you need not worry about how it is saved as compiled code. There
is no secrecy or threat. It only requires some familiarity.

Your output shows each theta column added to the column before it.

The simplest way is to pass a _temporary_ array from the Data Step
holding the theta values, plus a second, empty array to receive the
sums computed in FCMP. I copied the theta: values into t[] and created an empty trans[]
to pass to a subroutine.

First, the subroutine ADD is written to receive two arrays, t[] and
tr[]. The sums are computed from t[] and assigned to tr[].
OUTARGS names the arguments that the subroutine updates, so the changes
made inside FCMP are visible in the Data Step.

Before calling the subroutine in the Data Step, tell SAS
where the compiled subroutine is stored:

options cmplib = work.func;

In the Data Step, read the theta: variables and copy them into the _temporary_ array
t[]. Create trans[] as a second _temporary_ array.

Then call the subroutine ADD, passing the two arrays.

Finally, copy the values from trans[] into the tran[] array, whose variables are written out to WANT.

data have;
input theta1-theta3;
datalines;
0.2  0.5  0
0.3  0.6  0.023
0.5  0.1  0.05
0.7  0.2  0.01 
;
run;

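proc fcmp outlib = work.func.arr;
   subroutine add(t[*], tr[*]);
      outargs t, tr;               /* arguments the subroutine may update */
      tr[1] = t[1];                /* first element is copied as is */
      do i = 2 to dim(t);
         tr[i] = t[i-1] + t[i];    /* sum of neighbouring elements */
      end;
   endsub;
quit;

options cmplib = work.func;

data want;
   array t[3] _temporary_;         /* input values passed to ADD */
   array trans[3] _temporary_;     /* results filled in by ADD */
   
   set have;
   array th theta1-theta3;
   do i = 1 to dim(t);
      t[i] = th[i];                /* copy theta1-theta3 into t[] */
   end;
   call add(t,trans);
   array tran trans1-trans3;
   do i = 1 to dim(t);
      tran[i] = trans[i];          /* copy results into trans1-trans3 */
   end;
   drop i th:;
run;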

The Output:

Obs 	trans1 	trans2 	trans3
1 	0.2 	0.7 	0.500
2 	0.3 	0.9 	0.623
3 	0.5 	0.6 	0.150
4 	0.7 	0.9 	0.210

A better way is to pass just the Data Set (HAVE) to an FCMP
function that reads the Data Set into a dynamic array and,
after the additions are done in all rows, writes the dynamic
array out as a Data Set. Your problem does not need this approach,
but you can use this example to tackle bigger problems later on.

I have explained most of the statements in comments
in the following FCMP function, DYNAMIC_ADD(). Feel free
to ask about anything in the code that is unclear.

proc fcmp outlib = work.myfunc.lib;
   function dynamic_add(ds $);
   file log;
   array theta[1] / nosymbols; *Array head for THETA;
   array trans[1] / nosymbols; *array head for TRANS;

   rc = read_array(ds, theta); *Dynamically sizes THETA and stores the Data Set named in DS in a Matrix;
   put theta =; * See the row-col values of the input Data Set;
   rows = dim1(theta); cols = dim2(theta); * gets rows and cols of Matrix;
   call dynamic_array(trans, rows, cols);  * Create a Dynamic Array(TRANS) just like THETA;
   do i = 1 to rows;
      trans[i,1] = theta[i,1]; * First cell is copied once per row;
      do j = 1 to cols - 1;
         trans[i,j+1] = theta[i, j+1] + theta[i, j]; * Here comes your calculations;
      end;
   end;
   rc = write_array('WANTDYN', trans); * Write Out TRANS as a Data Set, WANTDYN;
   
   return(1); * A Function needs a return statement and a dummy value is returned;
   endsub;
quit;

options cmplib = work.myfunc;

data _null_;
   rc = dynamic_add('have');
run;

proc print data = wantdyn;
run;

You simply need one statement in the Data Step. The FCMP Function writes out the Data Set as WANTDYN which gives the same OUTPUT as shown earlier.

Best regards,

DATAsp

ballardw · Posted 02-07-2019 04:44 PM

First thing is it doesn't work because you have errors:

When I run your fcmp code:

1393  proc fcmp outlib=work.myfuncs.dates;
1394
1395  function add(theta);
1396  trans1 = theta1;
1397  do i=1 to dim(theta)-1;
ERROR: Argument number 1 to the DIM function must be an array.
1398  trans(i+1) = theta(i+1) - theta(i);
1399  end;
1400
1401  return(trans);
1402  endsub;
1403  run;

So you have declared your variable THETA incorrectly: an array parameter is declared with brackets, for example theta[*]. I believe you are also mixing array addressing, trans(i+1) and theta(i), with the plain variables trans1 and theta1. Since you did not pass a parameter theta1, it is not defined. To use the first value of a passed array you would use theta[1] or theta(1).

Trans has not been declared as an array, so you can't actually use trans(i+1) either.

Are you attempting to calculate a single value or to pass an array back?

Your example call in the data step

result=add(theta);

would not work: if theta represents an array at that point, you cannot assign an array of values to a single result variable.

It may help to provide a data set with variable names that do not match the variables in your function so you can trace out referencing the data set variable for value and what you want to accomplish.

If you want to pass a modified array back, I believe you are in the realm of a SUBROUTINE, not a function (a function generally returns a single value). For example, sum(a,b,c) returns a single value in a data step. A subroutine lets you specify OUTARGS. From the documentation:

OUTARGS

specifies arguments from the argument list that the subroutine should update.
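
A minimal sketch of that approach, using the subtraction from the posted function (replace the minus with a plus to get sums instead). The names diff_array, work.sketch, x and y are made up for illustration, and the Data Step arrays are declared _temporary_ here:

```sas
/* Sketch only: diff_array, work.sketch, x and y are illustrative names */
proc fcmp outlib = work.sketch.arr;
   subroutine diff_array(x[*], y[*]);
      outargs y;                  /* y is updated for the caller         */
      y[1] = x[1];                /* first element is copied unchanged   */
      do i = 2 to dim(x);
         y[i] = x[i] - x[i-1];    /* difference between neighbours       */
      end;
   endsub;
run;

options cmplib = work.sketch;

data _null_;
   array x[3] _temporary_ (0.2 0.5 0);   /* input array                  */
   array y[3] _temporary_;               /* output array, filled by call */
   call diff_array(x, y);
   d1 = y[1]; d2 = y[2]; d3 = y[3];
   put d1= d2= d3=;                      /* write the results to the log */
run;
```

The input array x is passed by value and left untouched; only y, which is listed in OUTARGS, comes back changed.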

ballardw · Posted 02-07-2019 10:18 PM

Hello ballardw

Your answer is an excellent fill for the gap I left in mine! Thank you.

Kind regards,

DATAsp

rhapsody · Posted 02-13-2019 02:15 AM

Could you please provide a full working example to your answer?

KachiM · Posted 02-13-2019 04:37 AM

Hi Rhapsody,

You are returning to ask for a working example from Ballardw.

I take it that my solution did not solve your problem. Would it not be good to share your thoughts? Please come back and tell us what the issues are.

Regards,

DATAsp

rhapsody · Posted 02-13-2019 04:59 AM

To keep it simple: what I want is a function which can take an Array as input and return an Array as output. The example doesn't really matter.

Y=something(X)

where "X" is an input Array and "Y" is an output Array. Something is the function.
