Statistical programming, matrix languages, and more

Repeated Morisita index

Reply
Occasional Contributor
Posts: 6

Repeated Morisita index

I have n forests that I want to compute Moristia index. I have the code for one forest but I do not know how to repeat the program n times. The code for Morisita Index is the following:

data data;

  infile "B:\sas\data.csv" delimiter=',' firstobs=2; * read the csv data with coordinates of the trees and multiple forests;

  input forestid $ x y diameter;

  if forestid ne 1 then delete; * select only one forest;

run;

*start Morosita's index computations;

proc iml;

/** Count the number of points in each cell of a 2D grid.

    Input: p    - a kx2 matrix of 2D points

           XDiv - a subdivision of the X coordinate direction

           YDiv - a subdivision of the Y coordinate direction

    Output: The number of points in p that are contained in

            each cell of the grid defined by XDiv and YDiv. **/

start Bin2D(p, XDiv, YDiv);

   x = p[,1];       y = p[,2];

   Nx = ncol(XDiv); Ny = ncol(YDiv);

   /** Loop over the cells. Do not loop over observations! **/

   count = j(Ny-1, Nx-1, 0); /** initialize counts for each cell **/

   do i = 1 to Ny-1; /** for each row in grid **/

      idx = loc((y >= YDiv) & (y < YDiv[i+1]));

      if ncol(idx) > 0 then do;

         t = x[idx]; /** extract point in this row (if any) **/

         do j = 1 to Nx-1; /** count points in each column **/

            count[i,j] = sum((t >= XDiv) & (t < XDiv[j+1]));

         end;

      end;

   end;

   return( count );

finish;

use data ; *bring external data into IML environment (SAS data but not data curently inside IML procedure);

read all var {x y} into p[colname=varNames]; *place your data in the matrix p that can be used by bining module;

/** define bins: 50 columns and 50 rows on [0,1000]x[0,1000] **/

cellsize=100;

XMin = 0; XMax = 1000; dx = cellsize;

YMin = 0; YMax = 1000; dy = cellsize;

XDiv = do(XMin, XMax, dx);

YDiv = do(YMin, YMax, dy);

count = Bin2D(p, XDiv, YDiv);

*create the dataset for MORISITA Index of Dispersion (1959);

create quad from count;

append from count;

close quad;

run;quit;

*MORISITA Index of Dispersion (1959);

data quad;

  set quad;

  index=catx("_", "Row",_n_); *create a new varaibale that concatenate the word Row with the row counting;

run;

*prepare data for computing Morisita index;

proc sort data=quad;

  by index;

run;

*create one column from the matrix counting the points inside bins;

Proc transpose data=quad out=mor (rename=(col1=count));

  by index;

run;

*start computing Morisita index;

data mor;

  set mor;

  c=count*(count-1);

run;

proc means data=mor noprint;

  output out=morisita sum(c)=sumc sum(count)=points ;

run;quit;

data morisita;

  set morisita;

  MInd=sumc*_freq_/(points*(points-1));

  pvalue=1-probchi((mind*(points-1)+_freq_-points)/(_freq_-1),_freq_-1);

run;

proc print data=morisita;

run; quit;

What a want is to have the Morisita's index computed for all forests and saved in one dataset. I tried macro, as they seems to be recommended to do repetitive tasks, but the tutorials and helps are designed for so simple tasks that I did not know why one should bother using them, while the complicated programs are so complicated that are useless. I was expecting some examples for this kind of situations, as it seems to be encountered very often in practice.

Any advice?

Super Contributor
Posts: 644

Re: Repeated Morisita index

Here is an outline a macro .  Basically you want to put almost all of your code into a macro, lets call it MorisitaIndx.  Added code to make it generic is in bold.

%Macro MorisitaIndx (forest) ;

data usedata ;

     Set data (where = (forestid = &forest)) ;  /*  data will be read outside macro */

run ;

*start Morosita's index computations;

proc iml;

/** Count the number of points in each cell of a 2D grid.

....

*/

....

Proc IML

....

use usedata ; *bring external data into IML environment (SAS data but not data curently inside IML procedure);

...

data morisita;

  Retain forestid &forest ; /* use if your code does not keep the forestid variable */

  set morisita;

  MInd=sumc*_freq_/(points*(points-1));

  pvalue=1-probchi((mind*(points-1)+_freq_-points)/(_freq_-1),_freq_-1);

run;

proc print data=morisita;

run;

Proc Append

     Base = AllMorisita

     data = morisita

     ;

run ;

%Mend ;

data data;

  infile "B:\sas\data.csv" delimiter=',' firstobs=2; * read the csv data with coordinates of the trees and multiple forests;

  input forestid $ x y diameter;

     /* delete this next line */

  * if forestid ne 1 then delete; * select only one forest;

run;

/* List the forests */

Proc SQL ;

     create table forests as

     select distinct forestid

     from data

Quit ;

/* Run through the macro for each forest */

Data _Null_ ;

     Set forests ;

     ExecStr = cats ('%MorisitaIndx (' , forest, ') ;' ) ;

     Call Execute (ExecStr) ;

Run ;

Warning - untested code - you might have to adjust the data _null_ step

Richard


Occasional Contributor
Posts: 6

Re: Repeated Morisita index

Dear Richard,

Thank you for your time and effort in replaying to my question. I tried to understand your dots, but I couldn't, especially the ones following the second PROC IML. I ran the code but it did not worked. Thank you again.

Bogdan

SAS Super FREQ
Posts: 3,222

Re: Repeated Morisita index

If you are using SAS/IML 9.3, you can now simplify the binning code. See A simple implementation of two-dimensional binning - The DO Loop

Occasional Contributor
Posts: 6

Re: Repeated Morisita index

Thank you. I will be looking into it.

best wishes,

Bogdan

Post a Question
Discussion Stats
  • 4 replies
  • 428 views
  • 2 likes
  • 3 in conversation