CDF of Normal in PROC IML

Occasional Contributor
Posts: 13
CDF of Normal in PROC IML

Hi,

 

Below is example code that runs, and the matrix x is correct. The problem is with y: I get a matrix of 1s only. Is it because the QUANTILE function is not supported by IML? I am new to SAS/IML. Please help.

 

/* Inverse CDF of Binomial PD*/
%let N = 100;
%let M= 3;
proc iml;
call randseed(123);
x = j(&N,&M);
y = j(&N,&M);
mu= j(1,&M);
stdev= j(1,&M);
mu= {.2 .1 .5};
stdev= mu#(1-mu);
call randgen(x, "Normal",mu, stdev);
y= Quantile("NORMAL",x);
print mu;
print x;
print y;
run;


All Replies
Grand Advisor
Posts: 9,558

Re: CDF of Normal in PROC IML

What are you looking for?
Your code is not right.
If you want a quantile, you need to supply a probability, not a value (x) drawn from the normal distribution.

Grand Advisor
Posts: 9,558

Re: CDF of Normal in PROC IML

I think you want the CDF, not the quantile. Are you treating x (a value drawn from the normal distribution) as the quantile?

 

 

/* CDF of each simulated value under its column's normal distribution */
%let N = 100;
%let M = 3;

proc iml;
call randseed(123);
x = j(&N, &M);
cdf = j(&N, &M);
mu = {.2 .1 .5};
stdev = mu # (1 - mu);
call randgen(x, "Normal", mu, stdev);

/* CDF takes scalar or matrix parameters, so evaluate one column at a time */
do i = 1 to &M;
  cdf[, i] = cdf('normal', x[, i], mu[i], stdev[i]);
end;
print mu, stdev, x, cdf;
quit;

Occasional Contributor
Posts: 13

Re: CDF of Normal in PROC IML

Thanks. I realized the mistake.
SAS Super FREQ
Posts: 3,381

Re: CDF of Normal in PROC IML

The QUANTILE function requires a probability argument: for the normal distribution it must be strictly between 0 and 1. Please review the definitions of the four different probability functions PDF, CDF, QUANTILE, and RAND/RANDGEN.

You can compute the CDF (P(X <= x)) for each of the random values, as shown in KSharp's code. You can then call the QUANTILE (inverse CDF) function to recover the x values.
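
As a small illustration of how the four functions relate, here is a minimal sketch for a single value. The numbers chosen for x, mu, and sigma are hypothetical and are not taken from the thread.

```sas
proc iml;
x = 1.5;  mu = 1;  sigma = 2;          /* hypothetical values, for illustration only */
d = pdf("Normal", x, mu, sigma);       /* density at x */
p = cdf("Normal", x, mu, sigma);       /* P(X <= x), a probability strictly between 0 and 1 */
q = quantile("Normal", p, mu, sigma);  /* inverse CDF: maps the probability back to x */
call randseed(1);
r = j(1, 1);
call randgen(r, "Normal", mu, sigma);  /* one random draw from the same distribution */
print d p q r;
quit;
```

The point of the sketch is the direction of each mapping: CDF goes from a value to a probability, and QUANTILE goes from a probability back to a value, which is why passing raw normal draws to QUANTILE fails.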

 

 

One thing to keep in mind: RANDGEN is a vectorized SAS/IML subroutine. You are calling it with a vector of parameters so that each column has different parameters. That is legal, since the doc says the parameters can be a scalar, a row vector (applies to columns), a column vector (applies to rows), or a matrix. However, the CDF and QUANTILE functions are Base SAS functions, so they only support scalar or matrix parameters. That is why KSharp used a loop over the columns. The other option is to create a matrix of parameters, like this:

 

mu_mat = repeat(mu, &N);
stdev_mat = repeat(stdev, &N);
y = cdf("NORMAL",x, mu_mat, stdev_mat);
q = quantile("Normal", y, mu_mat, stdev_mat); /* same as x */

 

Occasional Contributor
Posts: 13

Re: CDF of Normal in PROC IML

Thank you all for your quick responses. I didn't know we had such a cooperative SAS  community.

 

Creating matrices of parameters (mu_mat and stdev_mat) is quite intuitive and seems to be working fine. However, it seems that I am having some problem with RANDGEN too. Although I am specifying the parameters as vectors they seem to be taken as scalars. Specifically, RANDGEN seems to be taking only the first parameter (.01) in the vector below and applies it to whole matrix.

 

mu= j(1, &M);
mu= {.01 .1 .6};

 

It would also help if you could elaborate a bit more on what you mean by 'row vector (apply to cols), column vector (apply to rows)'. I was going through the doc and it says: "if the parameters contain m elements, the jth column of the result matrix consists of random values drawn from the distribution with parameters param1[j], param2[j], and param3[j]".

 

Thanks in advance.

 

 

Solution
‎11-21-2016 12:56 PM
SAS Super FREQ
Posts: 3,381

Re: CDF of Normal in PROC IML

Vectors of parameters for the RANDGEN subroutine were introduced in SAS/IML 12.1 (SAS 9.3m2).

Sorry for the confusion about row vectors and column vectors. You are right (the doc is right) about the way that dimensions are used. In my own work, I make the parameter vector a row vector (1 row, m columns) when I want the parameters to apply to the columns. I make the parameter vector a column vector (n rows, 1 column) when I want the parameters to apply to the rows. As you point out, this convention is not strictly required, but I think it is a good idea.

 

Here are examples. In the first example I compute the mean of each column of x and show that the column means are close to the population mean.  In the second example, I compute the sample mean for each row of y and show that the row means are close to the population means.

 

proc iml;
call randseed(1);

x = j(100, 3);  /* 3 cols */
mu  = 1:3;   /* row vector: ncol(mu)=ncol(x) */
call randgen(x, "normal", mu);
mean = mean(x); /* mean of each col */
print mu, mean;

y = j(5, 200);  /* 5 rows */
mu2  = T(1:5);   /* col vector: nrow(mu2)=nrow(y) */
call randgen(y, "normal", mu2);
mean2 = y[,:];  /* mean of each row */
print mu2 mean2;

 

Occasional Contributor
Posts: 13

Re: CDF of Normal in PROC IML

Thanks for your detailed explanation on parameters as vectors/matrices.

 

Although the RANDGEN function should work on a vector of parameters, it doesn't give me the intended results. Even after generating a full matrix of parameter p in the code below by repeating the row vector (effectively a N*M matrix), the results clearly suggest that p is being taken as a scalar i.e. only the first element of p=0.1 from the matrix is used. Do you have any suggestions?


%let N = 100;
%let M= 3;
proc iml;
call randseed(123);
x = j(&N,&M);
y = j(&N,&M);
p= j(1, &M);/*specify row vector*/
p= {.1 .2 .5}; /*Probability*/
p_mat = repeat(p, &N); /*Repeat probability parameter for j(&N,&M) matrix*/
call randgen(x, "Binomial",p_mat,&N );
x=x/&N;
mean=mean(x);
max=max(x);
print p,mean,max;
Quit;

 

RESULT:

 

p
0.1  0.2  0.5

mean
0.1014  0.1001  0.1075

max
0.2
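
One possible explanation for the output above, which the thread does not confirm, is a SAS/IML release older than 12.1, where RANDGEN does not yet accept vector parameters. Under that assumption, a minimal workaround sketch is to fill one column at a time with a scalar probability. The helper vector col is a hypothetical name introduced for this illustration.

```sas
proc iml;
call randseed(123);
N = 100;
p = {0.1 0.2 0.5};                          /* one probability per column */
x = j(N, ncol(p));
col = j(N, 1);                              /* hypothetical work vector for one column */
do j = 1 to ncol(p);
   call randgen(col, "Binomial", p[j], N);  /* scalar parameters only */
   x[, j] = col / N;                        /* store the column as a proportion */
end;
mean = x[:, ];                              /* mean of each column */
print p, mean;
quit;
```

If vector parameters are the issue, each column mean printed by this sketch should sit near its own probability instead of all three sitting near the first one.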
Occasional Contributor
Posts: 13

Re: CDF of Normal in PROC IML

/*NORMINV of Random Binomial Probability of Default*/
%let N = 10;
%let M= 3;
proc iml;
call randseed(123);
x = j(&N,&M);
cdf=j(&N,&M);
q=j(&N,&M);
p = {.100 .200 .500};
p_mat = repeat(p, &N);
stdev_mat=p_mat#(1-p_mat);
N_mat = repeat(&N, &N, &M);
call randgen(x, "Binomial", p_mat, N_mat);
x=x/&N;/*Random probability of default*/
q = quantile("Normal", x, p_mat, stdev_mat); /*Quantile of the probability
 from normal distribution (Distance to default)*/
mean=mean(x);
max=x[<>,];
min=x[><,];
print p,mean,max,min,x,q;
Quit;

Hi 

 

I have refined the code with your help, and I am now passing a matrix of probabilities to the function using the formula you suggested. However, I am getting an error from the QUANTILE function again: ERROR: (execution) Invalid argument to function. Some background: I am trying to generate random variables using the mean probability of default (matrix x), and then I want to calculate the distance to default by finding the quantile (matrix q), assuming a normal distribution. Thanks in advance for your help.

Grand Advisor
Posts: 9,558

Re: CDF of Normal in PROC IML

Because there are zeros in x.
You can't pass 0 as the probability argument to the QUANTILE function.



x
0.1	0	0.3
0.1	0.1	0.4
0	0.1	0.3
0	0.4	0.4
0.1	0.5	0.6
0.2	0.1	0.6
0	0.2	0.7
0	0	0.3
0.1	0.3	0.7
0.2	0.2	0.7
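
One common way around the zeros is to pull the proportions slightly inside the open interval (0, 1) before calling QUANTILE. The sketch below is only an illustration: the floor eps = 0.5/n is an assumed choice that does not come from the thread, the sample matrix is hypothetical, and standard normal quantiles are used for simplicity instead of the mean and standard deviation matrices from the post.

```sas
proc iml;
n = 10;                          /* trials per observation, as in the post */
x = {0.1 0 0.3, 0 0.4 1};        /* hypothetical proportions, including 0 and 1 */
eps = 0.5 / n;                   /* assumed floor; not from the thread */
xc = (x <> eps) >< (1 - eps);    /* elementwise maximum, then elementwise minimum */
q = quantile("Normal", xc);      /* every argument is now strictly inside (0, 1) */
print x, xc, q;
quit;
```

Whether clamping is acceptable depends on the application, since it replaces an undefined quantile with a value driven entirely by the choice of eps.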

Occasional Contributor
Posts: 13

Re: CDF of Normal in PROC IML

You are right, thanks. I'll have to figure out a way to get around this.

Super Contributor
Posts: 498

Re: CDF of Normal in PROC IML

I see you already got your answer, but off topic: remember to put a QUIT statement at the end of your code.

 

In IML the RUN statement is a bit different from most other procedures and the data step. In IML the RUN statement executes built-in subroutines or user-defined modules.
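
To make the difference concrete, here is a minimal sketch in which RUN calls a user-defined module and QUIT ends the procedure. The module name PrintMean is hypothetical and exists only for this illustration.

```sas
proc iml;
start PrintMean(v);     /* hypothetical user-defined module */
   m = v[:];            /* mean of all elements */
   print m;
finish;

x = {1 2 3 4};
run PrintMean(x);       /* RUN executes the module; it does not end the procedure */
y = 2 # x;              /* statements after RUN still execute inside PROC IML */
print y;
quit;                   /* QUIT ends PROC IML */
```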

SAS Super FREQ
Posts: 3,381

Re: CDF of Normal in PROC IML

To add to draycut's message, see the article "Never end PROC IML with a RUN statement."

Occasional Contributor
Posts: 13

Re: CDF of Normal in PROC IML

Thanks for your advice. I realize that IML is different but quite convenient too if one is good at it.