Solved: CDF of Normal in PROC IML

abbaskashif · Posted 11-19-2016 04:30 PM

Hi,

Below is an example of code that runs, and the matrix x is correct. The problem is with y: I am getting a matrix of 1s only. Is it because the QUANTILE function is not supported by IML? I am new to SAS/IML. Please help.

/* Inverse CDF of Binomial PD*/
%let N = 100;
%let M= 3;
proc iml;
call randseed(123);
x = j(&N,&M);
y = j(&N,&M);
mu= j(1,&M);
stdev= j(1,&M);
mu= {.2 .1 .5};
stdev= mu#(1-mu);
call randgen(x, "Normal",mu, stdev);
y= Quantile("NORMAL",x);
print mu;
print x;
print y;
run;

Ksharp · Posted 11-19-2016 10:01 PM

What are you looking for?
Your code is not right.
If you want a quantile, you need to supply a probability, not the value (x) drawn from the normal distribution.

Ksharp · Posted 11-19-2016 10:06 PM

I think you want the CDF, not the quantile. Are you treating x (a value drawn from the normal distribution) as the quantile?

/* CDF of normal random values, one column at a time */
%let N = 100;
%let M= 3;

proc iml;
call randseed(123);
x = j(&N,&M);
y = j(&N,&M);
cdf=j(&N,&M);
mu= {.2 .1 .5};
stdev= mu#(1-mu);
call randgen(x, "Normal",mu, stdev);

do i=1 to &M;
  cdf[,i]=cdf('normal',x[,i],mu[i],stdev[i]);
end;
print mu,stdev,x,cdf;
quit;

abbaskashif · Posted 11-21-2016 04:41 PM

Thanks. I realized the mistake 🙂

Rick_SAS · Posted 11-20-2016 06:39 AM

The QUANTILE function requires an argument that is between 0 and 1. Please review the definitions of the four different probability functions PDF, CDF, QUANTILE, and RAND/RANDGEN.

You can compute the CDF (P(X <= x)) for each of the random values. You can then call the QUANTILE (inverse CDF) function to recover the x values, as shown in Ksharp's code.
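To make the relationship between the four functions concrete, here is a minimal sketch for the standard normal distribution. The variable names (x, density, prob, back, draws) are illustrative only and do not come from the thread.

```sas
proc iml;
/* Sketch: the four probability functions for the standard normal distribution */
x = {-1.96 0 1.96};                  /* values on the scale of the random variable */
density = pdf("Normal", x);          /* PDF: density at each x */
prob    = cdf("Normal", x);          /* CDF: P(X <= x), a probability */
back    = quantile("Normal", prob);  /* QUANTILE: inverse CDF, takes a probability */

call randseed(1);
draws = j(1, 3);                     /* allocate space for 3 random values */
call randgen(draws, "Normal");       /* RANDGEN: fills draws with random values */
print x, density, prob, back, draws;
quit;
```

The point is the direction of each call: CDF maps values to probabilities, and QUANTILE maps probabilities back to values, so passing raw draws to QUANTILE is outside its valid input range whenever a draw is not strictly between 0 and 1.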

One thing to keep in mind: RANDGEN is a vectorized SAS/IML subroutine. You are calling it with a vector of parameters so that each column has different parameters. That is legal, since the doc says the parameters can be scalar, row vector (apply to cols), column vector (apply to rows), or matrix. However, the CDF and QUANTILE functions are Base SAS functions, so they support only scalar or matrix parameters. That is why Ksharp used a loop over the columns. The other option is to create a matrix of parameters, like this:

mu_mat = repeat(mu, &N);
stdev_mat = repeat(stdev, &N);
y = cdf("NORMAL",x, mu_mat, stdev_mat);
q = quantile("Normal", y, mu_mat, stdev_mat); /* same as x */

abbaskashif · Posted 11-20-2016 04:16 PM

Thank you all for your quick responses. I didn't know we had such a cooperative SAS community.

Creating matrices of parameters (mu_mat and stdev_mat) is quite intuitive and seems to be working fine. However, it seems that I am having a problem with RANDGEN too. Although I am specifying the parameters as vectors, they seem to be taken as scalars. Specifically, RANDGEN seems to be taking only the first parameter (.01) in the vector below and applying it to the whole matrix.

mu= j(1, &M);
mu= {.01 .1 .6};

It would also help if you could elaborate a bit more on what you mean by 'row vector (apply to cols), column vector (apply to rows)'. I was going through the doc and it says: "if the parameters contain m elements, the jth column of the result matrix consists of random values drawn from the distribution with parameters param1[j], param2[j], and param3[j]".

Thanks in advance.

Rick_SAS · Posted 11-20-2016 07:19 PM

Vectors of parameters for the RANDGEN subroutine were introduced in SAS/IML 12.1 (SAS 9.3m2).

Sorry for the confusion about row vectors and column vectors. You are right (the doc is right) about the way that dimensions are used. In my own work, I make the parameter vector a row vector (1 row, m columns) when I want the parameters to apply to the columns. I make the parameter vector a column vector (n rows, 1 column) when I want the parameters to apply to the rows. As you point out, this convention is not strictly required, but I think it is a good idea.

Here are examples. In the first example I compute the mean of each column of x and show that the column means are close to the population mean. In the second example, I compute the sample mean for each row of y and show that the row means are close to the population means.

proc iml;
call randseed(1);

x = j(100, 3);  /* 3 cols */
mu  = 1:3;   /* row vector: ncol(mu)=ncol(x) */
call randgen(x, "normal", mu);
mean = mean(x); /* mean of each col */
print mu, mean;

y = j(5, 200);  /* 5 rows */
mu2  = T(1:5);   /* col vector: nrow(mu2)=nrow(y) */
call randgen(y, "normal", mu2);
mean2 = y[,:];  /* mean of each row */
print mu2 mean2;

abbaskashif · Posted 11-21-2016 04:38 PM

Thanks for your detailed explanation on parameters as vectors/matrices.

Although RANDGEN should work on a vector of parameters, it doesn't give me the intended results. Even after generating a full matrix for the parameter p in the code below by repeating the row vector (effectively an N x M matrix), the results clearly suggest that p is being taken as a scalar, i.e., only the first element of the matrix (p=0.1) is used. Do you have any suggestions?

%let N = 100;
%let M= 3;
proc iml;
call randseed(123);
x = j(&N,&M);
y = j(&N,&M);
p= j(1, &M);/*specify row vector*/
p= {.1 .2 .5}; /*Probability*/
p_mat = repeat(p, &N); /*Repeat probability parameter for j(&N,&M) matrix*/
call randgen(x, "Binomial",p_mat,&N );
x=x/&N;
mean=mean(x);
max=max(x);
print p,mean,max;
Quit;

RESULT:

p
0.1	0.2	0.5

mean
0.1014	0.1001	0.1075

max
0.2
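The thread does not confirm why all three column means come out near 0.1. One possibility, given the remark that vector parameters for RANDGEN arrived in SAS/IML 12.1, is that an older release is in use. Under that assumption, a version-independent sketch is to fill one column at a time with scalar parameters. The names N, M, col, k, and colMean are illustrative.

```sas
proc iml;
call randseed(123);
N = 100;                              /* rows, also used as the number of trials */
M = 3;                                /* columns */
p = {0.1 0.2 0.5};                    /* one success probability per column */
x = j(N, M);
col = j(N, 1);                        /* work vector that holds one column */
do k = 1 to M;
   /* scalar probability and scalar number of trials for column k */
   call randgen(col, "Binomial", p[k], N);
   x[, k] = col;
end;
x = x / N;                            /* convert counts to proportions */
colMean = x[:, ];                     /* mean of each column */
print p, colMean;
quit;
```

Because every RANDGEN call here receives only scalars, the result does not depend on how a given release treats vector or matrix parameters.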

abbaskashif · Posted 11-28-2016 02:35 PM

/*NORMINV of Random Binomial Probability of Default*/
%let N = 10;
%let M= 3;
proc iml;
call randseed(123);
x = j(&N,&M);
cdf=j(&N,&M);
q=j(&N,&M);
p = {.100 .200 .500};
p_mat = repeat(p, &N);
stdev_mat=p_mat#(1-p_mat);
N_mat = repeat(&N, &N, &M);
call randgen(x, "Binomial", p_mat, N_mat);
x=x/&N;/*Random probability of default*/
q = quantile("Normal", x, p_mat, stdev_mat); /*Quantile of the probability
 from normal distribution (Distance to default)*/
mean=mean(x);
max=x[<>,];
min=x[><,];
print p,mean,max,min,x,q;
Quit;

Hi Rick_SAS,

I have refined the code with your help, and I am now passing a matrix of probabilities to the function using the formula you suggested. However, I am getting an error from the QUANTILE function again: ERROR: (execution) Invalid argument to function. Just to give you some background: I am trying to generate random variables using the mean probability of default (matrix x) and then want to calculate the distance to default by finding the quantile (matrix q), assuming a normal distribution. Thanks in advance for your help.

Ksharp · Posted 11-28-2016 10:34 PM

Because there are zeros in x.
You can't use 0 as the probability argument to the QUANTILE() function.



x
0.1	0	0.3
0.1	0.1	0.4
0	0.1	0.3
0	0.4	0.4
0.1	0.5	0.6
0.2	0.1	0.6
0	0.2	0.7
0	0	0.3
0.1	0.3	0.7
0.2	0.2	0.7

abbaskashif · Posted 11-29-2016 12:20 PM

You are right, thanks. I'll have to figure out a way to get around this.
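The thread does not settle on a workaround. One possible sketch, assuming it is acceptable for the analysis to nudge probabilities of exactly 0 or 1 slightly inside the open interval (0, 1), is to clamp the values before calling QUANTILE. The tolerance eps and the names xc and q are hypothetical choices, and the small matrix below is made up for illustration.

```sas
proc iml;
/* Illustrative proportions that include the problem values 0 and 1 */
x = {0.1 0   0.3,
     0   0.4 1  };
eps = 1e-6;                      /* hypothetical tolerance; pick one that suits the data */
xc = (x <> eps) >< (1 - eps);    /* elementwise max with eps, then elementwise min with 1-eps */
q = quantile("Normal", xc);      /* every argument is now strictly inside (0,1) */
print x, xc, q;
quit;
```

Clamping changes the data, so the resulting quantiles at the boundaries depend entirely on the chosen eps. Another option is to treat those cells as missing rather than assign them a value.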

PeterClemmensen · Posted 11-20-2016 09:55 AM

I see you already got your answer, but off topic: remember to put a QUIT statement at the end of your code 🙂

In IML the RUN statement is a bit different from most other procedures and the data step. In IML the RUN statement executes built-in subroutines or user-defined modules.

Rick_SAS · Posted 11-20-2016 03:00 PM

To add to draycut's message, see the article "Never end PROC IML with a RUN statement."

abbaskashif · Posted 11-21-2016 04:45 PM

Thanks for your advice. I realize that IML is different but quite convenient too if one is good at it.
