BookmarkSubscribeRSS Feed
Salah
Quartz | Level 8

Hello

I am generating a vector X_p of size (m x1) the entries are failure times so they are ranked from smallest to largest. I set the study to end at time T(say 0.9). If (X_1,...., X_m) all occurred before T then the program is carried on to find the MLE and other estimators. If time T occurs before getting my m failures then  I have to extract the values that come after T (say m-q), then use them to generate new values from the truncated function. 

The problem that I am facing now is that these new data (I call it X_q_m) have a failure time that is smaller than Time T!!! I tried sorting the extracted values before using them to generate the new data but I failed.

Can anyone help me solving this problem, please?

proc iml;
   n=30;
   m=20;
   R=J(m, 1, 0);
   R[m]=10;
   T=1.3;
   seed=2;
   k=1;
   alpha=1;
   beta=1;

   Do N1=1 to K;
      ** progressive censoring ***;
      ****************************;
      W=J(m, 1, 0);
      V=J(m, 1, 0);
      X_p=J(m, 1, 0);
      X_A=J(m, 1, 0);
      U=J(m, 1, 0);
      X=J(m, 1, 0);

      do i=1 to m;
         W[i]=Uniform(seed);
      end;
      S=1+R[m];
      V[1]=W[1]**(1/S);

      do i=2 to m;
         S=S+(1+R[m-i+1]);
         V[i]=W[i]**(1/S);
      end;

      do i=1 to m;
         U[i]=1-prod(V[m:(m-i+1)] );
         ** the U's are the required progressively type II from U(0,1);
      end;
      X_p=((1-U)##(-1/alpha) -1)/beta;
      Call sort (X_p);
      
      * Find the position of the larget failure < T;
      do idx=1 to m-1;
         if ((X_p[idx] < T) & (X_p[idx+1] >=T)) then
            q=idx;
      end;
      
      * If the mth failure is larger than T;
      If X_p[m]>T then do;
         X_1_q=J(q, 1, 0);
         X_1_q=X_p[1:q];

         /* extract those failures that are less than T i.e. all X_1,...,X_q,X_(q+1) */
         Xp_remaining=J(m-q, m, 0);
         Xp_remaining=X_p[q+1:m];
         X_qm=Xp_remaining;
         *Call Sort (X_qm);
         Rq=R[1:q];
         
         **************************************************;
         ** Generate data of size m-q **;
         **************************************************;
         W1=J(m-q, 1, 0);
         U1=J(m-q, 1, 0);
         V1=J(m-q, 1, 0);
         X_q_m=J(m-q, 1, 0);
         X_A=J(m, 1, 0);

         do i=1 to m-q;
            W1[i]=Uniform(seed);
         end;
         S1=1+R[m-q];
         V1[1]=W1[1]**(1/S1);

         do i=2 to m-q;
            S1=S1+(1+R[m-i+1]);
            V1[i]=W1[i]**(1/S1);
         end;

         do i=1 to m-q;
            prod1=prod(V1[(m-q):((m-q)-i+1)] );
            U1[i]=1-(prod1);
            ** the U's are the required progressively type II from U(0,1);
         end;
         X_q_m=((U1#(1+beta#X_qm)##(-alpha)/(alpha*beta) )##(-1/(alpha+1))-1)/beta;
         call sort(X_q_m);
         X_A=X_1_q // X_q_m;
         X=X_A;
      end;
      *end the list for first if statement;
      else if (X_p[m]<=T) then do;
         q=m-1;
         X=X_p;
      end;
      *end the list for the else if statement;
   End;
   print T q X_q_m X_p X_A;
   quit;
6 REPLIES 6
Rick_SAS
SAS Super FREQ

When I run your program, I get the following output.

Region Capture.png

From your description, it sounds like you want to look at X_p and decide whether 

1. All values are prior to T, -OR-

2. Some values are > T. Extract those to X_q_m.

Is that right? If not, explain what values are wrong and what the correct answer should be.

 

Rick_SAS
SAS Super FREQ

Are you perhaps confusing X_qm and X_q_m?  By definition, all elements of X_qm are greater than T.  However, X_q_m is the expression

X_q_m=((U1#(1+beta#X_qm)##(-alpha)/(alpha*beta) )##(-1/(alpha+1))-1)/beta;

and those elements are not necessarily greater than T.

Salah
Quartz | Level 8

Hello Rick

 

Thank you for taking the time to look at my problem

X_qm are all greater than T and that is why I extracted them because these failures happened before time T. Instead I will be replacing them with X_qm (qm means that the size of this new data is m-q)

where q is the position where I started distracting the data. 

I do understand that X_qm not necessarily greater than T. But research-wise that doesn't make any sense since these failure's time should occur after T. I am attaching a diagram that may help explain what I am trying to do.SAS-pic.PNG

Rick_SAS
SAS Super FREQ

Two comments:

1. In your program, you have two different variables that are using a similar name. I strongly recommend that you rename X_q_m to something else (Y?) so that it is easier to distinguish from X_qm.

 

For example, you say 

> X_qm are all greater than T 

and later

> I do understand that X_qm not necessarily greater than T.

Both statements cannot be true, and I think if you rename X_q_m it will be easier to debug your program.

 

2. For debugging purposes, the initial vectors of times (X_p) does not need to be random. Use a simple vector like X_p = {0.2, 0.4, 0.8, 1.1, 1.2, 1.4, 1.5, 2.0} and concetrate your efforts on the second part of the program, which is where your mistake is.

Salah
Quartz | Level 8

For example, you say 

> X_qm are all greater than T 

and later

> I do understand that X_qm not necessarily greater than T.

Both statements cannot be true, and I think if you rename X_q_m it will be easier to debug your program.

 

I am sorry for the confusion what I meant to say is that since the failures supposed occurred after T, then their values must be > T. That is why I attached the diagram since it shows the two if statement that I am using in my code.

 

You pointed out in your previous reply that X_qm (now I call it Y) that it is not necessary > or < T. I meant to say that I agree with this notion but this doesn't work for my research. 

 

For debugging purposes, the initial vectors of times (X_p) does not need to be random. Use a simple vector like X_p = {0.2, 0.4, 0.8, 1.1, 1.2, 1.4, 1.5, 2.0} and concetrate your efforts on the second part of the program, which is where your mistake is.

 

To reply to this point, if you change R[m]=10 to R[1]=10  for example then there will be no problem.

Any changes in the parameter may/may not create this problem to me.

Since I have to take different values of T, R, alpha, beta, and so on so I can't guarantee not to have this problem. 

 

The problem is that these values create a huge bias for the MLE!!!!

 

Thank you

 

 

 

 

 

 

 

 

 

 

proc iml;
n=30;
m=20;
R=J(m, 1, 0);
R[m]=10;
T=1.3;
seed=2;
k=1;
alpha=1;
beta=1;

Do N1=1 to K;
** progressive censoring ***;
****************************;
W=J(m, 1, 0);
V=J(m, 1, 0);
X_p=J(m, 1, 0);
X_A=J(m, 1, 0);
U=J(m, 1, 0);
X=J(m, 1, 0);

do i=1 to m;
W[i]=Uniform(seed);
end;
S=1+R[m];
V[1]=W[1]**(1/S);

do i=2 to m;
S=S+(1+R[m-i+1]);
V[i]=W[i]**(1/S);
end;

do i=1 to m;
U[i]=1-prod(V[m:(m-i+1)] );
** the U's are the required progressively type II from U(0,1);
end;
X_p=((1-U)##(-1/alpha) -1)/beta;
Call sort (X_p);

* Find the position of the larget failure < T;
do idx=1 to m-1;
if ((X_p[idx] < T) & (X_p[idx+1] >=T)) then
q=idx;
end;

* If the mth failure is larger than T;
If X_p[m]>T then do;
X_1_q=J(q, 1, 0);
X_1_q=X_p[1:q];

/* extract those failures that are less than T i.e. all X_1,...,X_q,X_(q+1) */
Xp_remaining=J(m-q, m, 0);
Xp_remaining=X_p[q+1:m];
X_qm=Xp_remaining;
*Call Sort (X_qm);
Rq=R[1:q];

**************************************************;
** Generate data of size m-q **;
**************************************************;
W1=J(m-q, 1, 0);
U1=J(m-q, 1, 0);
V1=J(m-q, 1, 0);
Y=J(m-q, 1, 0);
X_A=J(m, 1, 0);

do i=1 to m-q;
W1[i]=Uniform(seed);
end;
S1=1+R[m-q];
V1[1]=W1[1]**(1/S1);

do i=2 to m-q;
S1=S1+(1+R[m-i+1]);
V1[i]=W1[i]**(1/S1);
end;

do i=1 to m-q;
prod1=prod(V1[(m-q):((m-q)-i+1)] );
U1[i]=1-(prod1);
** the U's are the required progressively type II from U(0,1);
end;
Y=((U1#(1+beta#X_qm)##(-alpha)/(alpha*beta) )##(-1/(alpha+1))-1)/beta;
call sort(Y);
X_A=X_1_q // Y;
X=X_A;
end;
*end the list for first if statement;
else if (X_p[m]<=T) then do;
q=m-1;
X=X_p;
end;
*end the list for the else if statement;
End;
print T q Y X_p X_A;
quit;

 

 

 

 

 

 

 

 

Rick_SAS
SAS Super FREQ

It is not clear whether you are asking a research question or a SAS/IML programming question. If it is a research question, perhaps you can discuss it with your advisor or a colleague.  If it is a programming problem, then please tell us what is wrong with the computation and what the correct answer should be.

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

Multiple Linear Regression in SAS

Learn how to run multiple linear regression models with and without interactions, presented by SAS user Alex Chaplin.

Find more tutorials on the SAS Users YouTube channel.

From The DO Loop
Want more? Visit our blog for more articles like these.
Discussion stats
  • 6 replies
  • 812 views
  • 0 likes
  • 2 in conversation