GehadElsayed123
Calcite | Level 5

I am trying to write the code for "Phase I analysis of multivariate control charts with missing data."

I have completed the first and second steps.

I am stuck on the third step, where I should remove k% of the data (general missing-data pattern, MCAR), but I can't work out how to do it.

How can I set k% of the values in simulated data to missing?

Community_Guide
SAS Moderator

Hello @GehadElsayed123,


Your question requires more details before experts can help. Can you revise your question to include more information? 

 

Review this checklist:

  • Specify a meaningful subject line for your topic.  Avoid generic subjects like "need help," "SAS query," or "urgent."
  • When appropriate, provide sample data in text or DATA step format.  See this article for one method you can use.
  • If you're encountering an error in SAS, include the SAS log or a screenshot of the error condition. Use the Photos button to include the image in your message.
  • It also helps to include an example (table or picture) of the result that you're trying to achieve.

To edit your original message, select the "blue gear" icon at the top of the message and select Edit Message.  From there you can adjust the title and add more details to the body of the message.  Or, simply reply to this message with any additional information you can supply.

 


SAS experts are eager to help -- help them by providing as much detail as you can.

 

This prewritten response was triggered for you by fellow SAS Support Communities member @Reeza.
Rick_SAS
SAS Super FREQ
proc iml;
call randseed(1);
N = 100;        /* number of time points */
t = 1:N;
X = j(1, N);    /* allocate vector */
call randgen(X, "Normal"); /* fill with random normal variates */

/* approximately 20% missing completely at random */
missIdx = sample( t, 0.2*N );   /* sample with replacement */
X[missIdx] = .;
call scatter(t, X) other="refline 0/axis=y;";


/* Alternatively, sample without replacement to get exactly 20% missing.
   Use this instead of the sampling step above, on a freshly generated X. */
missIdx = sample( t, 0.2*N, "NoReplace" );
X[missIdx] = .;
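To see the difference between the two sampling methods, you can count the missing values afterwards. The following is a minimal sketch, not part of the original program: the names X1, X2, idx1, idx2, nMiss1, and nMiss2 are introduced here only for illustration, and the COUNTMISS function is used to count the missing cells. Sampling with replacement can draw the same index more than once, so nMiss1 can be smaller than 20, whereas nMiss2 should equal 20.

```sas
proc iml;
call randseed(1);
N = 100;                       /* number of time points */
t = 1:N;
X = j(1, N);                   /* allocate vector */
call randgen(X, "Normal");     /* fill with random normal variates */

/* with replacement: repeated indices are possible,
   so at most round(0.2*N) cells become missing */
X1 = X;
idx1 = sample(t, round(0.2*N));
X1[idx1] = .;
nMiss1 = countmiss(X1);        /* total number of missing cells */

/* without replacement: every index is distinct,
   so exactly round(0.2*N) cells become missing */
X2 = X;
idx2 = sample(t, round(0.2*N), "NoReplace");
X2[idx2] = .;
nMiss2 = countmiss(X2);

print nMiss1 nMiss2;
quit;
```

Working on copies (X1, X2) keeps the two methods from accumulating missing values in the same vector.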
GehadElsayed123
Calcite | Level 5

This code is for univariate data, but I need code for the multivariate case.

 

Rick_SAS
SAS Super FREQ

Use the same program, where N = rows*columns is the total number of elements in the matrix. For example, if you want a 5x20 matrix use the previous code with N=100 and then reshape the vector into a matrix by using the SHAPE function:

Y = shape(X, 5, 20); /* convert to matrix */

 

If you prefer to rewrite the whole program in terms of rows and columns, you can do that, too:

proc iml;
call randseed(1);
N = 100;        /* number of time points (columns) */
p = 5;          /* number of rows */
X = j(p, N);    /* allocate matrix */
call randgen(X, "Normal"); /* fill with random normal variates */

/* approximately 20% missing completely at random */
t = 1:(N*p);    /* vector that contains all indices */
missIdx = sample( t, 0.2*N*p );   /* sample with replacement */
X[missIdx] = .;
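Because the indices are drawn over all N*p cells, the missing values can land in any variable, and the number per variable varies from run to run. The sketch below is one way to check this; it is an illustration only. It assumes you sample without replacement (a change from the program above, chosen so that the total is exact), and the names nMissCells, totalMiss, missByVar, and propByVar are hypothetical.

```sas
proc iml;
call randseed(1);
N = 100;                       /* number of time points (columns) */
p = 5;                         /* number of variables (rows) */
X = j(p, N);                   /* allocate matrix */
call randgen(X, "Normal");

/* set exactly 20% of all cells to missing (assumption: no replacement) */
nMissCells = round(0.2*N*p);
missIdx = sample(1:(N*p), nMissCells, "NoReplace");
X[missIdx] = .;

totalMiss = countmiss(X);          /* scalar: missing cells in the whole matrix */
missByVar = countmiss(X, "row");   /* p x 1: missing cells in each row */
propByVar = missByVar / N;         /* proportion missing in each row */
print totalMiss, missByVar propByVar;
quit;
```

The total should equal nMissCells, while the per-row counts will usually differ from one another, which is what you expect under a general (unstructured) MCAR pattern.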

If you plan on doing a lot of simulations, I recommend the book Simulating Data with SAS.

GehadElsayed123
Calcite | Level 5

I tried it, but I get an error.

Rick_SAS
SAS Super FREQ

1. Do you want uncorrelated or correlated observations? It looks like you are trying to use the RANDNORMAL function. If so, see the program below for the correct syntax.

2. The second argument to the SAMPLE function is the sample size, which is the number of indices in the range 1:n*p. This value must be greater than zero.

proc iml;
mean={0 0};
cov={1 0,
     0 1};
p = ncol(mean);  /* number of variables */
m=20;            /* number of observations */
X=randnormal(m,mean,cov); /* multivariate normal */

t= 1:(m*p);
/* 2nd argument is not a probability, it is the number of times to 
   sample from t. Therefore the second argument must be an integer > 0 */
missIdx= sample(t, round(0.1*m*p) );  
X[missIdx]=.;

If you want to allow the possibility that no elements are set to missing, you can use a different method for generating the missing values. In the following statements, an m x p random matrix is generated in which each cell has probability 0.01 of being 1.

 

/* Different approach: Each cell has 0.01 probability of being missing. */
B = j(m, p);
call randgen(B, "Bernoulli", 0.01);
missIdx=loc(B=1);
/* LOC returns an empty matrix when no cell equals 1, so check before indexing */
if ncol(missIdx)>0 then X[missIdx]=.;
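If you need to repeat this for several values of k, you could wrap the sampling approach in a module. The following is only a sketch under stated assumptions: the module name SetMissingMCAR and its arguments are hypothetical (not a built-in SAS/IML function), k is given as a percentage, and the cells are sampled without replacement so that exactly round(k/100 * number of cells) values become missing. When that count is zero, the data are returned unchanged.

```sas
proc iml;
/* hypothetical helper: return a copy of X with k% of its cells set to missing */
start SetMissingMCAR(X, k);
   Y = X;
   nCells = nrow(X) * ncol(X);
   nMiss = round(k/100 * nCells);   /* sample size must be an integer */
   if nMiss > 0 then do;            /* SAMPLE requires a positive sample size */
      idx = sample(1:nCells, nMiss, "NoReplace");
      Y[idx] = .;
   end;
   return( Y );
finish;

call randseed(1);
mean = {0 0};
cov  = {1 0,
        0 1};
X = randnormal(20, mean, cov);      /* 20 x 2 multivariate normal sample */

do k = 5 to 15 by 5;                /* k = 5, 10, 15 percent */
   Y = SetMissingMCAR(X, k);
   nMiss = countmiss(Y);            /* number of missing cells in Y */
   print k nMiss;
end;
quit;
```

Each call starts from the complete matrix X, so the missing values do not accumulate across values of k.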
