## Randomly delete variables in a by variable group

I have a dataset with one variable used for BY processing and one or more other variables used for analysis. The table looks something like this:

| By_var | X1  | X2  | ... | Xn  |
|--------|-----|-----|-----|-----|
| 1      | 234 | 4   | ... | 12  |
| ...    | ... | ... | ... | ... |
| 10     | 33  | 4   | ... | 4   |

Here's where things get weird:

I'd like to randomly select half of the X variables and set them to missing. The X variables that get blanked out should be the same for every observation with the same By_var value, and I'd like this random selection of variables to be allowed to change from one By_var group to the next. I'm really at a loss on where to even start. Any ideas?
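To make the layout concrete, here is a minimal sketch of such a dataset. The dataset name `have`, the BY variable name `group_var`, the helper variables `row` and `j`, and the choice of four X variables are all illustrative assumptions, not the real data.

```sas
/* Hypothetical example data: 3 BY groups, 2 rows each, 4 analysis variables */
data have;
    array x(4) x1-x4;
    do group_var = 1 to 3;
        do row = 1 to 2;
            do j = 1 to dim(x);
                /* arbitrary values that encode group, row, and variable number */
                x(j) = 100*group_var + 10*row + j;
            end;
            output;
        end;
    end;
    drop j row;
run;
```

With data like this, the goal would be that within `group_var = 1` the same two of `x1-x4` are missing on both rows, while `group_var = 2` may lose a different pair.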

Accepted Solutions
Solution
‎09-26-2015 03:43 PM

## Re: Randomly delete variables in a by variable group

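Something like the following:

1. Set up an array for the current X variables.

2. Set up a partner array of 1/0 flags drawn from a Bernoulli distribution with probability 0.5, so each variable has a 50% chance of being set to missing.

3. Fill the partner array only on the first observation of each BY group, and retain it for the rest of the group.

4. Set the X values to missing wherever the partner array holds a 1.

Untested code below:

```sas
data want;
    set have;
    by group_var;

    /* analysis variables and their per-group 1/0 "blank out" flags */
    array x(20) x1-x20;
    array xb(20) xb1-xb20;
    retain xb:;

    /* draw a fresh set of flags at the start of each BY group */
    if first.group_var then do i = 1 to 20;
        /* create a 1/0 variable with 50% chance */
        xb(i) = rand('bernoulli', 0.5);
    end;

    /* apply the retained flags to every observation in the group */
    do i = 1 to 20;
        if xb(i) = 1 then x(i) = .;
    end;

    drop i xb:;
run;
```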
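One caveat: the Bernoulli draw treats each variable independently, so a given group can end up with more or fewer than half of its X variables blanked out. If exactly half is required, a sketch along the following lines may work. It swaps in selection sampling (pick variable i with probability "still needed / still available") for the Bernoulli draw. It is untested, and the names `want_exact` and `need` as well as the seed value are assumptions introduced here for illustration.

```sas
/* Sketch: blank out exactly half of the X variables in each BY group. */
/* Assumes HAVE is sorted by group_var and contains x1-x20.            */
data want_exact;
    set have;
    by group_var;

    array x(20) x1-x20;
    array xb(20) xb1-xb20;
    retain xb1-xb20;

    /* arbitrary seed, set once so the selection is reproducible */
    if _n_ = 1 then call streaminit(321);

    if first.group_var then do;
        /* number of variables that still have to be picked for this group */
        need = floor(dim(x) / 2);
        do i = 1 to dim(x);
            /* pick variable i with probability need / (variables left to consider) */
            if rand('uniform') < need / (dim(x) - i + 1) then do;
                xb(i) = 1;
                need = need - 1;
            end;
            else xb(i) = 0;
        end;
    end;

    /* apply the retained flags to every observation in the group */
    do i = 1 to dim(x);
        if xb(i) = 1 then x(i) = .;
    end;

    drop i need xb1-xb20;
run;
```

Once `need` reaches 0 the pick probability is 0, and once `need` equals the number of variables left the probability is 1, so each group ends up with exactly `floor(dim(x) / 2)` flagged variables.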

All Replies


## Re: Randomly delete variables in a by variable group

This was a great place to start, Rezza! Thanks. I made a few changes and settled on the code below. I have a user-entered string of the variables they care about in the table, so I used that to make the arrays a little more dynamic.

```sas
/* &variables holds the user-entered, space-separated list of analysis variables */
%let var_count = %sysfunc(countw(&variables.));

data want (drop = i _:);
    set have;
    by group_var;

    array x(*) &variables.;
    array _xb(*) _xb1-_xb&var_count.;
    retain _xb:;

    /* fixed seed so the random selection is reproducible */
    call streaminit(321);

    if first.group_var then do i = 1 to &var_count.;
        /* create a 1/0 variable with 50% chance */
        _xb(i) = rand('bernoulli', 0.5);
    end;

    do i = 1 to &var_count.;
        if _xb(i) = 1 then x(i) = .;
    end;
run;
```

Thanks again.
