Learning_S
Obsidian | Level 7

Hello everyone. My question is similar to the one linked below, but with one difference: the constraint affects the random sample itself.

https://communities.sas.com/t5/SAS-Procedures/PROC-SURVEYSELECT-with-Constraints/td-p/51235

 

Example

 

Sel_1

3

2

4

1

2

 

So for Sel_2, I would like to randomly select from the values 1-4, with 2 chosen twice, such that Sel_1 NE Sel_2 in every observation.

 

Sel_1   Sel_2 (acceptable since none match)

3          2

2          4

4          2

1          3

2          1

 

Is there a way to do this? Using a WHERE statement doesn't make sense, since WHERE requires an actual variable in the data set and it can reduce the size of the resulting sample. If there is no direct way, I was considering some sort of retry method: changing the seed number until the condition is met. Anything helps, and thank you so much for reading.

 

EDIT: The main result desired is that the new variable (Sel_2) follows two constraints

 

1. Sel_2 does not match Sel_1 in the same observation

2. The number of times each group (1-4 in the previous example) appears is controlled through a column. (Sel_1 should follow this constraint as well.)

 

i.e., 1 shows up once, 2 shows up twice, and 3 and 4 each show up once in both Sel_1 and Sel_2, while still following the first constraint.
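One way to read the retry idea above is as rejection sampling: shuffle the same values that Sel_1 holds, and keep the first shuffle in which no observation matches. Because Sel_2 is then a permutation of Sel_1, the group counts carry over automatically. The sketch below is only an illustration under stated assumptions: the data set name have, the hard-coded row count of 5, the seed, and the cap of 1000 attempts are all made up for the example.

```sas
/* Hypothetical input: the five Sel_1 values from the example */
data have;
   input Sel_1 @@;
   datalines;
3 2 4 1 2
;
run;

data want(keep=Sel_1 Sel_2);
   array s1[5] _temporary_;        /* original Sel_1 values, in row order   */
   array s2[5] s2_1-s2_5;          /* candidate Sel_2 values to be shuffled */
   seed = 27371;                   /* arbitrary seed, updated by RANPERM    */

   /* Read all five rows and copy Sel_1 into both arrays */
   do i = 1 to 5;
      set have;
      s1[i] = Sel_1;
      s2[i] = Sel_1;
   end;

   /* Reshuffle until no row has Sel_2 equal to Sel_1 (or give up) */
   do attempt = 1 to 1000 until (clash = 0);
      call ranperm(seed, of s2_1-s2_5);
      clash = 0;
      do i = 1 to 5;
         if s2[i] = s1[i] then clash = 1;
      end;
   end;

   if clash then put 'WARNING: no valid shuffle found within 1000 attempts.';
   else do;
      do i = 1 to 5;
         Sel_1 = s1[i];
         Sel_2 = s2[i];
         output;
      end;
   end;
   stop;
run;
```

If some value fills more than half of the rows, no valid Sel_2 exists, which is why the loop is capped instead of running until it succeeds.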

Learning_S
Obsidian | Level 7

EDIT: Ignore as the post was edited.

 

Resending the example since the formatting was not preserved:

 

Example

 

Selection_1

3

2

4

1

2

 

So for Selection_2, I would like to randomly select from the values 1-4, with 2 chosen twice, such that Selection_1 NE Selection_2 in every observation.

Selection_1 Selection_2 (acceptable since none match)

3                  2

2                  4

4                  2

1                  3

2                  1

ballardw
Super User

That looks more like something for Proc Plan.

Partial from the documentation:

The PLAN procedure constructs designs and randomizes plans for factorial experiments, especially nested and crossed experiments and randomized block designs. PROC PLAN can also be used for generating lists of permutations and combinations of numbers. The PLAN procedure can construct the following types of experimental designs:

  • full factorial designs, with and without randomization

 

You might describe how you intend to use this result.

Learning_S
Obsidian | Level 7

Thanks for the help! I intend to use it as follows: have an ordered id, followed by three measurements that are randomized. The code below does something similar, but I would like to control the number of "drugs" and how many times each one can show up per level.

 

I ran this code and got the result below:

 

data dat;
    do id=101 to 108;
        output;
    end;
run;

proc plan seed=27371;
   factors id=8 ordered Drug=3;
   output data=dat out=plan;
run;

 

SAS Output

id   Drug
1    2  1  3
2    1  2  3
3    3  1  2
4    3  1  2
5    1  3  2
6    2  3  1
7    1  3  2
8    3  2  1

 

Which is close to what I'd like but I would like to know how to control for the number of times each drug shows up per level and how to increase the number of drugs without increasing the levels.

 

e.g., with 6 drugs, drugs 1, 2, 3, and 4 each show up once but drugs 5 and 6 each show up twice per level

 

PS: I can't seem to open the out=plan data set.

ballardw
Super User

Because Proc Plan can run interactively, like Proc SQL and Proc Datasets, it uses quit; to indicate that you are actually finished with the procedure.

 

 

You'll have to clarify "Which is close to what I'd like but I would like to know how to control for the number of times each drug shows up per level and how to increase the number of drugs without increasing the levels." Levels of what?

 

Is this closer?

proc plan seed=27371;
   factors id=8 ordered Drug=3 of 5;
run;
quit;

Notice that you were getting an error from your input data set (data=dat) because it contains no Drug variable.
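To tie the two points above together, here is a minimal sketch of the same plan written to a data set without any DATA= input, closed with quit; so that the output data set can be opened afterwards. The NVALS= setting that relabels id as 101-108 is an assumption about what was intended, and the data set name plan is carried over from the earlier post.

```sas
proc plan seed=27371;
   /* 8 ordered ids; for each id, 3 distinct drugs drawn from 5 */
   factors id=8 ordered Drug=3 of 5;
   /* No DATA= option, so no input variables are needed.        */
   /* NVALS= maps id levels 1-8 to the labels 101-108 (assumed). */
   output out=plan id nvals=(101 102 103 104 105 106 107 108);
run;
quit;   /* ends the interactive procedure */

/* One row per id-by-drug slot */
proc print data=plan;
run;
```

This still does not control how often each drug appears in a given column; it only guarantees that a drug is not repeated within an id.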

 

Learning_S
Obsidian | Level 7

That's definitely closer, thanks!


@ballardw wrote:

 

You'll have to clarify "Which is close to what I'd like but I would like to know how to control for the number of times each drug shows up per level and how to increase the number of drugs without increasing the levels." Levels of what? 


 

I apologize for my poor wording. Let's return to the output that I had.

 

id   Drug
1    2  1  3
2    1  2  3
3    3  1  2
4    3  1  2
5    1  3  2
6    2  3  1
7    1  3  2
8    3  2  1

 

In the second column, 1 shows up 3 times, 2 shows up 2 times, and 3 shows up 3 times.

These counts do not match those in the fourth column. (Coincidentally, they do match the third.)

 

I would like to be able to specify that 1 has to show up 3 times across the second, third, and fourth columns without repeating within the same observation (ditto for 2 and 3, apart from their counts).

 

Using PROC SURVEYSELECT as an example, I would like to control the size of the groups, as with the groups=(num1, num2, ..., num n) option of PROC SURVEYSELECT. Essentially, each of these two PROCs does part of what I'd like to accomplish in one operation: PROC PLAN gives random ordering across multiple new columns without a number repeating within an observation, and PROC SURVEYSELECT controls the size of each group.

 

I hope this explanation is much clearer.
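For illustration only, the combined requirement (fixed counts per column, no drug repeated within an observation) can be sketched in a single DATA step by shuffling a fixed pool once per column and rejecting any shuffle that repeats a value already used in the same row. Everything specific here is an assumption: the pool (drugs 1-4 once, drugs 5-6 twice, taken from the earlier "6 drugs" example), the variable names Drug1-Drug3, the seed, and the attempt cap. Filling the columns one at a time is a simplification, so the resulting layouts are not guaranteed to be equally likely.

```sas
data want(keep=id Drug1-Drug3);
   /* Assumed pool: drugs 1-4 once, drugs 5-6 twice (8 slots per column) */
   array pool[8] _temporary_ (1 2 3 4 5 5 6 6);
   array c[8] c1-c8;              /* scratch column that gets shuffled */
   array d[8,3] _temporary_;      /* accepted columns                  */
   seed = 27371;                  /* arbitrary seed                    */

   do col = 1 to 3;
      /* Reshuffle until no row repeats a drug from an earlier column */
      do attempt = 1 to 10000 until (clash = 0);
         do i = 1 to 8;
            c[i] = pool[i];
         end;
         call ranperm(seed, of c1-c8);
         clash = 0;
         do i = 1 to 8;
            do prev = 1 to col - 1;
               if c[i] = d[i, prev] then clash = 1;
            end;
         end;
      end;
      if clash then do;
         put 'ERROR: no valid column found within 10000 attempts.';
         stop;
      end;
      do i = 1 to 8;
         d[i, col] = c[i];
      end;
   end;

   /* Write one observation per id with its three drugs */
   do i = 1 to 8;
      id = 100 + i;
      Drug1 = d[i, 1];
      Drug2 = d[i, 2];
      Drug3 = d[i, 3];
      output;
   end;
run;
```

Each Drug column is a permutation of the same pool, so the per-column counts are fixed by construction; only the no-repeat rule needs the retry loop.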
