## sample selection

I want to select all distinct samples from my data, without replacement.

I tried PROC SURVEYSELECT with METHOD=SRS, but I saw that some samples were the same as previous ones, and my aim is to always select different samples.

For example, my data is: 1,2,3,4,5

I want to choose all different samples of size 4. The total number of different samples is 5!/(4!*(5-4)!)=5.

The samples will be: 1,2,3,4; 1,2,3,5; 1,2,4,5; 1,3,4,5; 2,3,4,5

Thank you
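
A quick way to check that count is the DATA step `COMB` function. This is only a sketch; the values 5 and 4 are the ones from the example above, and the variable names are just for illustration.

```sas
data _null_;
  n=5;               /* number of values in the example data */
  k=4;               /* sample size */
  ncomb=comb(n, k);  /* n! / (k! * (n-k)!) */
  put ncomb=;        /* writes the result to the log */
run;
```

This should write `ncomb=5` to the log, matching the five samples listed.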


## Re: sample selection

You're not drawing a sample here, you're generating all possible combinations.

Take a look at the ALLCOMB function and the CALL ALLCOMB routine.


## Re: sample selection

Right, but the problem with the ALLCOMB function is that it is not well suited to my data.

My variable is a column vector, not a row. From this column vector I want to create all the different samples, one below the other.

My data:

id

1

2

3

4

5

Solution
‎04-25-2016 01:08 PM

## Re: sample selection

Hi @AlexeyS,

AlexeyS wrote:

right, but the problem with allcomb function that is not so suitable for my data.

The good news is: This "problem" can be solved, as shown below.

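```
/* Example input: one ID per row */
data have;
  do id=1 to 6;
    output;
  end;
run;

/* Turn the column into a single row: x1, x2, ... */
proc transpose data=have out=trans(drop=_:) prefix=x;
run;

%let k=4; /* sample size */

data want;
  set trans;
  array x x:;
  ncomb=comb(dim(x), &k);
  do sample=1 to ncomb;
    /* After each call, the first &k elements of x hold the next combination */
    rc=allcomb(sample, &k, of x[*]);
    do i=1 to &k;
      id=x[i];
      output;
    end;
  end;
  keep sample id;
run;
```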

## Re: sample selection

But now I have another problem: sometimes I have more than 33 variables, and the ALLCOMB function cannot handle that.

As I understand it, the solution is the CALL ALLCOMB routine, but how can I use it?

My code with the ALLCOMB function:

```
data have;
  do id=1 to 6;
    output;
  end;
run;

proc transpose data=have out=trans(drop=_:) prefix=x;
run;

%let k=4; /* sample size */

data want;
  set trans;
  array x x:;
  ncomb=comb(dim(x), &k);
  do sample=1 to ncomb;
    rc=allcomb(sample, &k, of x[*]);
    do i=1 to &k;
      id=x[i];
      output;
    end;
  end;
  keep sample id;
run;
```


## Re: sample selection

Hi @AlexeyS,

You don't need CALL ALLCOMB, but CALL ALLCOMBI.

Example:

```
data have;
  do id=1 to 34;
    output;
  end;
run;

proc transpose data=have out=trans(drop=_:) prefix=x;
run;

%let k=4; /* sample size */

data want;
  set trans;
  array x x:;
  array i[&k];  /* indices of the current combination */
  i[1]=0;       /* initialize before the first call */
  n=dim(x);
  ncomb=comb(n, &k);
  do sample=1 to ncomb;
    /* Updates i[*] to the indices of the next combination */
    call allcombi(n, &k, of i[*]);
    do j=1 to &k;
      id=x[i[j]];
      output;
    end;
  end;
  keep sample id;
run;
```

## Re: sample selection

I assume that you have already exhausted the possibilities of PROC SURVEYSELECT, and it won't do what you need. In that case, here's an approach that produces one large data set with all the samples in it. There is a variable SAMPLE that distinguishes the contents of each sample.

```
data want;
  do sample=1 to _nobs_;  /* record left out of this sample */
    do recno=1 to _nobs_;
      if sample ne recno then do;
        set have point=recno nobs=_nobs_;
        output;
      end;
    end;
  end;
  stop;  /* required with POINT=, otherwise the step never ends */
run;
```

Of course, the problem becomes more difficult if you are looking for samples of size 3 instead of samples of size "all but one". For the "all but two" case, you would have to add one more loop for a second left-out record and check that recno matches neither of the two. Note that the records are read with point=recno rather than point=sample: a POINT= variable is temporary and is not written to the output data set, so SAMPLE has to stay an ordinary variable.
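
A minimal sketch of that "all but two" extension, under some assumptions: the data set `have` holds one `id` per row, and the names `skip1`, `skip2`, and `want2` are made up for illustration. It reads records with `point=recno`, keeps a running `sample` counter, and ends with `stop;`, which a step driven only by POINT= needs in order to terminate.

```sas
data have;
  do id=1 to 5;
    output;
  end;
run;

data want2;
  sample=0;
  /* Loop over every pair of records to leave out */
  do skip1=1 to _nobs_-1;
    do skip2=skip1+1 to _nobs_;
      sample=sample+1;  /* one sample per left-out pair */
      do recno=1 to _nobs_;
        if recno ne skip1 and recno ne skip2 then do;
          set have point=recno nobs=_nobs_;
          output;
        end;
      end;
    end;
  end;
  stop;  /* POINT= never hits end of file, so stop explicitly */
  keep sample id;
run;
```

With the five example records this should produce 10 samples of 3 ids each. Every further reduction in sample size needs another nested loop, which is why the ALLCOMB-based approaches scale better.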


## Re: sample selection

```
%let n=5;
%let k=4;
%let ncomb=%sysfunc(comb(&n,&k));
proc plan ordered;
  factors sample=&ncomb id=&k of &n comb;
  output out=C&k.of&n;
run;
quit;
```


## Re: sample selection

Why not use ALLCOMB()?
```
data _null_;
  array x[5] (1 2 3 4 5);
  n=dim(x);
  k=4;
  ncomb=comb(n,k);
  do j=1 to ncomb;
    rc=allcomb(j, k, of x[*]);
    put j 5. +3 x1-x4 +3 rc=;
  end;
run;
```

## Re: sample selection

Ksharp wrote:
Why not using ALLCOMB() ?

I reckon you didn't read the post from @FreelanceReinhard


## Re: sample selection

Oh, John King, that would be easy: use a macro variable or an array to hold the data.
```
data have;
  do id=1 to 6;
    output;
  end;
run;

proc sql;
  select count(*) into : n from have;
  select id into : list separated by ' ' from have;
quit;

data _null_;
  array x[&n] (&list);
  n=dim(x);
  k=4;
  ncomb=comb(n,k);
  do j=1 to ncomb;
    rc=allcomb(j, k, of x[*]);
    put j 5. +3 x1-x4 +3 rc=;
  end;
run;
```

## Re: sample selection

My point is you are just repeating what was already shown earlier in the thread.

Ksharp wrote:
OH. John King, That would be easy by using a macro variable or an array to hold those data.