sample selection

Contributor
Posts: 61
Hello, I would like to ask you about sample selection.

I want to select all distinct samples of a given size from my data, without replacement.

I tried PROC SURVEYSELECT with METHOD=SRS, but some of the samples it drew were identical to earlier ones, and my aim is to get a different sample every time.

For example, my data are: 1, 2, 3, 4, 5

I want all distinct samples of size 4. The total number of such samples is 5!/(4!*(5-4)!) = 5.

The samples will be: 1,2,3,4; 1,2,3,5; 1,2,4,5; 1,3,4,5; 2,3,4,5

Thank you.
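
For reference, the count in the example is the binomial coefficient "n choose k". Written out for n = 5 items and samples of size k = 4 (the symbols n and k are used here only for illustration):

```latex
\binom{n}{k} = \frac{n!}{k!\,(n-k)!},
\qquad
\binom{5}{4} = \frac{5!}{4!\,1!} = \frac{120}{24} = 5
```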

 


Super User
Posts: 19,815

Re: sample selection

You're not drawing a sample here; you're generating all possible combinations.

Take a look at the ALLCOMB function and the CALL ALLCOMB routine.

Contributor
Posts: 61

Re: sample selection

Right, but the problem with the ALLCOMB function is that it is not well suited to my data.

My data look like this: my variable is a column vector, not a row. From this column vector I want to create all the different samples, one below the other.

My data:

id
1
2
3
4
5

 

Solution
‎04-25-2016 01:08 PM
Trusted Advisor
Posts: 1,117

Re: sample selection

Hi @AlexeyS,

AlexeyS wrote:

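Right, but the problem with the ALLCOMB function is that it is not well suited to my data.

The good news is: This "problem" can be solved, as shown below.

/* Example input: one observation per ID */
data have;
  do id=1 to 6;
    output;
  end;
run;

/* Turn the column of IDs into a single row x1-x6 so that ALLCOMB can work on it */
proc transpose data=have out=trans(drop=_:) prefix=x;
run;

%let k=4; /* sample size */

data want;
  set trans;
  array x x:;
  ncomb=comb(dim(x), &k); /* number of distinct samples of size k */
  do sample=1 to ncomb;
    /* Each call rearranges x so that x[1]-x[k] hold the next combination */
    rc=allcomb(sample, &k, of x[*]);
    do i=1 to &k;
      id=x[i];
      output; /* one row per sampled ID */
    end;
  end;
  keep sample id;
run;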
Contributor
Posts: 61

Re: sample selection

Posted in reply to FreelanceReinhard

Thank you for your answers.

But now I have another problem: sometimes I have more than 33 variables, and the ALLCOMB function cannot handle that.

As I understand it, the solution is the CALL ALLCOMB routine. But how can I use it?

My code with the ALLCOMB function:

data have;
do id=1 to 6;
  output;
end;
run;

proc transpose data=have out=trans(drop=_:) prefix=x;
run;

%let k=4; /* sample size */

data want;
set trans;
array x x:;
ncomb=comb(dim(x), &k);
do sample=1 to ncomb;
  rc=allcomb(sample, &k, of x[*]);
  do i=1 to &k;
    id=x[i];
    output;
  end;
end;
keep sample id;
run;

 

Trusted Advisor
Posts: 1,117

Re: sample selection

Hi @AlexeyS,

 

You don't need CALL ALLCOMB, but CALL ALLCOMBI.

 

Example:

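data have;
  do id=1 to 34;
    output;
  end;
run;

proc transpose data=have out=trans(drop=_:) prefix=x;
run;

%let k=4; /* sample size */

data want;
  set trans;
  array x x:;
  array i[&k];  /* indices of the k selected items */
  i[1]=0;       /* a zero in the first index marks the first call to CALL ALLCOMBI */
  n=dim(x);
  ncomb=comb(n, &k);
  do sample=1 to ncomb;
    call allcombi(n, &k, of i[*]); /* next combination of k indices out of 1..n */
    do j=1 to &k;
      id=x[i[j]]; /* look up the ID stored at the selected index */
      output;
    end;
  end;
  keep sample id;
run;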
Super User
Posts: 5,509

Re: sample selection

I assume that you have already exhausted the possibilities of PROC SURVEYSELECT, and it won't do what you need. In that case, here's an approach that produces one large data set with all the samples in it. There is a variable SAMPLE that distinguishes the contents of each sample.

data want;
   do sample=1 to _nobs_;   /* sample number = number of the record left out */
      do recno=1 to _nobs_;
         if sample ne recno then do;
            set have point=recno nobs=_nobs_;
            output;
         end;
      end;
   end;
   stop;   /* required with POINT=, which never reaches end of file */
run;

Of course, the problem becomes more difficult if you are looking for samples of size 3 instead of samples of size "all but one". For the "all but two" case, you would have to add one more loop over a second omitted record and output a record only if its number differs from both omitted record numbers. Note that the POINT= variable is not written to the output data set. That is why the code above uses point=recno rather than point=sample: SAMPLE has to be kept to identify each sample.
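
The "all but two" case mentioned above could be sketched roughly as follows. This is only an illustration, not code from the thread: it assumes an input data set HAVE with one record per ID, and the names omit1, omit2 and want2 are made up for the sketch. Since the IN operator accepts only constants, the check is written as two comparisons.

```sas
/* Example input, as elsewhere in the thread: one record per ID */
data have;
   do id=1 to 5;
      output;
   end;
run;

/* Hypothetical sketch: every sample that omits exactly two records */
data want2;
   sample=0;
   do omit1=1 to _nobs_-1;            /* first omitted record            */
      do omit2=omit1+1 to _nobs_;     /* second omitted record           */
         sample=sample+1;             /* one sample per pair of omissions */
         do recno=1 to _nobs_;
            if recno ne omit1 and recno ne omit2 then do;
               set have point=recno nobs=_nobs_;
               output;
            end;
         end;
      end;
   end;
   stop;                              /* required when reading with POINT= */
   keep sample id;
run;
```

With the five records above this should give comb(5,2) = 10 samples of three IDs each. The number of nested loops grows with the number of omitted records, so for a general sample size the ALLCOMB or CALL ALLCOMBI approaches shown earlier in the thread are probably easier to maintain.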

Respected Advisor
Posts: 3,799

Re: sample selection

 

%let n=5;
%let k=4;
%let ncomb=%sysfunc(comb(&n,&k));
proc plan ordered;
   factors sample=&ncomb id=&k of &n comb;
   output out=C&k.of&n;
   run;
   quit;


Super User
Posts: 10,030

Re: sample selection

Why not use ALLCOMB()?

data _null_;
  array x[5] (1 2 3 4 5);
  n=dim(x);
  k=4;
  ncomb=comb(n,k);
  do j=1 to ncomb;
    rc=allcomb(j, k, of x[*]);  /* x1-x4 now hold combination number j */
    put j 5. +3 x1-x4 +3 rc=;   /* write the combination to the log */
  end;
run;


Respected Advisor
Posts: 3,799

Re: sample selection


Ksharp wrote:
Why not use ALLCOMB()?

I reckon you didn't read the post from @FreelanceReinhard.

Super User
Posts: 10,030

Re: sample selection

Posted in reply to data_null__
Oh, John King, that would be easy using a macro variable or an array to hold the data.

data have;
  do id=1 to 6;
    output;
  end;
run;

/* Store the number of IDs and the list of IDs in macro variables */
proc sql;
  select count(*) into :n from have;
  select id into :list separated by ' ' from have;
quit;

data _null_;
  array x[&n] (&list);  /* initialize the array from the macro variables */
  n=dim(x);
  k=4;
  ncomb=comb(n,k);
  do j=1 to ncomb;
    rc=allcomb(j, k, of x[*]);
    put j 5. +3 x1-x4 +3 rc=;
  end;
run;
Respected Advisor
Posts: 3,799

Re: sample selection

 

My point is you are just repeating what was already shown earlier in the thread.

Ksharp wrote:
Oh, John King, that would be easy using a macro variable or an array to hold the data.
 
Super User
Posts: 10,030

Re: sample selection

Posted in reply to data_null__
John King, never mind. It just gives the OP one more option to choose from.
☑ This topic is solved.
