Help using Base SAS procedures

How to create several datasets sampled from one column

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 9
Accepted Solution

How to create several datasets sampled from one column

I have a dataset as follows (example)

y      Groupk    code
10111
11122
12133
13214
14225
15236

I want to create several datasets of size 2 where I pick one observation from group 1 and 2 to form one dataset e.g.;

data 1

Y group k code

11  1    1   1

13  2    1   4

data 2,

y group k code

11  1    1  1

14  2    2  5

and so forth.

Currently I have been doing this using proc sql  (e.q., create table data1 as select *,1 as id from sample where code eq 1 or  code eq 4) and the challenge is that when the levels of groups and k increase it is nearly impossible to do all these combinations by hand. Is there anyone who can help me in writing a simple code that will pick the required values without having to specify the code?


Accepted Solutions
Solution
‎09-25-2014 08:32 AM
Super User
Super User
Posts: 7,942

Re: How to create several datasets sampled from one column

Update to:

data have;

  infile datalines;

  input y Group k code;

datalines;

10 1 1 1

11 1 2 2

12 1 3 3

13 2 1 4

14 2 2 5

15 2 3 6

;

run;

data _null_;

  ind=1;

  do i=1 to 3;

    do j=4 to 6;

      call execute('proc sql;

                      create table DATA'||strip(put(ind,best.))||' as

                      select  *,

                              '||strip(put(ind,best.))||' as ID

                      from    WORK.HAVE

                      where   CODE='||strip(put(i,best.))||'

                        or    CODE='||strip(put(j,best.))||';

                    quit;');

      ind=ind+1;

    end;

  end;

run;

View solution in original post


All Replies
Super User
Super User
Posts: 7,942

Re: How to create several datasets sampled from one column

Hi,

Well not entirely sure on your logic, but the below code will loop over and generate a proc sql based on the loop.  You can modify this to your particular logic:

data have;
  infile datalines;
  input y Group k code;
datalines;
10 1 1 1
11 1 2 2
12 1 3 3
13 2 1 4
14 2 2 5
15 2 3 6
;
run;

proc sort data=have out=loop (obs=1);
  by descending code;
run;

data _null_;
  set loop;
  do i=2 to code;
    call execute('proc sql;
                    create table DATA'||strip(put(i-1,best.))||' as
                    select  *,
                            '||strip(put(i-1,best.))||' as ID
                    from    WORK.HAVE
                    where   CODE=1
                      or    CODE='||strip(put(i,best.))||';
                  quit;');
  end;
run;

SAS Employee
Posts: 232

Re: How to create several datasets sampled from one column

Hi there, this question was asked in the About SAS Communities space where we post help info about the community. No problem, though. I'm moving it to SAS Procedures where I think it goes. Thanks for joining the community and asking a question! Smiley Happy Thanks RW9 for your response, too...

Occasional Contributor
Posts: 9

Re: How to create several datasets sampled from one column

Thank you RW9.

It's almost there but not quite there yet Smiley Happy

Overall, I will have 9 datasets with the following levels of the 'code' variable:

data1: code 1 & 4

data2: code 1 & 5

data3: code 1 & 6

data4: code 2 & 4

data5: code 2 & 5

data6: code 2 & 6

data7: code 3 & 4

data8: code 3 & 5

data9: code 3 & 6

Thus, each set contains data points from each group.

Sorry about that Communities Admin Smiley Happy

Solution
‎09-25-2014 08:32 AM
Super User
Super User
Posts: 7,942

Re: How to create several datasets sampled from one column

Update to:

data have;

  infile datalines;

  input y Group k code;

datalines;

10 1 1 1

11 1 2 2

12 1 3 3

13 2 1 4

14 2 2 5

15 2 3 6

;

run;

data _null_;

  ind=1;

  do i=1 to 3;

    do j=4 to 6;

      call execute('proc sql;

                      create table DATA'||strip(put(ind,best.))||' as

                      select  *,

                              '||strip(put(ind,best.))||' as ID

                      from    WORK.HAVE

                      where   CODE='||strip(put(i,best.))||'

                        or    CODE='||strip(put(j,best.))||';

                    quit;');

      ind=ind+1;

    end;

  end;

run;

Super User
Posts: 10,020

Re: How to create several datasets sampled from one column

Are there just two groups ?

data have;
  infile datalines;
  input y Group k code;
datalines; 
10 1 1 1 
11 1 2 2  
12 1 3 3 
13 2 1 4 
14 2 2 5 
15 2 3 6 
;
run;

data x(drop=n );
 set have end=last;
 length c $ 20;
 retain n;
 if group ne lag(group) then n=_n_;
if last then do;
 do i=1 to n-1;
  do j=n to _n_;
   key+1;
   set have point=i;c=catx('_',c,code);output;     
   set have point=j;c=catx('_',c,code);output; call missing(c);
end;
end;
stop;
end;
run;


data _null_;
 if _n_ eq 1 then do;
 if 0 then set x;
  declare hash ha(multidata:'y');
   ha.definekey('key');
   ha.definedata('y', 'Group', 'k', 'code');
   ha.definedone();
 end;
set x;
by key;
ha.add();
if last.key then do;ha.output(dataset: catx('_','code',c));ha.clear();end;
run;

Xia Keshan

Occasional Contributor
Posts: 9

Re: How to create several datasets sampled from one column

Hi Xia Keshan,

Yes, in the current example it is only two, but this was just for simplification. In my current data the number of groups and repetitions (k) vary but I was able to extend the code of RW9 (by including additional do loops) to accommodate my settings. I believe the same can be done with your code.

Thanks to the both of you (Keshan and RW9), you've been very helpful!

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 6 replies
  • 231 views
  • 6 likes
  • 4 in conversation