BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
mamzolo
Calcite | Level 5

I have a dataset as follows (example)

y      Groupk    code
10111
11122
12133
13214
14225
15236

I want to create several datasets of size 2 where I pick one observation from group 1 and 2 to form one dataset e.g.;

data 1

Y group k code

11  1    1   1

13  2    1   4

data 2,

y group k code

11  1    1  1

14  2    2  5

and so forth.

Currently I have been doing this using proc sql  (e.q., create table data1 as select *,1 as id from sample where code eq 1 or  code eq 4) and the challenge is that when the levels of groups and k increase it is nearly impossible to do all these combinations by hand. Is there anyone who can help me in writing a simple code that will pick the required values without having to specify the code?

1 ACCEPTED SOLUTION

Accepted Solutions
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Update to:

data have;

  infile datalines;

  input y Group k code;

datalines;

10 1 1 1

11 1 2 2

12 1 3 3

13 2 1 4

14 2 2 5

15 2 3 6

;

run;

data _null_;

  ind=1;

  do i=1 to 3;

    do j=4 to 6;

      call execute('proc sql;

                      create table DATA'||strip(put(ind,best.))||' as

                      select  *,

                              '||strip(put(ind,best.))||' as ID

                      from    WORK.HAVE

                      where   CODE='||strip(put(i,best.))||'

                        or    CODE='||strip(put(j,best.))||';

                    quit;');

      ind=ind+1;

    end;

  end;

run;

View solution in original post

6 REPLIES 6
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Hi,

Well not entirely sure on your logic, but the below code will loop over and generate a proc sql based on the loop.  You can modify this to your particular logic:

data have;
  infile datalines;
  input y Group k code;
datalines;
10 1 1 1
11 1 2 2
12 1 3 3
13 2 1 4
14 2 2 5
15 2 3 6
;
run;

proc sort data=have out=loop (obs=1);
  by descending code;
run;

data _null_;
  set loop;
  do i=2 to code;
    call execute('proc sql;
                    create table DATA'||strip(put(i-1,best.))||' as
                    select  *,
                            '||strip(put(i-1,best.))||' as ID
                    from    WORK.HAVE
                    where   CODE=1
                      or    CODE='||strip(put(i,best.))||';
                  quit;');
  end;
run;

Community_Help
SAS Employee

Hi there, this question was asked in the About SAS Communities space where we post help info about the community. No problem, though. I'm moving it to SAS Procedures where I think it goes. Thanks for joining the community and asking a question! Smiley Happy Thanks RW9 for your response, too...

mamzolo
Calcite | Level 5

Thank you RW9.

It's almost there but not quite there yet Smiley Happy

Overall, I will have 9 datasets with the following levels of the 'code' variable:

data1: code 1 & 4

data2: code 1 & 5

data3: code 1 & 6

data4: code 2 & 4

data5: code 2 & 5

data6: code 2 & 6

data7: code 3 & 4

data8: code 3 & 5

data9: code 3 & 6

Thus, each set contains data points from each group.

Sorry about that Communities Admin Smiley Happy

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Update to:

data have;

  infile datalines;

  input y Group k code;

datalines;

10 1 1 1

11 1 2 2

12 1 3 3

13 2 1 4

14 2 2 5

15 2 3 6

;

run;

data _null_;

  ind=1;

  do i=1 to 3;

    do j=4 to 6;

      call execute('proc sql;

                      create table DATA'||strip(put(ind,best.))||' as

                      select  *,

                              '||strip(put(ind,best.))||' as ID

                      from    WORK.HAVE

                      where   CODE='||strip(put(i,best.))||'

                        or    CODE='||strip(put(j,best.))||';

                    quit;');

      ind=ind+1;

    end;

  end;

run;

Ksharp
Super User

Are there just two groups ?

data have;
  infile datalines;
  input y Group k code;
datalines; 
10 1 1 1 
11 1 2 2  
12 1 3 3 
13 2 1 4 
14 2 2 5 
15 2 3 6 
;
run;

data x(drop=n );
 set have end=last;
 length c $ 20;
 retain n;
 if group ne lag(group) then n=_n_;
if last then do;
 do i=1 to n-1;
  do j=n to _n_;
   key+1;
   set have point=i;c=catx('_',c,code);output;     
   set have point=j;c=catx('_',c,code);output; call missing(c);
end;
end;
stop;
end;
run;


data _null_;
 if _n_ eq 1 then do;
 if 0 then set x;
  declare hash ha(multidata:'y');
   ha.definekey('key');
   ha.definedata('y', 'Group', 'k', 'code');
   ha.definedone();
 end;
set x;
by key;
ha.add();
if last.key then do;ha.output(dataset: catx('_','code',c));ha.clear();end;
run;

Xia Keshan

mamzolo
Calcite | Level 5

Hi Xia Keshan,

Yes, in the current example it is only two, but this was just for simplification. In my current data the number of groups and repetitions (k) vary but I was able to extend the code of RW9 (by including additional do loops) to accommodate my settings. I believe the same can be done with your code.

Thanks to the both of you (Keshan and RW9), you've been very helpful!

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 1207 views
  • 6 likes
  • 4 in conversation